INDEX
Explanations
proper nouns related to political figures
references to members of Congress
Negative Logits
phal         -0.85
istically    -0.75
olean        -0.66
Siberian     -0.65
Finnish      -0.64
theless      -0.64
zsche        -0.63
Brist        -0.62
ahime        -0.62
Firefly      -0.62
Positive Logits
rint         1.22
orters       1.19
utation      1.13
ository      1.08
orter        1.07
utations     1.01
orted        0.99
rieve        0.99
ublic        0.98
ositories    0.95
Activations Density: 0.009%