INDEX
Explanations
phrases referring to specific actions or events
indicative phrases suggesting actions or decisions by groups or individuals
Negative Logits
Token               Logit
.","                -0.55
zag                 -0.52
ige                 -0.51
=-=-=-=-=-=-=-=-    -0.49
UCHIJ               -0.49
apan                -0.49
aer                 -0.48
ixir                -0.48
iership             -0.48
cients              -0.47
Positive Logits
Token               Logit
latter              0.64
irony               0.63
atto                0.60
scathing            0.55
damning             0.54
NOTICE              0.53
arently             0.52
excerpts            0.52
furthermore         0.51
ironic              0.51
Activations Density: 1.661%