Explanations
auxiliary verbs and words indicating obligation or possibility
Negative Logits
Į¨        -0.14
toMatch   -0.13
iglia     -0.12
εια       -0.12
Před      -0.12
िकल       -0.12
ojí       -0.11
aucoup    -0.11
inspace   -0.11
posables  -0.11
Positive Logits
not   1.34
not   1.09
NOT   1.05
Not   0.98
Not   0.90
_not  0.84
-not  0.84
.not  0.83
not   0.79
NOT   0.77
Activations Density 1.082%