INDEX
Explanations
verbs related to influence, initiative, and movement
terms related to motivation or actions that lead to outcomes
Negative Logits
Seym           -0.97
çĦ             -0.77
ereo           -0.75
roma           -0.74
aido           -0.71
osuke          -0.68
Anniversary    -0.68
Sud            -0.68
Lumpur         -0.68
iannopoulos    -0.66
Positive Logits
driving        0.90
driving        0.76
away           0.75
bike           0.73
wheel          0.71
wedge          0.71
bys            0.70
ousel          0.69
iday           0.69
hunger         0.67
Activations Density: 0.027%