Explanations
choice-related phrases and expressions of preference
Negative Logits
sient       -0.41
udra        -0.40
webElement  -0.40
Enabled     -0.39
Kary        -0.39
ابس         -0.37
ssan        -0.37
างเกง       -0.37
ulous       -0.37
addImage    -0.37
Positive Logits
preferring  1.88
prefer      1.77
preference  1.77
prefers     1.73
Prefer      1.70
prefer      1.69
preferred   1.59
предпоч     1.54
Preference  1.54
préf        1.53
Activations Density: 0.537%