INDEX
Explanations
references to personal growth and transformation experiences
Negative Logits
amd      -0.18
atron    -0.17
iens     -0.16
anders   -0.16
iq       -0.15
ushima   -0.14
cak      -0.14
icana    -0.14
Ör       -0.14
thouse   -0.14
Positive Logits
many       0.52
many       0.41
everyone   0.41
everybody  0.38
Many       0.37
Many       0.35
majority   0.35
многих     0.34
MANY       0.33
meisten    0.32
Activations Density 0.564%
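
As a rough illustration of what the density figure means, the sketch below computes the fraction of token positions on which a feature fires. It assumes density is defined as the share of positions with activation above zero. The dashboard does not state its exact definition, and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature is
    active (activation > threshold). The dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 5 of which activate the feature.
sample = [0.0] * 995 + [0.3, 1.2, 0.7, 2.1, 0.05]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.564% would correspond to the feature being active on roughly 5 or 6 of every 1000 token positions.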