INDEX
Explanations
conjunctions that introduce reasoning or causation
Negative Logits
Token          Logit
iders          -0.15
еÑģÑı          -0.14
äº             -0.14
-setup         -0.14
conc           -0.14
inesis         -0.14
Ùĩ             -0.14
(æĹ¥           -0.14
ãĥĥãĥĪ         -0.14
Interceptor    -0.14
Positive Logits
Token          Logit
otherwise      0.18
divers         0.15
Sto            0.15
else           0.15
if             0.15
although       0.14
zá             0.14
OTHERWISE      0.14
bcc            0.14
Ingram         0.14
Activations Density: 0.092%