INDEX
Explanations
punctuation marks
the presence of repeated punctuation marks, specifically commas and possibly parentheses
Negative Logits
hens       -0.72
20439      -0.69
ability    -0.65
inar       -0.63
Nap        -0.61
outgoing   -0.58
CLE        -0.58
herent     -0.56
hed        -0.56
estate     -0.56
Positive Logits
anecd      0.78
albeit     0.72
uyomi      0.72
lest       0.69
_>         0.68
bet        0.67
perhaps    0.67
although   0.67
alas       0.67
but        0.63
Activations Density 0.347%