INDEX
Explanations
text fragments containing special characters and keywords like "infowars," "gur," and "Facebook."
mentions of significant organizations or events
Negative Logits
hement      -0.78
compan      -0.75
stride      -0.72
opp         -0.68
Cyborg      -0.66
favoured    -0.64
chosen      -0.63
existing    -0.62
awa         -0.62
depending   -0.61
Positive Logits
Associated  0.94
News        0.85
SPONSORED   0.83
encer       0.82
edia        0.79
Topics      0.77
Morning     0.77
videos      0.77
eport       0.77
tnc         0.76
Activations Density: 0.186%