INDEX
Explanations
attends to proper nouns or specific terms from subsequent mentions or variations of the same word.
references to specific names and entities, particularly related to people, locations, and environmental terms
Head Attr Weights
0:0.61
1:0.02
2:0.03
3:0.02
4:0.07
5:0.02
6:0.05
7:0.03
8:0.02
9:0.01
10:0.04
11:0.02
Negative Logits
pause      -1.69
leap       -1.60
MIC        -1.59
Warn       -1.58
Ce         -1.55
deadlines  -1.54
BUS        -1.53
STA        -1.52
EUR        -1.52
LU         -1.52
Positive Logits
burgh       2.03
Frameworks  2.02
velt        1.90
igl         1.81
idy         1.80
mathemat    1.77
luaj        1.76
anyahu      1.72
gerald      1.69
ollen       1.69
Activations Density 0.595%