    Explanations

    expressions of perception or belief in social dynamics

    Negative Logits
    Token            Logit
    " direct"        -0.14
    "ernel"          -0.14
    " جÙĦ"           -0.14
    " pari"          -0.14
    " Primitive"     -0.13
    " unrelated"     -0.13
    "irect"          -0.13
    "ooth"           -0.13
    "467"            -0.13
    "Äįet"           -0.13

    Positive Logits
    Token            Logit
    " undecided"     0.47
    " amb"           0.42
    " uncertainty"   0.40
    " ambiguity"     0.39
    " neutral"       0.39
    " ambiguous"     0.38
    " inde"          0.38
    " uncertain"     0.38
    " unsure"        0.37
    "ambiguous"      0.36

    Activation Density: 0.376%

    No Known Activations