    Explanations

    themes of neutrality and balance in discourse

    Negative Logits
    Token         Logit
    "Bounding"    -0.17
    " ↓"          -0.16
    "rete"        -0.15
    "shortcut"    -0.14
    "coni"        -0.14
    "iena"        -0.14
    "tail"        -0.14
    "ottage"      -0.14
    "oard"        -0.14
    "mise"        -0.14
    Positive Logits
    Token            Logit
    " neutral"       0.55
    " neutrality"    0.54
    " Neutral"       0.51
    "neutral"        0.49
    "-neutral"       0.47
    "Neutral"        0.46
    " impartial"     0.41
    " neutr"         0.34
    " neither"       0.28
    " balanced"      0.28
    Act Density 0.197%

    No Known Activations