INDEX
    Explanations

    expressions of personal opinion and moral judgments

    Negative Logits
    bard       -0.16
    qd         -0.16
    yd         -0.15
    /repos     -0.15
    esz        -0.14
    orary      -0.14
    abbage     -0.14
     Planet    -0.14
    fé         -0.13
    mojom      -0.13
    Positive Logits
     mosquito   0.17
    ucch        0.15
    iaux        0.15
    ustum       0.14
     myself     0.14
    reau        0.14
     siti       0.14
     metic      0.14
    .cx         0.14
    OCK         0.14
    Act Density 0.218%

    No Known Activations