INDEX
    Explanations

    references to personal growth and transformation experiences

    Negative Logits
    "amd"       -0.18
    "atron"     -0.17
    "iens"      -0.16
    "anders"    -0.16
    "iq"        -0.15
    "ushima"    -0.14
    "cak"       -0.14
    "icana"     -0.14
    " Ör"       -0.14
    "thouse"    -0.14
    Positive Logits
    " many"         0.52
    "many"          0.41
    " everyone"     0.41
    " everybody"    0.38
    " Many"         0.37
    "Many"          0.35
    " majority"     0.35
    " многих"       0.34
    " MANY"         0.33
    " meisten"      0.32
    Act Density 0.564%

    No Known Activations