    Explanations

    phrases or constructs that emphasize comparison or simile

    Negative Logits
    alls
    -0.17
    орм
    -0.17
    ingt
    -0.16
    eters
    -0.15
    etary
    -0.15
    cott
    -0.14
    awe
    -0.14
    HORT
    -0.14
     saja
    -0.14
    instein
    -0.14
    Positive Logits
     follows
    0.22
    cribed
    0.21
    paragus
    0.20
    sembl
    0.18
    cert
    0.17
    -is
    0.16
     having
    0.15
     součást
    0.14
    dit
    0.14
    cribe
    0.14
    Activation Density 0.148%

    No Known Activations