INDEX
    Explanations

    references to statistical or numerical data in research contexts

    Negative Logits
    Token              Logit
    (token missing)    -0.69
    (token missing)    -0.61
    " in"              -0.61
    <eos>              -0.60
    "↵↵"               -0.59
    "."                -0.59
    " ."               -0.55
    " ,"               -0.55
    " de"              -0.54
    " of"              -0.52
    Positive Logits
    Token              Logit
    "%:"               1.24
    "✨:"               1.23
    "_:"               1.22
    " autorytatywna"   1.21
    "®:"               1.21
    ":"                1.20
    "!:"               1.13
    "+:"               1.11
    "__:"              1.10
    " :"               1.10
    Activation Density: 0.331%

    No Known Activations