INDEX
    Explanations

    Hebrew words or phrases

    Sequences of special or non-Latin characters, potentially indicating text in a different script or language

    Negative Logits
    Token      Logit
    " Cly"     -0.77
    " Osc"     -0.75
    "eson"     -0.74
    "iko"      -0.72
    "lyak"     -0.68
    "annis"    -0.68
    "abase"    -0.67
    "upiter"   -0.67
    " NX"      -0.66
    "psey"     -0.66
    Positive Logits
    Token      Logit
    "Ù"        1.83
    "ا"        1.75
    "Ùĩ"       1.72
    "ÙĨ"       1.66
    "د"        1.65
    "اØ"       1.63
    "ÙĬ"       1.61
    "ت"        1.60
    "Ø"        1.58
    "Ùħ"       1.57
    Activation Density: 0.005%

    No Known Activations