INDEX
    Explanations

    instances of the word "vice" and related terms

    Follows "or", "at", or "say"

    Negative Logits
    " itſelf"     -1.47
    " myſelf"     -1.46
    " Monfieur"   -1.46
    " Efq"        -1.43
    " raiſ"       -1.42
    " auffi"      -1.40
    " ſtate"      -1.35
    " Theſe"      -1.32
    " houſe"      -1.32
    " pleaſure"   -1.32
    Positive Logits
    " ("          0.82
    " a"          0.77
                  0.77
    ","           0.72
    " ["          0.70
    "<eos>"       0.69
    " B"          0.68
    " in"         0.67
    " “"          0.67
    " b"          0.66
    Act Density 0.287%

    No Known Activations