    Neuronpedia

    © Neuronpedia 2025
    Dunefsky · Chlenski · Transcoders Enable Fine-Grained Interpretable Circuit Analysis > GPT2-Small > Transcoders Residuals > 8-TRES-DC > 0
    Explanations

    occurrences of the word "first" related to achievements or milestones

    oai_token-act-pair · gpt-4o-mini · Triggered by @bot

    Negative Logits
    Token            Logit
    "ulia"           -0.82
    "Magikarp"       -0.74
    "rez"            -0.73
    "morrow"         -0.72
    "rils"           -0.71
    "NetMessage"     -0.70
    "igslist"        -0.69
    "utterstock"     -0.67
    "bors"           -0.66
    "estern"         -0.65
    Positive Logits
    Token            Logit
    " since"         0.74
    " wartime"       0.64
    " PW"            0.63
    "volent"         0.63
    " cartel"        0.62
    "��"             0.62
    " martial"       0.61
    " safety"        0.59
    " WAR"           0.59
    "achy"           0.59
    Activations Density 0.018%

    No Known Activations