Neuronpedia

© Neuronpedia 2025

GPT2-Small / 11 / 2172

Explanations

"able"/"unable"; "said"; "up"; "have"; "addition"; "relation"; "response"; "regard"

verbs indicating the beginning or initiation of an action.

oai_token-act-pair · gpt-4-turbo

No Known Activations