
    Neuronpedia

    Neuronpedia is an open source interpretability platform.
    Explore, visualize, and steer the internals of AI models.
    Featured Releases

    Assistant Axis: Monitor and Stabilize the Character of an LLM
    Lu et al., 2026
    Circuit Tracer: Trace the Internal Reasoning Steps of a Model
    Multi-Org
    Gemma Scope 2: SAEs and Transcoders for Gemma 3
    Google DeepMind
    MIT Technology Review · Anthropic · Google DeepMind · VentureBeat · OpenMOSS, Fudan University · EleutherAI · Apollo Research
    Explore
    Browse more than five terabytes of activations, explanations, and metadata.
    Neuronpedia supports probes, latents/features, custom vectors, concepts, and more.

    Releases

    Gemma Scope 2: Comprehensive Suite of SAEs and Transcoders for Gemma 3
    Language Model Interpretability Team, Google DeepMind
    Temporal Feature Analysis
    Lubana, Rager, Hindupur, et al.
    gpt-oss BatchTopK SAEs
    Andy Arditi
    Finding Misaligned Persona Features in Open-Weight Models
    Andy Arditi
    Circuit Tracer Transcoders
    Hanna & Piotrowski
    A Bunch of Matryoshka SAEs
    David Chanin
    Llama 3.3 70B Instruct SAE
    Goodfire
    Llama Scope R1: SAEs for DeepSeek-R1-Distill-Llama-8B
    OpenMOSS Team, Fudan University
    Gemma Scope - Exploring the Inner Workings of Gemma 2
    Language Model Interpretability Team, Google DeepMind
    AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders
    pyvene.ai, The Stanford NLP Group
    Llama Scope: SAEs for Llama-3.1-8B
    OpenMOSS Team, Fudan University
    Feature Splitting for GPT2-Small
    Joseph Bloom
    Multi TopK SAE for Llama3.1-8B
    EleutherAI
    Sparse Autoencoder for GPT2-Small - v5
    OpenAI
    Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning
    Apollo Research · Jordan Taylor
    Transcoders Enable Fine-Grained Interpretable Circuit Analysis for Language Models
    Jacob Dunefsky · Philippe Chlenski
    Sparse Autoencoders for Pythia-70M-Deduped
    Under Peer Review
    Attention SAE Research Paper
    Under Peer Review
    Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small
    Joseph Bloom

    Models

    CircuitGPT-Python · OpenAI
    Gemma-3-27B · Google DeepMind
    Gemma-3-12B · Google DeepMind
    Gemma-3-270M-IT · Google DeepMind
    Gemma-3-1B-IT · Google DeepMind
    Gemma-3-4B-IT · Google DeepMind
    Gemma-3-12B-IT · Google DeepMind
    Gemma-3-27B-IT · Google DeepMind
    Gemma-3-270M · Google DeepMind
    Gemma-3-4B · Google DeepMind
    Gemma-3-1B · Google DeepMind
    Gemma-2-27B · Google DeepMind
    GPT-OSS-20B · OpenAI
    Qwen2.5-7B-IT · Alibaba
    Llama3.1-8B-IT (Instruct) · Meta
    Qwen3-1.7B · Alibaba
    Qwen3-4B · Alibaba
    Llama3.3-70B-IT (Instruct) · Meta
    DeepSeek-R1-Dist-Llama-8B · DeepSeek
    Gemma-2-2B-IT · Google DeepMind
    Gemma-2-9B-IT · Google DeepMind
    Llama3.1-8B (Base) · Meta
    Gemma-2-2B · Google DeepMind
    Gemma-2-9B · Google DeepMind
    Pythia-70M-Deduped · EleutherAI
    GPT2-Small · OpenAI

    Jump To

    Jump directly to a source/SAE, a specific feature, or a random feature.
    Graph
    Visualize and trace the internal reasoning steps of a model on custom prompts, using the attribution-graph approach pioneered by Anthropic's circuit tracing papers.
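    As a rough illustration of what generating a graph programmatically could look like, here is a minimal Python sketch; the endpoint path, header name, and request fields below are assumptions for illustration only, not the documented API, so check the API docs for the actual spec.

        import requests

        # Hypothetical sketch only: the endpoint path, header name, and request
        # fields are assumptions; see the Neuronpedia API docs for the real spec.
        API_KEY = "YOUR_NEURONPEDIA_API_KEY"  # placeholder

        resp = requests.post(
            "https://www.neuronpedia.org/api/graph/generate",  # assumed path
            headers={"x-api-key": API_KEY},                    # assumed header name
            json={
                "modelId": "gemma-2-2b",                    # model to trace
                "prompt": "Fact: the capital of Texas is",  # prompt to analyze
            },
            timeout=120,
        )
        resp.raise_for_status()
        graph = resp.json()
        print(list(graph.keys()))  # inspect the returned attribution-graph structure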
    Steer
    Modify model behavior by steering its activations using latents or custom vectors. Steering supports instruct (chat) and reasoning models, with fully customizable temperature, steering strength, seed, and other generation settings.
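    Below is a minimal sketch of what a steering call might look like from Python; the endpoint path, payload fields, and the source/index values are assumptions for illustration, not the documented API.

        import requests

        # Hypothetical sketch only: endpoint path, payload fields, and the
        # source/index values are assumptions, not the documented API.
        API_KEY = "YOUR_NEURONPEDIA_API_KEY"  # placeholder

        resp = requests.post(
            "https://www.neuronpedia.org/api/steer",  # assumed path
            headers={"x-api-key": API_KEY},           # assumed header name
            json={
                "modelId": "gemma-2-9b-it",
                "prompt": "Tell me about your day.",
                "features": [
                    # assumed shape: one latent from a chosen source, plus a strength
                    {"source": "9-gemmascope-res-16k", "index": 12345, "strength": 8.0},
                ],
                "temperature": 0.7,  # sampling temperature
                "seed": 42,          # fixed seed for reproducibility
                "n_tokens": 48,      # number of tokens to generate
            },
            timeout=60,
        )
        resp.raise_for_status()
        print(resp.json())  # steered completion (typically alongside a default one to compare)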
    Search
    Search over 50,000,000 latents/vectors, either by semantic similarity to explanation text, or by running custom text via inference through a model to find top matches.

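    A minimal sketch of a semantic search over explanations from Python follows; the endpoint path and payload fields are assumptions for illustration, not the documented API.

        import requests

        # Hypothetical sketch only: the endpoint path and payload fields are
        # assumptions; see the API docs for the actual search endpoints.
        resp = requests.post(
            "https://www.neuronpedia.org/api/explanation/search",  # assumed path
            json={
                "modelId": "gpt2-small",                    # restrict to one model
                "query": "references to cats and kittens",  # free-text query
            },
            timeout=30,
        )
        resp.raise_for_status()
        for match in resp.json().get("results", [])[:5]:
            print(match)  # top-matching latents with their explanations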
    API + Libraries
    Neuronpedia has hosted the world's first interpretability API since March 2024, and all functionality is available via the API or the Python/TypeScript libraries. Most endpoints have an OpenAPI spec and interactive docs.
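    As a sketch of the kind of call the API supports, the snippet below fetches one feature's dashboard data with plain requests; the model/source/index path pattern and the response fields are assumptions here, so consult the OpenAPI spec and interactive docs for the real interface.

        import requests

        # Sketch of fetching one feature's dashboard data. The model/source/index
        # path pattern and the response fields are assumptions; consult the
        # OpenAPI spec for the actual interface.
        model, source, index = "gpt2-small", "9-res-jb", 14057
        resp = requests.get(
            f"https://www.neuronpedia.org/api/feature/{model}/{source}/{index}",
            timeout=30,
        )
        resp.raise_for_status()
        feature = resp.json()
        print(feature.get("explanations"))  # e.g. generated explanation(s) for this latent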
    Inspect
    Go in depth on each probe/latent/feature with top activations, top logits, activation density, and live inference testing. Every dashboard has a unique link, can be compiled into shareable lists, and supports iframe embedding.
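    Here is a small sketch of how a dashboard's unique link could be turned into an embed; the embed=true query parameter and the iframe dimensions are assumptions for illustration rather than documented behavior.

        # Sketch: turn a dashboard's unique link into an embeddable iframe.
        # The "embed=true" query parameter and the iframe size are assumptions;
        # dashboard URLs follow a model/source/index pattern.
        model, source, index = "gpt2-small", "9-res-jb", 14057
        dashboard_url = f"https://www.neuronpedia.org/{model}/{source}/{index}"
        embed_url = f"{dashboard_url}?embed=true"  # assumed embed flag

        iframe_html = f'<iframe src="{embed_url}" width="540" height="640"></iframe>'
        print(iframe_html)  # paste into your own page to embed the dashboard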
    Who We Are
    Neuronpedia was created by Johnny Lin, an ex-Apple engineer who previously founded a privacy startup. Neuronpedia is supported by Decode Research, Open Philanthropy, the Long Term Future Fund, AISTOF, Anthropic, Manifund, and others.
    Get Involved
    Citation
    @misc{neuronpedia,
        title = {Neuronpedia: Interactive Reference and Tooling for Analyzing Neural Networks},
        year = {2023},
        note = {Software available from neuronpedia.org},
        url = {https://www.neuronpedia.org},
        author = {Lin, Johnny}
    }