⚠️ Rolling Release
An initial release of artifacts and datasets is available now, with all artifacts expected to be finalized by January 16, 2026.
Some data may be replaced or updated during this final verification and fine-tuning process. Please check Hugging Face for details.
Gemma Scope 2
Demo: Examining Safety-Relevant Features and Circuits in Gemma 3
👋 New Here?
If you're new to interpretability (the science of understanding what happens inside AI), we recommend you start with the original "Exploring Gemma Scope", which has more beginner-friendly interactive demos and content.
This Gemma Scope 2 demo focuses on exploring safety-relevant features in Gemma 3 27B-IT, the largest model in the new Gemma 3 series. Because the Gemma Scope 2 release also includes transcoders, cross-layer transcoders, and crosscoders, Neuronpedia is adding support for circuit tracing with those new artifacts.
🔢 Sections
🛡️
Safety & Alignment
Explore safety- and alignment-relevant features in Gemma 3.
Coming Soon
🔌
Circuit Tracing
Use prompts to activate and trace Gemma 3's internal reasoning steps.
📖
Dashboards + Inference
See top activating examples, search, and test with inference.