Initial Gemma-2B-IT SAEs

Joseph Bloom · May 2024

Exploring Gemma Scope

An Introduction to AI Interpretability and the Inner Workings of Gemma 2 2B

👋 Hello!

The inner workings of modern AIs are largely a mystery. This is because these AIs are language models that are grown through training, not designed by hand.

The science of understanding what happens inside AI is called interpretability.

This demo is a beginner-friendly introduction to interpretability that explores an AI model called Gemma 2 2B. It also includes material that is relevant to readers already familiar with the topic.

HOVER TIPS

CLICKABLE TIPS

❕ Caveats and Warnings

🧑‍🔬 Advanced Technical Details

🔧 Get Started

Already know what SAEs are? Browse SAEs