Google releases Gemma Scope 2, the largest open-source interpretability toolkit to date, covering all Gemma 3 model sizes from 270M to 27B parameters.
- Built using sparse autoencoders (SAEs) and transcoders to reveal internal model states and decision-making processes
- Training involved storing roughly 110 petabytes of data; the released tools comprise over 1 trillion parameters in total
- Includes skip-transcoders and cross-layer transcoders for deciphering multi-step computations across model layers
- Uses the Matryoshka training technique to improve concept detection and fix flaws found in the original Gemma Scope
- Provides chat-tuned model analysis tools targeting jailbreaks, refusal mechanisms, and chain-of-thought faithfulness
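To make the SAE idea above concrete, here is a minimal toy sketch of what a sparse autoencoder does: it maps a model's activation vector into a much wider, mostly-zero feature vector and then reconstructs the original activation. All dimensions, weights, and the `l1_coeff` value are hypothetical illustrations, not Gemma Scope's actual architecture or training setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; real Gemma Scope SAEs are far larger).
d_model, d_sae = 16, 64            # activation width, dictionary size
W_enc = rng.normal(0, 0.1, (d_model, d_sae))
b_enc = np.zeros(d_sae)
W_dec = rng.normal(0, 0.1, (d_sae, d_model))
b_dec = np.zeros(d_model)

def sae_forward(x):
    """Encode an activation into sparse features, then reconstruct it."""
    f = np.maximum(x @ W_enc + b_enc, 0.0)   # ReLU keeps features nonnegative and sparse
    x_hat = f @ W_dec + b_dec                # linear decode back to model space
    return f, x_hat

def sae_loss(x, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty that encourages sparsity."""
    f, x_hat = sae_forward(x)
    return np.mean((x - x_hat) ** 2) + l1_coeff * np.abs(f).sum()

x = rng.normal(size=d_model)       # stand-in for a residual-stream activation
f, x_hat = sae_forward(x)
print(f.shape, x_hat.shape)        # feature vector is wider than the activation
```

In real use the SAE is trained on billions of stored activations (hence the petabyte-scale storage mentioned above), and each learned feature can then be inspected as a candidate interpretable concept. A transcoder follows the same pattern but reconstructs the *output* of a model component from its input rather than reconstructing the same activation.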
This summary was automatically generated by AI based on the original article and may not be fully accurate.