Gemma Scope: helping the safety community shed light on the inner workings of language models

Technologies
Published 31 July 2024
Authors: Language Model Interpretability team

Announcing a comprehensive, open suite of sparse autoencoders for language model interpretability.

To create an artificial intelligence (AI) language model, researchers build a system that learns from vast amounts of data without human guidance. As a result, the inner workings of language models are often…
