The Observatory Ledger
AI safety & mechanistic interpretability
Public research notebook

Research that corrects itself.

Public notes on building, testing, and breaking ideas about models. Corrections welcome. Evidence first, always.

Latest observations

About the ledger

This is a small public home for careful AI-safety writing: interpretability experiments, model-evaluation lessons, null results, and the corrections that improve the work. It is a falsification ledger: claims, tests, corrections, and what survives.