OpenMythos Visualizer

An interactive guide to the Recurrent-Depth Transformer — a theoretical reconstruction of the hypothesized Claude Mythos architecture. Same weights, more loops → deeper reasoning.

Embedding (vocab → dim)
Prelude (dense blocks · ×N)
Recurrent (MoE block · looped ×T)
Coda (dense blocks · ×M)
LM Head (→ logits)

Prelude runs once, Recurrent loops with shared weights, Coda runs once. The recurrent block is the unique core.
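The staging described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the project's implementation: the function name `forward`, the list-of-floats representation, the zero initial loop state, and the callables passed in are all assumptions, and the embedding and LM head are left out.

```python
from typing import Callable, List

Vector = List[float]
Stage = Callable[[Vector], Vector]


def forward(
    x: Vector,
    prelude: List[Stage],
    recurrent: Callable[[Vector, Vector], Vector],
    coda: List[Stage],
    loops: int,
) -> Vector:
    """Run prelude once, loop one shared block `loops` times, run coda once."""
    # Prelude: N distinct blocks, each applied exactly once.
    for block in prelude:
        x = block(x)

    e = x                  # encoded input, held fixed while looping
    h = [0.0] * len(e)     # loop state; zero initialisation is an assumption

    # Recurrent: the SAME callable (same weights) applied `loops` times.
    for _ in range(loops):
        h = recurrent(h, e)

    # Coda: M distinct blocks, each applied exactly once.
    for block in coda:
        h = block(h)
    return h
```

Raising `loops` changes how much computation is done without adding any new stage or parameter, which is the sense in which depth comes from looping here.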

h_{t+1} = A·h_t + B·e + Transformer(h_t, e)

The recurrent update rule, applied once per loop. Everything on this site is computed client-side from the model’s formulas.
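As a rough numerical illustration of the update rule, the sketch below iterates it on a plain list of floats. Several simplifications are assumed and are not taken from the site: A and B are scalar multiples of the identity (`a` and `b`), the hidden state starts at zero, and the default `block` is a small tanh placeholder standing in for the transformer block.

```python
import math


def recurrent_depth(e, loops, a=0.9, b=0.1, block=None):
    """Iterate h <- a*h + b*e + block(h, e) for `loops` steps and return h.

    `a` and `b` stand in for the matrices A and B (assumed scalar here).
    `block` stands in for the shared transformer block.
    """
    if block is None:
        # Placeholder nonlinearity, NOT a real transformer block.
        def block(h, e):
            return [0.1 * math.tanh(hi + ei) for hi, ei in zip(h, e)]

    h = [0.0] * len(e)  # assumed zero initial state
    for _ in range(loops):
        update = block(h, e)  # computed from the previous state h_t
        h = [a * hi + b * ei + ui for hi, ei, ui in zip(h, e, update)]
    return h
```

With the nonlinear term switched off and |a| < 1, the iteration settles at b·e / (1 − a), which shows how the re-injected input e anchors the state no matter how many loops are run.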

Why it is not a vanilla transformer

Looped recurrence
One transformer block run many times — depth comes from loops, not from more parameters.
Input injection
The encoded input e is re-injected every loop, keeping the original signal alive at any depth.
Adaptive compute
ACT halting lets easy tokens stop early while hard tokens keep reasoning — in the same batch.
Stable by construction
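The linear part of the update is constrained as a linear time-invariant (LTI) system with spectral radius ρ(A) < 1, so it decays rather than amplifies the hidden state as loops are added, which guards against exploding activations during training.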
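Two of the points above, adaptive halting and the bound on ρ(A), can be illustrated with a small sketch. Both functions are hypothetical simplifications rather than the model's actual code: `stable_diagonal` assumes A is diagonal and squashes its entries with a sigmoid, and `act_loop_counts` only counts loops per token under a cumulative-probability threshold, leaving out the remainder weighting used in full ACT.

```python
import math


def stable_diagonal(raw):
    """Map unconstrained parameters to diagonal entries in (0, 1).

    For a diagonal A the spectral radius is max |a_i|, so rho(A) < 1 holds
    in exact arithmetic. Extreme inputs can saturate or overflow in floating
    point and are not handled here.
    """
    return [1.0 / (1.0 + math.exp(-r)) for r in raw]


def act_loop_counts(halt_probs, threshold=0.99, max_loops=8):
    """Return how many loops each token runs under ACT-style halting.

    halt_probs[t][i] is the halting probability of token i at loop t.
    A token stops once its cumulative halting probability reaches
    `threshold`, or when `max_loops` loops have been run.
    """
    n_tokens = len(halt_probs[0])
    cumulative = [0.0] * n_tokens
    loops_used = [0] * n_tokens
    running = [True] * n_tokens

    for step_probs in halt_probs[:max_loops]:
        for i, p in enumerate(step_probs):
            if not running[i]:
                continue  # this token already halted; skip its compute
            loops_used[i] += 1
            cumulative[i] += p
            if cumulative[i] >= threshold:
                running[i] = False
        if not any(running):
            break  # every token in the batch has halted
    return loops_used
```

In this sketch a token that is confident after one loop stops there, while its neighbours in the same batch keep looping until their own cumulative probability crosses the threshold.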

Explore