References

Papers, threads, and citations behind OpenMythos.

OpenMythos is an independent, community-driven theoretical reconstruction based on publicly available research and informed speculation. It is not affiliated with, endorsed by, or connected to Anthropic.

Papers

The research the architecture draws on, grouped by theme.

Core — looped & recurrent-depth

The recurrent-depth reasoning lineage that OpenMythos reconstructs: looping a shared block to trade compute for effective depth.

  • Loop, Think, & Generalize — Implicit Reasoning in Recurrent Depth Transformers

    arXiv:2604.07822
  • Parcae — Scaling Laws for Stable Looped Language Models

    arXiv:2604.12946
  • Parcae (project blog)

    Web
  • Reasoning with Latent Thoughts — On the Power of Looped Transformers (Saunshi et al., 2025)

    arXiv:2502.17416
  • Training Large Language Models to Reason in a Continuous Latent Space (COCONUT)

    arXiv:2412.06769
  • Relaxed Recursive Transformers — Effective Parameter Sharing with Layer-wise LoRA (Bae et al., 2024)

    arXiv:2410.20672
  • Universal Transformers (Dehghani et al., 2018)

    arXiv:1807.03819
  • Hyperloop Transformers

    arXiv:2604.21254
  • The Recurrent Transformer: Greater Effective Depth and Efficient Decoding

    arXiv:2604.21215
  • LT2: Linear-Time Looped Transformers

    arXiv:2605.20670
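
The idea shared by the papers above, looping one block to gain effective depth without adding parameters, can be illustrated with a toy example. This is a minimal sketch in plain Python, not the OpenMythos implementation: `shared_block`, `recurrent_depth`, the weights `W`, and the input `e` are all hypothetical names and values chosen for illustration.

```python
import math

def shared_block(h, w, e):
    """One pass of a toy shared block: h <- tanh(W h + e), where e is the injected input."""
    return [
        math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + e_i)
        for row, e_i in zip(w, e)
    ]

def recurrent_depth(e, w, n_loops):
    """Apply the same block n_loops times; the parameter count does not change."""
    h = [0.0] * len(e)
    for _ in range(n_loops):
        h = shared_block(h, w, e)
    return h

W = [[0.5, -0.2], [0.1, 0.3]]  # hypothetical shared weights (4 parameters)
e = [1.0, -1.0]                # hypothetical encoded input

shallow = recurrent_depth(e, W, n_loops=1)  # effective depth 1
deep = recurrent_depth(e, W, n_loops=8)     # effective depth 8, same 4 parameters
n_params = sum(len(row) for row in W)
```

Raising `n_loops` spends more compute on the same weights, which is the compute-for-depth trade described above.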

Attention

Compressed KV caches and grouped queries that keep the recurrent decode loop affordable.

  • Mixture-of-Depths Attention

    arXiv:2603.15619
  • DeepSeek-V2 (Multi-head Latent Attention)

    arXiv:2405.04434
  • GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints (Ainslie et al., 2023)

    arXiv:2305.13245
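
To see why grouped queries and a compressed latent cache reduce decode cost, it helps to count what is stored per token per layer. The sketch below is back-of-the-envelope arithmetic under assumed sizes. The head counts and dimensions are hypothetical and are not taken from any OpenMythos configuration, and the function names are illustrative.

```python
def kv_cache_elems_per_token(n_kv_heads, head_dim):
    """Elements cached per token per layer when full keys and values are stored."""
    return 2 * n_kv_heads * head_dim  # one K and one V vector per KV head

def latent_cache_elems_per_token(kv_latent_dim, rope_dim):
    """Elements cached per token per layer for an MLA-style compressed cache:
    one shared latent vector plus a small decoupled rotary key."""
    return kv_latent_dim + rope_dim

# Hypothetical sizes, chosen only to make the comparison concrete.
head_dim = 128
mha = kv_cache_elems_per_token(n_kv_heads=32, head_dim=head_dim)     # every head keeps its own K/V
gqa = kv_cache_elems_per_token(n_kv_heads=8, head_dim=head_dim)      # 4 query heads share each KV head
mla = latent_cache_elems_per_token(kv_latent_dim=512, rope_dim=64)   # compressed latent cache

print(mha, gqa, mla)  # 8192 2048 576
```

With these assumed sizes, grouping cuts the cache by 4x and the latent cache by roughly 14x relative to full multi-head attention.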

Mixture-of-Experts

Fine-grained expert segmentation and shared-expert isolation behind the routed FFN.

  • DeepSeekMoE — Fine-grained expert segmentation and shared expert isolation (Dai et al., 2024)

    arXiv:2401.06066
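
A routed FFN with shared-expert isolation can be sketched as follows: shared experts process every token, while a router selects the top-k routed experts and mixes their outputs. This is a toy scalar version in plain Python. The function names, the scalar "experts", and the choice to renormalize the top-k weights are assumptions for illustration, not the `MoEFFN` code.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [v / total for v in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring routed experts,
    with weights renormalized to sum to 1 (an assumed design choice)."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_ffn(x, router_logits, routed_experts, shared_experts, k):
    """Shared experts always run; routed experts run only when selected."""
    out = sum(f(x) for f in shared_experts)
    for i, w in route_top_k(router_logits, k):
        out += w * routed_experts[i](x)
    return out

# Toy scalar experts (hypothetical): routed expert i scales its input.
routed = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
shared = [lambda x: x + 1.0]
logits = [0.0, 2.0, 0.0, 2.0]

selected = route_top_k(logits, k=2)
y = moe_ffn(1.0, logits, routed, shared, k=2)
```

Here the router picks experts 1 and 3 with equal weight, so only two of the four routed experts do any work for this input.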

Foundations

The building blocks — adaptive computation, normalization, and positional encoding.

  • Adaptive Computation Time for Recurrent Neural Networks (Graves, 2016)

    arXiv:1603.08983
  • Root Mean Square Layer Normalization (Zhang & Sennrich, 2019)

    arXiv:1910.07467
  • RoFormer: Enhanced Transformer with Rotary Position Embedding (Su et al., 2021)

    arXiv:2104.09864
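
Of these building blocks, adaptive computation is the one that lets a looped model stop early. The sketch below follows the halting rule from Graves (2016): accumulate per-step halting probabilities until they reach 1 - eps, then give the remainder to the final step so the weights sum to 1. The function name and the sample probabilities are hypothetical, and how OpenMythos applies the rule per position is not shown here.

```python
def act_weights(halt_probs, eps=0.01):
    """Graves-style ACT: per-step weights that sum to 1, stopping at the first step
    where the cumulative halting probability reaches 1 - eps."""
    weights, total = [], 0.0
    for i, p in enumerate(halt_probs):
        is_last = i == len(halt_probs) - 1
        if total + p >= 1.0 - eps or is_last:
            weights.append(1.0 - total)  # remainder goes to the final step
            break
        weights.append(p)
        total += p
    return weights

halt_probs = [0.1, 0.3, 0.7, 0.9]  # hypothetical halting outputs over 4 loops, one position
w = act_weights(halt_probs)
steps_used = len(w)
```

In this example the cumulative probability crosses the threshold at the third loop, so the fourth loop is never run; the output would be the `w`-weighted sum of the first three hidden states.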

Threads & discussion

Community analysis and debate on X about looped transformers.

Source map

Every architectural concept on this site maps to a specific line range in the OpenMythos implementation.

  • Full forward pass: Prelude → Recurrent → Coda

    open_mythos/main.py · lines 992-1034 · OpenMythos.forward

  • The recurrent loop with ACT early exit

    open_mythos/main.py · lines 825-891 · RecurrentBlock.forward

  • Stable input injection — ρ(A) < 1 by construction

    open_mythos/main.py · lines 684-742 · LTIInjection

  • Per-position halting probability head

    open_mythos/main.py · lines 750-780 · ACTHalting

  • Sinusoidal loop-index signal over recurrence depth

    open_mythos/main.py · lines 541-570 · loop_index_embedding

  • Depth-wise LoRA: per-loop low-rank delta

    open_mythos/main.py · lines 578-624 · LoRAAdapter

  • Fine-grained MoE: routed + shared experts

    open_mythos/main.py · lines 456-533 · MoEFFN

  • Multi-Latent Attention with compressed KV cache

    open_mythos/main.py · lines 284-418 · MLAttention

  • Grouped Query Attention

    open_mythos/main.py · lines 177-276 · GQAttention

  • Pre-norm block: attention + (MoE | dense) FFN

    open_mythos/main.py · lines 627-676 · TransformerBlock

  • SwiGLU feed-forward expert

    open_mythos/main.py · lines 426-453 · Expert

  • Full configuration surface

    open_mythos/main.py · lines 17-81 · MythosConfig

  • Pre-configured scales 1B → 1T

    open_mythos/variants.py · lines 1-199 · variants

  • Depth-aware unified attention (experimental)

    open_mythos/moda.py · lines 671-821 · MoDAAttention

  • DeepSeek-style MoE with shared experts (experimental)

    open_mythos/moda.py · lines 452-630 · DeepSeekMoE
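
One entry above, stable input injection with ρ(A) < 1 by construction, names a property that is easy to check on a toy linear recurrence h ← A h + B e. The sketch below assumes a diagonal A whose entries are squashed into (0, 1) by a sigmoid, which keeps the spectral radius below 1 for any raw parameter values. That parameterization is an assumption made for illustration; the actual `LTIInjection` code may differ, and all names here are hypothetical.

```python
import math

def stable_diag(raw):
    """Map unconstrained parameters to diagonal entries in (0, 1) via a sigmoid."""
    return [1.0 / (1.0 + math.exp(-r)) for r in raw]

def lti_step(h, a, b, e):
    """One step of h_next = A h + B e with diagonal A and B."""
    return [a_i * h_i + b_i * e_i for a_i, h_i, b_i, e_i in zip(a, h, b, e)]

raw = [-2.0, 0.0, 3.0]         # hypothetical unconstrained parameters
a = stable_diag(raw)
rho = max(abs(v) for v in a)   # spectral radius of a diagonal matrix

b = [1.0, 1.0, 1.0]
e = [1.0, -1.0, 0.5]           # hypothetical injected input, held fixed across loops
h = [0.0, 0.0, 0.0]
for _ in range(500):
    h = lti_step(h, a, b, e)

# With rho < 1 the recurrence converges to (I - A)^{-1} B e instead of blowing up.
fixed_point = [b_i * e_i / (1.0 - a_i) for a_i, b_i, e_i in zip(a, b, e)]
```

Because ρ(A) < 1 holds for every choice of `raw`, the hidden state stays bounded no matter how many loops are run, which is the point of enforcing stability by construction rather than by regularization.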

Citation

If you reference OpenMythos, please cite it as follows.

BibTeX
@software{gomez2026openmythos,
  author    = {Kye Gomez},
  title     = {OpenMythos: A Theoretical Reconstruction of the Claude Mythos Architecture},
  year      = {2026},
  url       = {https://github.com/kyegomez/OpenMythos},
  note      = {Recurrent-Depth Transformer with MoE, MLA, LTI-stable injection, and ACT halting}
}