Efficient Agentic Reasoning Through Self-Regulated Simulative Planning

Published in arXiv, 2026

Project website: https://sailing-lab.github.io/sr2am-self-regulated-planning/

SR²AM enables three modes within a single LLM: System I (reactive execution), System II (simulative planning), and System III (learned self-regulation: deciding when to plan, how far ahead, and when to act directly). A configurator regulates internal simulation: when to predict future states, how far ahead to predict, and when to skip prediction entirely. The LLM itself serves as the world model. Thinking longer ≠ thinking smarter; SR²AM knows which one it needs. As a result, a 30B model can compete with 685B and 1T models at a fraction of the token cost.
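The division of labor between the three modes can be illustrated with a toy sketch. This is not the SR²AM implementation: the `Decision` type, the `configure` rule, the `uncertainty` signal, the threshold, and the `act` and `simulate` callables are all hypothetical stand-ins. In SR²AM the regulation is learned and a single LLM plays every role, whereas here a hand-written rule is used purely to show the control flow.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Decision:
    plan: bool    # True -> System II (simulate first), False -> System I (act directly)
    horizon: int  # number of simulated steps; 0 when acting directly


def configure(uncertainty: float, threshold: float = 0.5, max_horizon: int = 4) -> Decision:
    """Hypothetical stand-in for System III: a fixed rule instead of a learned policy."""
    if uncertainty < threshold:
        return Decision(plan=False, horizon=0)
    # Look further ahead as uncertainty grows, capped at max_horizon.
    horizon = min(max_horizon, 1 + int((uncertainty - threshold) * 2 * max_horizon))
    return Decision(plan=True, horizon=horizon)


def step(
    state: int,
    uncertainty: float,
    act: Callable[[int], str],
    simulate: Callable[[int], int],
) -> Tuple[str, int]:
    """Run one agent step and return (action, number of simulated steps spent)."""
    decision = configure(uncertainty)
    if not decision.plan:
        return act(state), 0  # System I: react without simulating
    imagined = state
    for _ in range(decision.horizon):
        imagined = simulate(imagined)  # System II: predict the next state
    # Assumption of this sketch: the action is chosen from the predicted state.
    return act(imagined), decision.horizon


if __name__ == "__main__":
    act = lambda s: f"act({s})"
    simulate = lambda s: s + 1
    print(step(0, 0.2, act, simulate))  # low uncertainty: no simulation
    print(step(0, 1.0, act, simulate))  # high uncertainty: full look-ahead
```

The second element of the returned tuple counts simulated steps, a rough proxy for the token cost that self-regulation is meant to save: easy steps spend nothing on planning, and only uncertain steps pay for look-ahead.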

arXiv:2605.22138


Lara Sá Neves
