
AI Engineer World's Fair · 2026
recursivecodingagents.com
Motivation
We all want outcomes.
Agents that work on our behalf — reliable co-workers — while we're out on a hike.
The bottleneck is not intelligence. It's reliability. It's trust.
- One day — my agents build me a full SaaS app from a single prompt.
- The next day — they empty the entire contents of my Solana wallet.

Thesis
Today’s agents are mismanaged geniuses
The intelligence is there.
The missing layer is how we specify, manage, reuse, and verify the work.
Recursive Language Models
Context itself is the object of computation
- Externalize — the full prompt lives in a REPL, not the context window.
- Operate — the model writes code to inspect, slice, and transform it.
- Recurse — it sub-queries itself over the slices.
arXiv:2512.24601

Root RLM (depth=0)
├── Sub-RLM A (depth=1)
│   ├── LLM A1 (depth=2)
│   └── LLM A2 (depth=2)
└── Sub-RLM B (depth=1)
    ├── LLM B1 (depth=2)
    └── LLM B2 (depth=2)
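The externalize / operate / recurse loop can be sketched in a few lines of Python. This is a simplified illustration, not the paper's reference implementation: `rlm`, `toy_llm`, the `env` dictionary, and the `SPLIT` / `ANSWER` prompt prefixes are all hypothetical names invented here. In a real RLM the model writes arbitrary code in a REPL; in this sketch it only chooses cut points, and a deterministic fake model stands in for the LLM so the example runs offline.

```python
from typing import Callable, Dict, List

LLM = Callable[[str], str]  # hypothetical stand-in for any model call


def rlm(query: str, env: Dict[str, List[str]], key: str, llm: LLM,
        depth: int = 0, max_depth: int = 3, leaf_items: int = 4) -> str:
    """Answer `query` over env[key] by recursing over model-chosen slices."""
    items = env[key]  # Externalize: the input lives in `env`, not in a prompt.
    if depth >= max_depth or len(items) <= leaf_items:
        # Leaf: a plain LLM call over a small slice.
        return llm(f"ANSWER {query}\n" + "\n".join(items))

    # Operate: the model sees only metadata and chooses the cut points itself.
    plan = llm(f"SPLIT count={len(items)}")
    cuts = sorted({int(c) for c in plan.split(",") if c.strip().isdigit()})
    # Fall back to halving if the plan contains no usable cut point.
    cuts = [c for c in cuts if 0 < c < len(items)] or [len(items) // 2]
    bounds = [0, *cuts, len(items)]

    # Recurse: each slice becomes a new handle and a sub-call one level deeper.
    partials = []
    for i, (start, stop) in enumerate(zip(bounds, bounds[1:])):
        child = f"{key}/{i}"
        env[child] = items[start:stop]
        partials.append(rlm(query, env, child, llm, depth + 1, max_depth, leaf_items))
    return llm(f"ANSWER {query}\n" + "\n".join(partials))


def toy_llm(prompt: str) -> str:
    """Deterministic fake model so the sketch runs without network access."""
    head, _, body = prompt.partition("\n")
    if head.startswith("SPLIT"):
        return str(int(head.split("count=")[1]) // 2)  # "decides" to halve
    hits = [line for line in body.splitlines() if line.startswith("secret=")]
    return hits[0] if hits else "none"


notes = [f"note {i}" for i in range(16)]
notes[11] = "secret=42"
env = {"root": notes}
answer = rlm("find the secret", env, "root", toy_llm)
print(answer)
```

With 16 items and a leaf size of 4, the call tree has the same shape as the diagram above: one root, two sub-calls at depth 1, and four leaf calls at depth 2. No single call receives more than four of the original lines.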
Recursive Language Models
Code Execution As Reasoning
- Can process inputs far beyond the context window (Oolong; see the RLM paper).
- An RLM is itself a powerful memory system (see the LongMemEval results).
- An RLM can achieve SOTA on long reasoning tasks, even with very small models (see the LongCoT results).

RLMs: Too Hot To Benchmark
The RLM rubric
Lots of things feel close.
| Approach | Executable environment | Prompt externalized | Code calls the model | Model picks decomposition | State stays symbolic |
|---|---|---|---|---|---|
| Plain long-context call (RAG / reasoning-only) | | | | | |
| Coding agents + subagents (including loops) | | | | | |
| Hardcoded map-reduce (developer-authored pipeline, e.g. λ-RLM) | | | | | |
| Recursive Language Model (passes every check) | ✓ | ✓ | ✓ | ✓ | ✓ |
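The rubric reads naturally as a five-item checklist in which a system counts as an RLM only if every item holds. A minimal sketch of that idea follows; the class `RlmRubric` and the example values for `scripted_pipeline` are illustrative assumptions, not the per-cell marks from the slide.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class RlmRubric:
    """The five rubric columns as booleans (hypothetical encoding)."""
    executable_environment: bool
    prompt_externalized: bool
    code_calls_the_model: bool
    model_picks_decomposition: bool
    state_stays_symbolic: bool

    def failed(self) -> List[str]:
        """Names of the checks that do not hold."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def is_rlm(self) -> bool:
        """A system is an RLM only if it passes every check."""
        return not self.failed()


# "Recursive Language Model: passes every check."
recursive_lm = RlmRubric(True, True, True, True, True)

# Assumed example: a pipeline whose split is written by the developer.
scripted_pipeline = RlmRubric(
    executable_environment=True,
    prompt_externalized=True,
    code_calls_the_model=True,
    model_picks_decomposition=False,  # the script fixes the split ahead of time
    state_stays_symbolic=True,
)

print(recursive_lm.is_rlm(), scripted_pipeline.failed())
```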
Towards Recursive Coding Agents
RLM / LLM
Root RLM (depth=0)
├── Sub-RLM A (depth=1)
│   ├── LLM A1 (depth=2)
│   └── LLM A2 (depth=2)
└── Sub-RLM B (depth=1)
    ├── LLM B1 (depth=2)
    └── LLM B2 (depth=2)

Agent / Sub-Agent
Root Agent (depth=0)
├── Sub-Agent A (depth=1)
│   ├── Sub-Agent A1 (depth=2)
│   └── Sub-Agent A2 (depth=2)
└── Sub-Agent B (depth=1)
    ├── Sub-Agent B1 (depth=2)
    └── Sub-Agent B2 (depth=2)
Towards Recursive Coding Agents
Either... Trick question: RLMs are Recursive Coding Agents.
Or... How can we apply the principles of RLMs to coding agents?
My Experiments
Finding ypi
Built on Pi (minimal, extensible). Previously, Pi extensions could not support recursion, so I forked it. Y is for the Y-combinator.
- Wrapper CLI: ypi, a fully recursive Pi agent.
- Pi Extension: pi-recursive, which makes any existing Pi config recursive.
The RLM ecosystem
Other notable projects
- alexzhang13/rlm (Python): The reference implementation from Alex Zhang and the RLM paper authors; the cleanest place to read the core recursion loop.
- stanfordnlp/dspy (Python): dspy.RLM exposes RLM as a composable module inside larger DSPy programs. It is what I use for most of my own experiments.
- ax-llm/ax (TypeScript): A TypeScript, DSPy-style framework with first-class RLM support: AxAgent recursive decomposition and a persistent JS runtime.
- openprose/unix-rlm (Shell): An RLM whose sandbox is a full Linux filesystem: one bash script, the whole computer as the environment.
- openprose/prose (TypeScript): A declarative markdown language the agent compiles into reliable, RLM-style execution.
Dynamic workflows made Claude Code recursive.
Claude can write an orchestration script, then run a fleet of subagents. The line is whether the model chooses the decomposition, or the script fixes it ahead of time.
Claude Code blog · harness for every task
RLM example · model-chosen split: file-handle-clean.workflow.js
Decomposer reads a corpus handle, writes slice handles, then subagents extract and validate those slices.
not-RLM contrast · script-fixed split: hardcoded-map-reduce.workflow.js
It has handles, subagents, and state, but the windows, fan-out, reducer, and stop rule are fixed in code.
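The difference between the two workflows comes down to who produces the split plan. The Python sketch below illustrates that line. It is not the content of either workflow file: `script_fixed_plan`, `model_chosen_plan`, and `toy_decomposer` are hypothetical names, and the "decomposer" is a deterministic stand-in for a model call.

```python
from typing import Callable, List

Plan = List[List[str]]  # groups of handles, one group per subagent


def script_fixed_plan(handles: List[str], window: int = 2) -> Plan:
    """not-RLM: window size and fan-out are constants chosen by the developer."""
    return [handles[i:i + window] for i in range(0, len(handles), window)]


def model_chosen_plan(handles: List[str],
                      decomposer: Callable[[List[str]], Plan]) -> Plan:
    """RLM-style: a model call returns the grouping; code only validates it."""
    plan = decomposer(handles)
    assigned = [h for group in plan for h in group]
    if sorted(assigned) != sorted(handles):
        raise ValueError("plan must cover every handle exactly once")
    return plan


def toy_decomposer(handles: List[str]) -> Plan:
    """Fake model: groups handles by file extension (an assumed heuristic)."""
    groups: dict = {}
    for handle in handles:
        groups.setdefault(handle.rsplit(".", 1)[-1], []).append(handle)
    return list(groups.values())


handles = ["a.py", "b.md", "c.py", "d.md", "e.py"]
fixed = script_fixed_plan(handles)
chosen = model_chosen_plan(handles, toy_decomposer)
print(fixed)
print(chosen)
```

Both plans hand slices to subagents, but only the second can change shape with the input without anyone editing the script.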
For (almost) any coding agent
A language compiled by the agent, not the computer.
A markdown spec plus a giant prompt, in logical English. No new syntax to learn.
The key: a declarative contract the agent must satisfy to be “done.” That answers the reliability question.
Any agent with a filesystem and subagents can run it — and behave like an RLM.
See “Stop Babysitting Agents, Start Authoring Outcomes” on Turing Post.
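One way to picture a "done" contract is as a list of named, machine-checkable requirements that the harness re-evaluates after every agent step. The sketch below is an assumption-laden illustration in Python, not OpenProse syntax: `Requirement`, `unmet`, `run_until_done`, and `toy_agent_step` are hypothetical, and the toy agent simply writes the files the contract asks for.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


@dataclass(frozen=True)
class Requirement:
    name: str
    check: Callable[[Path], bool]  # True once the requirement is satisfied


def unmet(workdir: Path, contract: List[Requirement]) -> List[str]:
    """Names of requirements that do not yet hold in the working directory."""
    return [r.name for r in contract if not r.check(workdir)]


def run_until_done(workdir: Path, contract: List[Requirement],
                   agent_step: Callable[[Path, List[str]], None],
                   max_steps: int = 5) -> bool:
    """Keep stepping the agent until the contract holds or the budget runs out."""
    for _ in range(max_steps):
        missing = unmet(workdir, contract)
        if not missing:
            return True
        agent_step(workdir, missing)
    return not unmet(workdir, contract)


def toy_agent_step(workdir: Path, missing: List[str]) -> None:
    """Stand-in for a real agent: fixes one missing requirement per step."""
    report = workdir / "report.md"
    if "report exists" in missing:
        report.write_text("# Findings\n")
    elif "report cites evidence" in missing:
        report.write_text(report.read_text() + "evidence: src/auth.py\n")


def has_evidence(workdir: Path) -> bool:
    report = workdir / "report.md"
    return report.is_file() and "evidence:" in report.read_text()


contract = [
    Requirement("report exists", lambda d: (d / "report.md").is_file()),
    Requirement("report cites evidence", has_evidence),
]

with tempfile.TemporaryDirectory() as tmp:
    done = run_until_done(Path(tmp), contract, toy_agent_step)
    never_done = run_until_done(Path(tmp) / "missing", contract,
                                lambda d, m: None, max_steps=2)
print(done, never_done)
```

The agent never gets to declare itself finished; the contract does, which is the design choice the slide points at.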
OpenProse explicitly declares subagent work
Here are two OpenProse examples where the model turns an external handle into smaller handles, then verifies the child-work trace.
Recursive decomposition: handle-recursive-reader.prose.md
- Starts from an external prompt_handle; root does not read the whole thing.
- The model decides terminal vs. nonterminal handle.
- Nonterminal handles produce child handles and call the same contract again.

if nonterminal:
    for child in manifest:
        recurse(child.path)

Directory handle slicer: directory-handle-slicer.prose.md
- Starts from a repo or directory handle, not copied root context.
- The model uses search to choose relevant file handles for the question.
- Workers inspect only assigned handles; aggregation cites worker evidence.

manifest = model_slice(directory)
for child in manifest:
    worker(child.path only)
validate worker provenance
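The two pseudocode fragments above can be combined into one runnable Python sketch. Everything here is an illustrative assumption rather than OpenProse's implementation: the in-memory `tree` stands in for a repo, `is_terminal` stands in for the model's terminal vs. nonterminal decision, and `Finding`, `read_handle`, and `validate_provenance` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Union

Tree = Dict[str, Union[str, List[str]]]  # handle -> content, or child handles


@dataclass(frozen=True)
class Finding:
    handle: str  # the handle the worker claims its evidence came from
    text: str


def read_handle(handle: str, tree: Tree,
                is_terminal: Callable[[str, Tree], bool],
                worker: Callable[[str, str], Finding],
                assigned: Set[str]) -> List[Finding]:
    """Recurse over nonterminal handles; workers see one terminal handle each."""
    if is_terminal(handle, tree):
        assigned.add(handle)  # record the assignment before the worker runs
        return [worker(handle, tree[handle])]
    findings: List[Finding] = []
    for child in tree[handle]:  # the manifest of child handles
        findings += read_handle(child, tree, is_terminal, worker, assigned)
    return findings


def validate_provenance(findings: List[Finding], assigned: Set[str]) -> bool:
    """Every finding must cite a handle that was actually assigned to a worker."""
    return all(f.handle in assigned for f in findings)


tree: Tree = {
    "repo": ["repo/src", "repo/README.md"],
    "repo/src": ["repo/src/auth.py", "repo/src/db.py"],
    "repo/src/auth.py": "def login(): ...",
    "repo/src/db.py": "def connect(): ...",
    "repo/README.md": "# Demo",
}

assigned: Set[str] = set()
findings = read_handle(
    "repo", tree,
    is_terminal=lambda h, t: isinstance(t[h], str),  # toy stand-in for the model
    worker=lambda h, content: Finding(h, f"{len(content)} chars"),
    assigned=assigned,
)
forged = findings + [Finding("repo/other.py", "made up")]
print(validate_provenance(findings, assigned), validate_provenance(forged, assigned))
```

The root never touches file contents; it only passes handles down and checks that the evidence coming back cites handles it actually handed out.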
Applied recursive coding agents
What can you actually do with recursive coding agents?
- Repo-scale migrations (Claude dynamic workflows): Split a large refactor into modules, callsites, or failing tests; workers patch isolated worktrees, then reviewer agents check the merge.
- Repo-handle investigation (OpenProse): Start from a directory handle; the decomposer chooses file handles, workers inspect one file each, and aggregation cites their evidence.
- Audits and bug sweeps (Claude dynamic workflows): Parallel agents audit routes, auth, data access, and risky patterns; skeptic agents challenge findings before anything reaches the report.
- Golden sessions into programs (OpenProse): Mine high-quality Claude, Codex, or Pi sessions into versioned .prose.md workflows with phases, gates, loops, and validation evidence.
Recursive Coding Agents FTW
Trust is reliability: The next step is behavioral, not more model intelligence.
A new paradigm of inference-time compute: RLMs are the new reasoning models → recursive coding agents are the new coding agents.
Coding agents can be RLMs: Claude Code dynamic workflows and OpenProse show two concrete paths.
Until Next Time...
Please Recurse Responsibly
Raymond Weitekamp
Presentation at recursivecodingagents.com | Companion GitHub repo


