Conceptual foundation

Loop Engineering

The 2026 meta for AI coding: stop prompting the agent; design the system that prompts it. That system is a loop that keeps pushing a plan to completion, lap after lap, surviving every session boundary.

"My job is to write loops. The model is a subroutine; I'm the loop architect."
— Boris Cherny (Anthropic), who reported 100% of his 259 PRs in 30 days were written by Claude Code loops, Dec 2025.

0. The Output Metric

The dark factory's one number: given a VISION doc, how long and how faithfully can Sentigent push the plan before it needs a human? That is FAP (Faithful Autonomous Progress). Everything in the loop architecture serves this metric. It is measured per run and hard to fabricate, because it is computed directly from the loop's own state.

1. What the field has converged on

The Ralph loop — the seed

Geoffrey Huntley, 2025. `while :; do cat PROMPT.md | claude; done`. The insight that started it: progress accumulates in files + git + tests, NOT in the context window. Each lap gets a fresh context over a re-derived plan. Failures pipe back as a "contextual pressure cooker" that forces the model to fix its own mistakes. Huntley built a whole programming language this way for ~$297.
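
The shell one-liner can be restated in Python to make the discipline explicit: the prompt is re-read from disk on every lap, and nothing is carried over in memory. This is a minimal sketch, not Huntley's code. `run_agent` is a hypothetical callable standing in for however the agent CLI is invoked, and `max_laps` is added here only so the example terminates.

```python
from pathlib import Path
from typing import Callable


def ralph_loop(prompt_path: Path, run_agent: Callable[[str], None], max_laps: int) -> int:
    """Run the agent repeatedly, giving each lap a freshly read prompt.

    Nothing is kept in memory between laps: whatever the agent learns has
    to land in files (or git) that the next lap can read.
    """
    laps = 0
    for _ in range(max_laps):
        prompt = prompt_path.read_text(encoding="utf-8")  # re-read every lap
        run_agent(prompt)  # hypothetical hook: e.g. pipe the prompt into an agent CLI
        laps += 1
    return laps
```

Because the prompt file is read inside the loop, an edit made to it during one lap (by the agent or a human) is picked up by the next lap without restarting anything.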

The five-part loop contract

Every production loop has exactly five parts:

Part     Meaning
TRIGGER  Timer (every 15m) or event (CI fail, PR comment)
SCOPE    Which repos / files / PRs the loop may touch
ACTION   What the agent does each lap (ideally a named, tested skill)
BUDGET   Max laps, token/$ cap, max sub-agents
STOP     Done-criteria, iteration ceiling, spend limit, no-progress halt
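
One way to keep the five parts from drifting apart is to hold them in a single validated object. The sketch below is illustrative only: `LoopContract` and all of its field names are assumptions made for this example, not part of any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopContract:
    """Hypothetical container for the five parts of a loop contract."""

    trigger: str                      # TRIGGER, e.g. "timer:15m" or "event:ci_fail"
    scope: tuple[str, ...]            # SCOPE: repos / paths the loop may touch
    action: str                       # ACTION: name of the skill run each lap
    max_laps: int                     # BUDGET: iteration ceiling
    max_spend_usd: float              # BUDGET: dollar cap
    stop_on_repeat_failures: int = 3  # STOP: halt after N identical failures

    def __post_init__(self) -> None:
        # Reject contracts that would leave the loop unbounded or unscoped.
        if not self.scope:
            raise ValueError("SCOPE must name at least one repo or path")
        if self.max_laps <= 0 or self.max_spend_usd <= 0:
            raise ValueError("BUDGET limits must be positive")
        if self.stop_on_repeat_failures <= 0:
            raise ValueError("no-progress threshold must be positive")
```

Validating at construction time means a loop with no scope or no budget fails before its first lap rather than after it has spent money.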

Open vs closed loops

Open loop: the agent writes until it says it is done, with no external verification. Good for demos only. "A loop with nothing to push back is the agent agreeing with itself on repeat."

Closed loop: runs tests, lint, and typecheck each lap, and failures feed back into the next lap's prompt. This is the production-grade shape.
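
The closing of the loop can be sketched in a few lines: run the checks, collect whatever failed, and append that output to the next lap's prompt. The function names and the wording of the injected message are assumptions for illustration; the check commands themselves would be whatever test, lint, and typecheck commands the project already uses.

```python
import subprocess
from typing import Sequence


def run_checks(commands: Sequence[Sequence[str]]) -> list[str]:
    """Run each check command and return the captured output of those that failed."""
    failures = []
    for cmd in commands:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            header = "$ " + " ".join(cmd)
            failures.append((header + "\n" + result.stdout + result.stderr).strip())
    return failures


def next_prompt(base_prompt: str, failures: list[str]) -> str:
    """Append the previous lap's failures so the next lap has to address them."""
    if not failures:
        return base_prompt
    return base_prompt + "\n\nThe previous lap failed verification:\n\n" + "\n\n".join(failures)
```

An open loop is the degenerate case where `commands` is empty: `run_checks` always returns no failures, so nothing ever pushes back.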

Durable state across sessions

Context windows are finite; every reset/compaction loses something, and agents that sense low context do a "rushed finish." The fix: state-persistence files that let a new session resume unambiguously:

Plus anchor files re-injected every lap: VISION.md (goal + success criteria), CLAUDE.md/AGENTS.md (rules/guardrails), PROMPT.md (the injected tick). Context reset + structured handoff beat compaction for long runs.
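
For a state file to support unambiguous resume, a crash mid-write must never leave it half-written. A common way to get that is to write a temporary file in the same directory and rename it over the target. The sketch below assumes a JSON state file; the keys `next_step` and `progress_log` are placeholders chosen for this example.

```python
import json
import os
import tempfile
from pathlib import Path


def save_state(path: Path, state: dict) -> None:
    """Write state atomically: temp file in the same directory, fsync, then rename."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(state, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, path)  # atomic when source and target share a filesystem
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_state(path: Path) -> dict:
    """Return the saved state, or a fresh one if no run has been recorded yet."""
    if not path.exists():
        return {"next_step": 0, "progress_log": []}
    return json.loads(path.read_text(encoding="utf-8"))
```

A new session calls `load_state` first and continues from `next_step`; it either sees the previous complete state or the new complete state, never a mixture.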

Three-agent shape (Anthropic)

Planner (spec) → Generator (implements in sprints) → Evaluator (tests like a user via Playwright, grades on hard thresholds), communicating through files. "Sprint contracts" = the generator proposes the work + its own success criteria, evaluator approves, then it builds. Keep it as simple as the model allows.
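
The sprint-contract idea can be illustrated as data plus a grading function: the generator proposes the work and its thresholds up front, and the evaluator later grades measured scores against them. This is a sketch of the concept only; `SprintContract`, `evaluate`, and the criterion names are hypothetical and not taken from Anthropic's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SprintContract:
    """What the generator proposes before building (hypothetical shape)."""

    work: str
    thresholds: dict[str, float]  # criterion name -> minimum acceptable score


def evaluate(contract: SprintContract, scores: dict[str, float]) -> tuple[bool, list[str]]:
    """Grade measured scores against the contract's hard thresholds.

    A criterion with no measured score counts as failed, so a sprint cannot
    pass by leaving a criterion untested.
    """
    failed = [
        name
        for name, minimum in contract.thresholds.items()
        if scores.get(name, float("-inf")) < minimum
    ]
    return (not failed, failed)
```

Fixing the thresholds before the build starts is what makes them "hard": the evaluator compares against numbers agreed in advance instead of judging the result after the fact.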

Cost is the new constraint

Uber capped engineers at $1,500/mo after burning the annual budget in 4 months. Controls that matter: hard max-iterations, no-progress detection (halt if the same error repeats N×), and a pre-set $/token ceiling.
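
The three controls fit in one small guard object that the loop consults after every lap. This is a minimal sketch under stated assumptions: `LoopGuard` is a name invented here, "same error" is taken to mean an identical error string on consecutive laps, and the per-lap cost is assumed to be reported by the caller.

```python
from typing import Optional


class LoopGuard:
    """Tracks laps, spend, and repeated failures; reports when the loop must halt."""

    def __init__(self, max_laps: int, max_spend_usd: float, max_repeats: int) -> None:
        self.max_laps = max_laps
        self.max_spend_usd = max_spend_usd
        self.max_repeats = max_repeats
        self.laps = 0
        self.spend_usd = 0.0
        self._last_error: Optional[str] = None
        self._repeats = 0

    def record(self, cost_usd: float, error: Optional[str]) -> None:
        """Record one finished lap: its cost and its error (None if it passed)."""
        self.laps += 1
        self.spend_usd += cost_usd
        if error is not None and error == self._last_error:
            self._repeats += 1                      # same failure again
        else:
            self._repeats = 1 if error is not None else 0
        self._last_error = error

    def halt_reason(self) -> Optional[str]:
        """Return why the loop must stop, or None if it may continue."""
        if self._repeats >= self.max_repeats:
            return "no-progress"
        if self.spend_usd >= self.max_spend_usd:
            return "budget"
        if self.laps >= self.max_laps:
            return "max-laps"
        return None
```

In practice error strings often need normalizing (timestamps, temp paths) before comparison, otherwise the same failure looks different on every lap and the no-progress check never fires.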

2. The gap nobody has solved — Sentigent's wedge

Every loop halts or runs off a cliff at two hard moments:

The decision "push through this myself vs. stop and ask" is exactly a judgment call. It's the one thing a generic loop can't do well. Sentigent has the parts to make that decision learned from your history — that is the differentiated loop:
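
To make "learned from your history" concrete, here is one simple way such a decision could be derived. It is an illustrative sketch, not Sentigent's method: the history format (blocker category plus whether pushing through turned out to be right), the function names, and the default values for `min_samples` and `required_rate` are all assumptions.

```python
from collections import defaultdict
from typing import Iterable


def learn_push_rates(
    history: Iterable[tuple[str, bool]], min_samples: int = 5
) -> dict[str, float]:
    """Per blocker category, the fraction of past cases where pushing on was right.

    Categories with fewer than min_samples records are left out, so they
    fall back to "ask" in decide().
    """
    counts = defaultdict(lambda: [0, 0])  # category -> [push_was_right, total]
    for category, push_was_right in history:
        counts[category][0] += int(push_was_right)
        counts[category][1] += 1
    return {c: ok / total for c, (ok, total) in counts.items() if total >= min_samples}


def decide(category: str, push_rates: dict[str, float], required_rate: float = 0.8) -> str:
    """Return "push" only when history supports it; otherwise "ask"."""
    return "push" if push_rates.get(category, 0.0) >= required_rate else "ask"
```

The asymmetry is deliberate: an unseen or thinly sampled category defaults to asking, so the loop earns autonomy per category instead of assuming it.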

3. The architecture

        VISION.md (goal + Done-criteria)         org guardrail packs
                  │                                      │  (per-lap safety)
                  ▼                                      ▼
   ┌──────────────── LOOP DRIVER (durable, cross-session) ────────────────┐
   │  state file: progress log · verification records · NEXT STEP          │
   │  (atomic, crash-safe)                                                  │
   │                                                                        │
   │  each lap:                                                             │
   │   1. read next step + anchor files (fresh context — Ralph discipline)  │
   │   2. run a FRESH `claude -p` over just that step                       │
   │   3. CLOSED-LOOP VERIFY (tests/typecheck/lint);                        │
   │      failure pipes into next lap's prompt                              │
   │   4. on blocker → CloneResolver decides push-or-ask                    │
   │      using LEARNED thresholds                                          │
   │   5. STOP checks: DoD satisfied? · no-progress (same fail N×)?        │
   │      · max laps? · budget? · kill?                                     │
   │   6. atomically persist → next step durably queued                     │
   └────────────────────────────────────────────────────────────────────────┘
                  │ every lap logged → real receipt (laps/verifies/resolves/$)
                  ▼
        resume(loop_id) picks up at stored next step after ANY session/crash
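
The lap cycle in the diagram can be reduced to a small driver to show how resume falls out of the design. This sketch covers steps 1, 2, 3 and 6 plus a lap ceiling; the blocker resolver and budget checks are left out for brevity. Everything here is hypothetical: `drive`, its parameters, and the state keys are illustrative names, not the contents of operator/loop_driver.py.

```python
from typing import Callable, Optional


def drive(
    steps: list[str],
    state: dict,
    run_step: Callable[[str, Optional[str]], None],
    verify: Callable[[], Optional[str]],
    persist: Callable[[dict], None],
    max_laps: int,
) -> str:
    """Advance through steps one lap at a time; return why the loop stopped.

    state holds "next_step" (an index into steps) and "last_failure". It is
    persisted after every lap, so a later call can resume from the saved copy.
    """
    for _ in range(max_laps):
        if state["next_step"] >= len(steps):
            return "done"
        step = steps[state["next_step"]]
        run_step(step, state.get("last_failure"))  # fresh run over just this step
        failure = verify()                         # None means the checks passed
        if failure is None:
            state["next_step"] += 1                # only verified work advances the plan
        state["last_failure"] = failure            # fed into the next lap's prompt
        persist(dict(state))                       # durable hand-off point
    return "done" if state["next_step"] >= len(steps) else "max-laps"
```

Resuming is then just calling `drive` again with the last persisted state: the driver has no other memory, so a crash between laps loses at most the lap that was in flight.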

Implementation status

Component                       File                                        Status
Fresh-context laps (Ralph)      operator/loop.py                            ✓ exists
Cross-session durable driver    operator/loop_driver.py                     ✓ shipped
Push-vs-ask on blockers         operator/resolver.py CloneResolver          ✓ exists
Learned push-vs-ask thresholds  CloneResolver.thresholds_from_calibration   ✓ the wedge
Done-criteria STOP              operator/goal_dod.py GoalDoD                ✓ exists
Budget / kill STOP              BudgetGovernor / KillSwitch                 ✓ exists
Org guardrail packs             guardrails/*.yaml + operator/guardrails.py  ✓ shipped
Closed-loop verify gate         verifier.py                                 ✓ wired
No-progress detection           loop_driver same-fail-N× check              ✓ added
FAP receipt                     loop_driver receipt                         ✓ shipped

P1–P5 are feature-complete. Next: real-world hardening — run it with --execute on live visions to gather actual FAP, and wire the driver into the MCP operator_* tools so the loop is callable from Claude Code directly.

4. Positioning (honest)

Sentigent is a loop harness with learned judgment: it keeps pushing your plan across session boundaries (Ralph's autonomy + durable resume), but it knows — from your decision history — when to push through a blocker vs. stop and ask, and it enforces org guardrails on every lap. Ralph is the engine; Sentigent is the engine that doesn't need babysitting and won't drive off a cliff.
