Building Agent Loops That Don't Sprawl

The first agent loop I shipped ran for 47 turns on a single user question, used up a quarter of a daily budget, and ended in “I’m not sure how to help, can you rephrase?” The user did not rephrase.

Here’s the short, opinionated checklist I use now to keep agent loops from sprawling.

1. Set a tool budget, not just a turn budget

A turn limit is a blunt instrument. It saves you from infinite loops but doesn’t save you from an agent that calls the same expensive tool ten times in a row.

I set a per-tool budget — “no more than 3 search calls per session”, “at most 1 file write per turn” — and surface it to the model in the system prompt. The model respects budgets it knows exist.

2. Make termination a first-class action

Most agents terminate by not picking a tool. That’s bad. It means the loop ends because the model gave up, not because it succeeded.

Add an explicit done(answer) action. Require the agent to call it. Now you have a clear success signal you can log, evaluate, and reward.

3. The one observability hook that’s saved me the most time

Log, for every turn: the model’s reasoning, the chosen action, the action’s input, the action’s output, and the elapsed time. That’s it. Five fields. Keep it as a structured log, not a free-text dump.

When something goes wrong — and it will — you can reconstruct exactly where the agent went off the rails without re-running anything. This single habit has saved me probably 80% of the debugging time on agent projects.

4. Reject re-entry, not just loops

Agents don’t only fail by looping. They fail by almost looping — calling the same tool with subtly different arguments, getting subtly different results, and never converging.

A small “have I called this tool with arguments similar to this before?” check, with a similarity threshold, catches a lot of these before they burn the budget.

5. The escape hatch

Always have a human-fallback path that fires automatically when the budget is exhausted. The agent’s last act before giving up should be to write a clean summary of what it tried, what it learned, and what’s still uncertain. That summary is a gift to whoever picks it up next.

What I’d tell my past self

Don’t build agentic anything until a non-agentic version of the same task is too painful to maintain. Most “agent problems” are workflow problems with extra steps.
The model is not the bottleneck. The tooling around the model is.
A boring, deterministic loop that does the right thing 95% of the time beats a clever one that does the right thing 70% of the time and a brilliant thing 5% of the time.