The first agent loop I shipped ran for 47 turns on a single user question, used up a quarter of a daily budget, and ended in “I’m not sure how to help, can you rephrase?” The user did not rephrase.
Here’s the short, opinionated checklist I use now to keep agent loops from sprawling.
1. Set a tool budget, not just a turn budget
A turn limit is a blunt instrument. It saves you from infinite loops but doesn’t save you from an agent that calls the same expensive tool ten times in a row.
I set a per-tool budget — “no more than 3 search calls per session”, “at most 1 file write per turn” — and surface it to the model in the system prompt. The model respects budgets it knows exist.
2. Make termination a first-class action
Most agents terminate by not picking a tool. That’s bad. It means the loop ends because the model gave up, not because it succeeded.
Add an explicit done(answer) action. Require the agent to call it. Now you have a clear success signal you can log, evaluate, and reward.
3. The one observability hook that’s saved me the most time
Log, for every turn: the model’s reasoning, the chosen action, the action’s input, the action’s output, and the elapsed time. That’s it. Five fields. Keep it as a structured log, not a free-text dump.
When something goes wrong — and it will — you can reconstruct exactly where the agent went off the rails without re-running anything. This single habit has saved me probably 80% of the debugging time on agent projects.
4. Reject re-entry, not just loops
Agents don’t only fail by looping. They fail by almost looping — calling the same tool with subtly different arguments, getting subtly different results, and never converging.
A small “have I called this tool with arguments similar to this before?” check, with a similarity threshold, catches a lot of these before they burn the budget.
5. The escape hatch
Always have a human-fallback path that fires automatically when the budget is exhausted. The agent’s last act before giving up should be to write a clean summary of what it tried, what it learned, and what’s still uncertain. That summary is a gift to whoever picks it up next.
What I’d tell my past self
- Don’t build agentic anything until a non-agentic version of the same task is too painful to maintain. Most “agent problems” are workflow problems with extra steps.
- The model is not the bottleneck. The tooling around the model is.
- A boring, deterministic loop that does the right thing 95% of the time beats a clever one that does the right thing 70% of the time and a brilliant thing 5% of the time.