Let the agent manage its own progress
"The agent can track its own progress, and I can see it." Let the model build its own task list, then use a small mechanism to remind it to keep that list updated.
structured self-planning
Claude Code often has to do several steps when working: grep to find references → read a few files → change the code → run tests. If you let the model advance "by feel", it does well in the first few steps, starts forgetting things in the middle, and finally gives up halfway.
The solution in s03 is to give it a manifest tool: the model calls a todo tool to insert task items, and TodoManager validates the structure, persists it, and returns the current view. This has two advantages:
- The model is forced to make "what to do" explicit; just writing it out helps it clarify its thinking.
- Humans can see what it is thinking. The debugging experience is ten times better.
```
# TODO view, each item is structured
[ ] #1: grep "TODO" across src/
[>] #2: read src/app.py and list comments   # in progress
[ ] #3: generate summary markdown
[ ] #4: write to TODO_LIST.md
(0/4 completed)
```
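The source does not show TodoManager's internals, only its behavior (validate structure, persist, return the current view). A minimal sketch of what such a manager could look like; the item fields and status names are assumptions, not from s03:

```python
class TodoManager:
    """Hypothetical sketch: validates todo items, stores them,
    and renders the structured view shown above."""

    VALID_STATUSES = {"pending", "in_progress", "completed"}

    def __init__(self):
        self.items = []  # each item: {"content": str, "status": str}

    def update(self, items):
        # Structural validation: every item needs content and a known status.
        for it in items:
            if not it.get("content"):
                raise ValueError("Each todo item needs non-empty content")
            if it.get("status") not in self.VALID_STATUSES:
                raise ValueError(f"Unknown status: {it.get('status')!r}")
        # The hard rule described below: only one in_progress at a time.
        in_progress_count = sum(
            1 for it in items if it["status"] == "in_progress")
        if in_progress_count > 1:
            raise ValueError("Only one task can be in_progress at a time")
        self.items = items  # persist (in memory, for this sketch)
        return self.render()

    def render(self):
        marks = {"pending": "[ ]", "in_progress": "[>]", "completed": "[x]"}
        lines = [f'{marks[it["status"]]} #{i + 1}: {it["content"]}'
                 for i, it in enumerate(self.items)]
        done = sum(1 for it in self.items if it["status"] == "completed")
        lines.append(f"({done}/{len(self.items)} completed)")
        return "\n".join(lines)
```

Returning the rendered view from update() means every write doubles as a read: the model always sees the fresh list in the tool result.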
A hard rule: there is only one in_progress at a time
There is a check in TodoManager.update():
```python
if in_progress_count > 1:
    raise ValueError("Only one task can be in_progress at a time")
```
It may seem harsh, but it actually helps the model. If you allow three tasks to be "in progress" at once, it will thrash between them and never finish any of them. Forcing single-task advancement means it must complete each item before opening the next.
The widget below lets you play the role of the model: submit various todo payloads and see which pass validation and which are rejected.
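The same pass/fail behavior can be exercised in plain code. A standalone sketch of the check (the item shape and function name are illustrative, not from s03):

```python
def validate_statuses(items):
    """Reject any payload with more than one in_progress item."""
    in_progress_count = sum(
        1 for it in items if it["status"] == "in_progress")
    if in_progress_count > 1:
        raise ValueError("Only one task can be in_progress at a time")

# Accepted: one task in flight, the rest pending.
validate_statuses([{"status": "in_progress"}, {"status": "pending"}])

# Rejected: two tasks "in flight" at once.
try:
    validate_statuses([{"status": "in_progress"},
                       {"status": "in_progress"}])
except ValueError as e:
    print(e)  # Only one task can be in_progress at a time
```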
Nag reminder: no updates for 3 consecutive rounds? Poke it
Even with the todo tool available, the model will still occasionally "forget" to update the list: it has done a lot of work, but in_progress is still stuck on item 2. The trick in s03 is a very simple counter:
```python
rounds_since_todo = 0
while True:
    response = LLM(messages, tools)
    ...
    used_todo = any(b.name == "todo" for b in tool_uses)
    rounds_since_todo = 0 if used_todo else rounds_since_todo + 1
    if rounds_since_todo >= 3:
        results.append({"type": "text",
                        "text": "<reminder>Update your todos.</reminder>"})
```
When the count reaches 3, a reminder is included in the next round's user message. When the model sees it, it instinctively updates the todo list. This uses an engineering mechanism to turn a soft constraint ("please keep it updated") into a concrete stimulus.
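The counter logic can be simulated without a real LLM. A toy sketch, assuming each round reports which tools were used; the function name and the reset-after-poking behavior are my assumptions, not shown in s03:

```python
REMINDER = "<reminder>Update your todos.</reminder>"

def run_rounds(tool_uses_per_round, threshold=3):
    """Return, for each round, the reminder injected into the next
    user message (or None if no nag was needed)."""
    rounds_since_todo = 0
    injected = []
    for tool_uses in tool_uses_per_round:
        used_todo = any(name == "todo" for name in tool_uses)
        rounds_since_todo = 0 if used_todo else rounds_since_todo + 1
        if rounds_since_todo >= threshold:
            injected.append(REMINDER)
            rounds_since_todo = 0  # assumed: reset so we don't nag every round
        else:
            injected.append(None)
    return injected

# The model updates todos in round 1, then goes three rounds without:
print(run_rounds([["todo"], ["grep"], ["read"], ["edit"]]))
# → [None, None, None, '<reminder>Update your todos.</reminder>']
```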
What's the name of this routine?
In agent design circles this is called structured self-planning with soft nudges: give the model a structured state it must write to, supplemented by opportunistic reminders. Claude Code uses a similar pattern in production, but more restrained (lower frequency, neutral wording).
Why not just write "update the todo at every step" into the system prompt? You could, but the model's obedience to general instructions in the system prompt decays as the conversation lengthens. Splitting the instruction out into a reminder that is re-injected near the point of action is much more stable.