Let commands run in the background without blocking the agent.
"Fire and forget — the agent doesn't block while the command runs."
The pain of blocking calls
The bash tool in s02 is synchronous: it calls subprocess.run(..., timeout=120). If a command like npm install takes 90 seconds, the entire agent loop is stuck for 90 seconds. The user stares at the terminal, not knowing whether it has hung or is still working.
The s08 solution: give the agent a background_run tool. It returns a task_id immediately and runs the command on another thread. The agent keeps looping and doing other work; when the background task completes, its result is added to the notification queue.
    def run(self, command: str) -> str:
        task_id = str(uuid.uuid4())[:8]
        self.tasks[task_id] = {"status": "running", ...}
        thread = threading.Thread(
            target=self._execute, args=(task_id, command), daemon=True
        )
        thread.start()
        return f"Background task {task_id} started"  # Return immediately
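To make the fragment above easier to try out, here is a minimal self-contained sketch of what the surrounding class might look like. The names run, _execute, tasks and drain_notifications come from the text; the class name BackgroundManager, the _notifications list, the _lock, the 300-second timeout and the fields of each notification dict are assumptions made for illustration, not the original implementation.

```python
import subprocess
import threading
import uuid


class BackgroundManager:
    """Hypothetical manager that runs shell commands on daemon threads."""

    def __init__(self):
        self.tasks = {}           # task_id -> {"status", "command", "result"}
        self._notifications = []  # finished tasks waiting to be drained
        self._lock = threading.Lock()

    def run(self, command: str) -> str:
        task_id = str(uuid.uuid4())[:8]
        self.tasks[task_id] = {"status": "running", "command": command, "result": None}
        thread = threading.Thread(
            target=self._execute, args=(task_id, command), daemon=True
        )
        thread.start()
        return f"Background task {task_id} started"  # returns immediately

    def _execute(self, task_id: str, command: str) -> None:
        # Runs on the worker thread; the timeout value is an assumption.
        try:
            proc = subprocess.run(
                command, shell=True, capture_output=True, text=True, timeout=300
            )
            output, status = (proc.stdout + proc.stderr).strip(), "completed"
        except subprocess.TimeoutExpired:
            output, status = "Error: timeout", "timeout"
        with self._lock:
            self.tasks[task_id].update(status=status, result=output)
            self._notifications.append(
                {"task_id": task_id, "status": status, "result": output}
            )

    def drain_notifications(self) -> list:
        # Return everything queued so far and clear the queue atomically.
        with self._lock:
            drained = list(self._notifications)
            self._notifications.clear()
        return drained
```

The lock guards both the task table and the notification list, so a worker finishing in the middle of a drain cannot lose or duplicate a result.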
How do results get back to the agent?
The key is a thread-safe queue. When a background thread completes, it appends its result to the queue; before each LLM call, the main thread drains the queue and adds the completion notifications to messages as a user message.
    def agent_loop(messages):
        while True:
            # Drain bg notifications before each LLM call
            notifs = BG.drain_notifications()
            if notifs:
                # Assumption: each notification renders sensibly via str()
                notif_text = "\n".join(str(n) for n in notifs)
                messages.append({
                    "role": "user",
                    "content": f"<background-results>{notif_text}</background-results>",
                })
            response = client.messages.create(...)
            ...
This way, the agent can spawn a bg task in round N. If the task completes during round N+3, the next LLM call automatically carries the result. When the model sees the <background-results> block, it knows: "Oh, that task is finished, I will continue."
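The round-N-to-later-round handoff can be reproduced without an LLM. The sketch below is a simulation under stated assumptions: time.sleep stands in for both the slow command and the LLM call, queue.Queue plays the thread-safe queue, and the names worker, drain and simulate are hypothetical.

```python
import queue
import threading
import time

notifications = queue.Queue()  # thread-safe: workers put, the main loop drains


def worker(task_id: str, seconds: float) -> None:
    time.sleep(seconds)  # stand-in for a slow shell command
    notifications.put(f"[bg:{task_id}] completed")


def drain() -> list:
    # Pull everything currently queued without blocking.
    items = []
    while True:
        try:
            items.append(notifications.get_nowait())
        except queue.Empty:
            return items


def simulate(rounds: int = 8, tick: float = 0.1) -> list:
    """Return the round numbers in which a background result was delivered."""
    messages, delivered_at = [], []
    for round_no in range(rounds):
        notifs = drain()  # drain point, just before the stubbed LLM call
        if notifs:
            text = "\n".join(notifs)
            messages.append({
                "role": "user",
                "content": f"<background-results>{text}</background-results>",
            })
            delivered_at.append(round_no)
        if round_no == 0:
            # Spawn one background task that outlives a couple of rounds.
            threading.Thread(
                target=worker, args=("demo", 2.5 * tick), daemon=True
            ).start()
        time.sleep(tick)  # stand-in for the LLM call and tool execution
    return delivered_at
```

Calling simulate() returns a one-element list: the task spawned in round 0 surfaces in a later round, at the first drain point after it finishes. The exact round depends on thread scheduling.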
Timeline demo
The following widget lets you simulate this: the main thread ticks once per second (mimicking the rhythm of the agent loop), and you can spawn a bg task at any time. Watch how the two threads meet at the "drain point".
Which commands should be placed in the background?
Not all commands should be thrown into the background. There are two criteria:
- Duration: a command that finishes within a few seconds is simpler to run synchronously, with no queue to maintain.
- Result dependency: if the next step uses the result immediately (e.g. cat file.txt immediately followed by grep), running it in the background is pointless, because you still have to wait for it.
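The two criteria above can be written as a small decision helper. This is only an illustrative sketch: the function name, its parameters and the 10-second default threshold are assumptions, and in practice the model usually makes this call itself from the tool description.

```python
def should_run_in_background(
    estimated_seconds: float,
    needed_by_next_step: bool,
    threshold_seconds: float = 10.0,  # assumed cutoff for "a few seconds"
) -> bool:
    """Apply the two criteria: is it slow, and can the agent work without the result?"""
    if needed_by_next_step:
        # The agent would have to wait anyway, so backgrounding buys nothing.
        return False
    return estimated_seconds >= threshold_seconds
```

For example, a 90-second npm install whose output is not needed right away qualifies, while a quick cat whose output feeds the next grep does not.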