How to configure timeouts across all layers of your Autonomy application.
Autonomy applications have multiple timeout layers that work together to ensure
reliable execution. Understanding how these layers interact is essential for
building robust agents, especially for long-running tasks like research or
batch processing.
When a request flows through an Autonomy application, it passes through several
timeout boundaries:
```
HTTP API → Agent Execution → Model Calls → [Throttle Queue] → Gateway → LLM Provider
```
Each layer has its own timeout configuration. The outermost timeout (HTTP API)
acts as the ultimate limit—if inner operations exceed it, the entire request fails.
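This interaction can be seen in miniature with nested asyncio timeouts; a pure-Python sketch, where the durations are illustrative and not Autonomy defaults:

```python
import asyncio

async def model_call():
    # Inner operation: the 1.5s of work fits within its own 2.0s budget...
    await asyncio.wait_for(asyncio.sleep(1.5), timeout=2.0)
    return "ok"

async def request():
    # ...but the outer "HTTP" timeout of 1.0s fires first and cancels it.
    return await asyncio.wait_for(model_call(), timeout=1.0)

try:
    asyncio.run(request())
    result = "completed"
except asyncio.TimeoutError:
    result = "outer timeout fired"

print(result)  # outer timeout fired
```

The inner call never gets to finish: the outermost timeout always wins, which is why it must be sized to cover everything beneath it.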
Agents don’t make single requests—they iterate through a loop of thinking, acting,
and gathering responses. A typical agent conversation involves multiple model calls:
```
User Message
      ↓
┌─────────────────────────────────────────────┐
│ Agent State Machine                         │
│  ┌───────────────────────────────────────┐  │
│  │ Iteration 1: Model call (~5-30s)      │  │
│  ├───────────────────────────────────────┤  │
│  │ Iteration 2: Tool + Model call (~30s) │  │
│  ├───────────────────────────────────────┤  │
│  │ Iteration 3: Model call (~5-30s)      │  │
│  ├───────────────────────────────────────┤  │
│  │ ... more iterations ...               │  │
│  └───────────────────────────────────────┘  │
└─────────────────────────────────────────────┘
      ↓
Response to User
```
Time compounds across iterations. A 10-iteration agent with 30-second iterations
needs 300 seconds total—but the default HTTP timeout is only 180 seconds.
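The arithmetic from this example, written out (numbers taken from the paragraph above):

```python
# Worst-case budget for the agent loop vs. the default HTTP timeout.
iterations = 10
seconds_per_iteration = 30
default_http_timeout = 180

total_needed = iterations * seconds_per_iteration
print(total_needed)                         # 300
print(total_needed > default_http_timeout)  # True: the HTTP layer gives up first
```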
Control how long an agent can run and how many iterations it can perform:
images/main/main.py
```python
from autonomy import Agent, Model, Node

async def main(node):
    await Agent.start(
        node=node,
        name="researcher",
        instructions="You are a research assistant",
        model=Model("claude-sonnet-4-v1"),
        max_execution_time=600.0,  # Total execution limit (seconds)
        max_iterations=100,        # Maximum reasoning loops
    )

Node.start(main)
```
With throttling enabled, each iteration can wait in the queue:
```
Iteration 1: queue wait (up to 60s) + model call (up to 120s)
Iteration 2: queue wait (up to 60s) + model call (up to 120s)
Iteration 3: queue wait (up to 60s) + model call (up to 120s)
...
```
Worst case for 3 iterations:
Queue waits: 3 × 60s = 180s
Model calls: 3 × 120s = 360s
Total: 540s
When using throttling, ensure your HTTP timeout accounts for queue wait time
multiplied by expected iterations.
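That rule of thumb can be captured in a small helper; the figures below are the worst-case numbers from the example above:

```python
def worst_case_seconds(iterations: int, queue_wait: float, model_call: float) -> float:
    """Worst case: every iteration waits the full queue timeout, then uses the full model timeout."""
    return iterations * (queue_wait + model_call)

# 3 iterations x (60s queue wait + 120s model call) = 540s,
# so the HTTP timeout should be set above this.
total = worst_case_seconds(3, queue_wait=60.0, model_call=120.0)
print(total)  # 540.0
```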
```python
from autonomy import Agent, FilesystemTools, Model, Node

async def main(node):
    await Agent.start(
        node=node,
        name="researcher",
        instructions="""
        You are a thorough researcher. Take your time to:
        1. Break down the problem
        2. Research each aspect
        3. Take notes in your filesystem
        4. Synthesize findings
        """,
        model=Model(
            "claude-sonnet-4-v1",
            throttle=True,
            throttle_max_seconds_to_wait_in_queue=60.0,
        ),
        max_execution_time=1800.0,  # 30 minutes
        max_iterations=100,
        tools=[FilesystemTools(visibility="conversation")],
    )

Node.start(main)
```
HTTP timeout: 1860 seconds (31 minutes), or use streaming
Symptom: Agent task fails with HTTP timeout, even though the agent should have more time.
Cause: HTTP timeout (default 180s) < max_execution_time (default 600s).
Solution: Increase the HTTP timeout or use streaming:
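How to raise the client-side limit depends on which HTTP client you use; a stdlib sketch, where the host, port, and route are hypothetical placeholders for your Autonomy HTTP API endpoint:

```python
import http.client

# Allow up to 31 minutes (1860s) end-to-end, covering the agent's
# execution budget plus queue waits. Host and port are hypothetical.
conn = http.client.HTTPConnection("localhost", 8000, timeout=1860)
# conn.request("POST", "/agents/researcher", body=..., headers=...)  # hypothetical route
```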
Symptom: Agent stops before completing complex reasoning.
Cause: max_execution_time too short for the number of iterations needed.
Solution: Increase max_execution_time and max_iterations:
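For example, mirroring the configuration shown earlier on this page, both limits can be raised together (the values here are illustrative; size them to your workload):

```python
from autonomy import Agent, Model, Node

async def main(node):
    await Agent.start(
        node=node,
        name="researcher",
        instructions="You are a research assistant",
        model=Model("claude-sonnet-4-v1"),
        max_execution_time=1800.0,  # raise the total budget (seconds)
        max_iterations=200,         # allow more reasoning loops
    )

Node.start(main)
```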
Symptom: Subagent tasks fail with timeout errors.
Cause: Default subagent timeout (60s) too short for multi-step work.
Solution: Increase subagent max_execution_time:
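A sketch under the assumption that subagents accept the same max_execution_time parameter as the top-level Agent.start calls on this page; the exact subagent API may differ in your version:

```python
# Hypothetical subagent configuration: give multi-step work more than
# the 60s default. The parameter name matches the top-level examples.
await Agent.start(
    node=node,
    name="sub-researcher",
    instructions="Summarize one source in depth",
    model=Model("claude-sonnet-4-v1"),
    max_execution_time=300.0,  # up from the 60s subagent default
)
```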
Symptom: Many requests fail with queue timeout when the system is busy.
Cause: throttle_max_seconds_to_wait_in_queue too short for the load.
Solution: Increase the queue timeout or reduce concurrency:
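Mirroring the throttled Model configuration shown above, the queue wait can be extended (the 300s value is illustrative):

```python
model = Model(
    "claude-sonnet-4-v1",
    throttle=True,
    throttle_max_seconds_to_wait_in_queue=300.0,  # up from 60s; or lower concurrency instead
)
```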