
# Models

> Choose language models for your agents.

Large language models give agents the ability to make autonomous decisions.

The [Autonomy Computer](/what-is-autonomy#autonomy-computer) provides a **Model
Gateway** that gives your apps access to a wide range of models from different
providers, each optimized for different use cases and cost profiles.

You specify which model an agent should use by passing the `model` argument to `Agent.start()`:

```python images/main/main.py theme={null}
from autonomy import Agent, Model, Node


async def main(node):
  await Agent.start(
    node=node,
    name="henry",
    instructions="You are Henry, an expert legal assistant",
    model=Model("claude-sonnet-4")
  )


Node.start(main)
```

To switch models, change the model name passed to the `Model()` constructor.

## Supported Models

Autonomy supports a curated set of models from leading providers. All models are accessed
through the Model Gateway, which handles routing, load balancing, and failover automatically.

***

## Chat Models

Models for conversational AI and text generation, sorted by input price from highest to lowest.

| Model                                   | Provider  | Input (per 1M tokens) | Output (per 1M tokens) |
| --------------------------------------- | --------- | --------------------- | ---------------------- |
| `claude-opus-4`, `claude-opus-4-v1`     | Anthropic | \$15.00               | \$75.00                |
| `o1`                                    | OpenAI    | \$15.00               | \$60.00                |
| `claude-opus-4-5`, `claude-opus-4-5-v1` | Anthropic | \$5.00                | \$25.00                |
| `claude-sonnet-4-5`                     | Anthropic | \$3.00                | \$15.00                |
| `claude-sonnet-4`, `claude-sonnet-4-v1` | Anthropic | \$3.00                | \$15.00                |
| `nova-premier`, `nova-premier-v1`       | Amazon    | \$2.50                | \$12.50                |
| `gpt-4o`                                | OpenAI    | \$2.50                | \$10.00                |
| `gpt-4.1`                               | OpenAI    | \$2.00                | \$8.00                 |
| `o3`                                    | OpenAI    | \$2.00                | \$8.00                 |
| `gpt-5.1`                               | OpenAI    | \$1.25                | \$10.00                |
| `gpt-5`                                 | OpenAI    | \$1.25                | \$10.00                |
| `o3-mini`                               | OpenAI    | \$1.10                | \$4.40                 |
| `o4-mini`                               | OpenAI    | \$1.10                | \$4.40                 |
| `claude-haiku-4-5`                      | Anthropic | \$1.00                | \$5.00                 |
| `nova-pro`, `nova-pro-v1`               | Amazon    | \$0.80                | \$3.20                 |
| `gpt-4.1-mini`                          | OpenAI    | \$0.40                | \$1.60                 |
| `gpt-5-mini`                            | OpenAI    | \$0.25                | \$2.00                 |
| `gpt-4o-mini`                           | OpenAI    | \$0.15                | \$0.60                 |
| `gpt-4.1-nano`                          | OpenAI    | \$0.10                | \$0.40                 |
| `nova-lite`, `nova-lite-v1`             | Amazon    | \$0.06                | \$0.24                 |
| `gpt-5-nano`                            | OpenAI    | \$0.05                | \$0.40                 |
| `nova-micro`, `nova-micro-v1`           | Amazon    | \$0.035               | \$0.14                 |

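The per-token prices above translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: `PRICES` and `estimate_cost` are hypothetical names, not part of the Autonomy API, the prices are copied from three rows of the table, and the token counts are made up.

```python
# Prices in USD per 1M tokens (input, output), copied from the table above.
PRICES = {
  "claude-sonnet-4": (3.00, 15.00),
  "gpt-4o-mini": (0.15, 0.60),
  "nova-micro": (0.035, 0.14),
}


def estimate_cost(model, input_tokens, output_tokens):
  """Return the estimated USD cost of one request (hypothetical helper)."""
  input_price, output_price = PRICES[model]
  return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Compare the same workload across models: 10,000 tokens in, 2,000 tokens out.
for name in PRICES:
  print(f"{name}: ${estimate_cost(name, 10_000, 2_000):.6f}")
```

For this workload, `claude-sonnet-4` works out to 0.06 USD: 0.03 for input plus 0.03 for output.
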
***

## Embedding Models

For text embeddings and semantic search.

| Model                                         | Provider | Input (per 1M tokens) |
| --------------------------------------------- | -------- | --------------------- |
| `titan-embed`                                 | Amazon   | \$0.20                |
| `text-embedding-3-large`                      | OpenAI   | \$0.13                |
| `embed-english`, `embed-english-v3`           | Cohere   | \$0.10                |
| `embed-multilingual`, `embed-multilingual-v3` | Cohere   | \$0.10                |
| `text-embedding-3-small`                      | OpenAI   | \$0.02                |

***

## Realtime Models

For real-time voice conversations.

| Model                     | Provider | Text Input (per 1M) | Text Output (per 1M) | Audio Input (per 1M) | Audio Output (per 1M) |
| ------------------------- | -------- | ------------------- | -------------------- | -------------------- | --------------------- |
| `gpt-4o-realtime-preview` | OpenAI   | \$5.00              | \$20.00              | \$40.00              | \$80.00               |
| `gpt-realtime`            | OpenAI   | \$4.00              | \$16.00              | \$32.00              | \$64.00               |
| `gpt-realtime-mini`       | OpenAI   | \$0.60              | \$2.40               | \$10.00              | \$20.00               |

***

## Audio Models

For text-to-speech and speech-to-text.

| Model       | Provider | Type           | Price                   |
| ----------- | -------- | -------------- | ----------------------- |
| `tts-1-hd`  | OpenAI   | Text-to-Speech | \$30.00 / 1M characters |
| `tts-1`     | OpenAI   | Text-to-Speech | \$15.00 / 1M characters |
| `whisper-1` | OpenAI   | Speech-to-Text | \$0.006 / minute        |

***

## Parameters

The `Model()` constructor accepts additional parameters that control the model's behavior.

* **`temperature`**: The sampling temperature to use, between 0 and 2. Higher values like 0.8 produce more random outputs, while lower values like 0.2 make outputs more focused and deterministic.

* **`top_p`**: An alternative to temperature sampling, often called nucleus sampling. The model samples only from the most likely tokens whose cumulative probability reaches `top_p`. For example, 0.1 means only the tokens comprising the top 10% of probability mass are considered.

```python images/main/main.py theme={null}
from autonomy import Agent, Model, Node


async def main(node):
  # Use a lower temperature for more focused responses
  await Agent.start(
    node=node,
    name="analyst",
    instructions="You are a financial analyst",
    model=Model("claude-sonnet-4", temperature=0.2)
  )


Node.start(main)
```
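
To build intuition for `temperature` and `top_p`, here is a minimal sketch of how these two settings are commonly applied to a model's token scores. It is an assumption about typical sampling behavior, not a description of how any provider behind the Model Gateway implements it. The function names and the example scores are made up for illustration.

```python
import math


def apply_temperature(logits, temperature):
  """Turn raw token scores into probabilities, scaled by temperature (> 0)."""
  scaled = [x / temperature for x in logits]
  peak = max(scaled)  # subtract the max for numerical stability
  exps = [math.exp(x - peak) for x in scaled]
  total = sum(exps)
  return [e / total for e in exps]


def top_p_filter(probs, top_p):
  """Return indexes of the most likely tokens whose mass reaches top_p."""
  ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
  kept, mass = [], 0.0
  for index, p in ranked:
    kept.append(index)
    mass += p
    if mass >= top_p:
      break
  return kept


logits = [2.0, 1.0, 0.5, 0.1]  # made-up scores for four candidate tokens
focused = apply_temperature(logits, 0.2)
varied = apply_temperature(logits, 0.8)

# A lower temperature concentrates probability on the top token.
print(max(focused) > max(varied))

# A small top_p keeps only the most likely token; a larger one keeps more.
print(top_p_filter([0.5, 0.3, 0.15, 0.05], 0.1))
print(top_p_filter([0.5, 0.3, 0.15, 0.05], 0.9))
```

In this sketch a temperature of exactly 0 would divide by zero. Real implementations usually treat that case as always picking the most likely token.
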

## Invoke Models Directly

For simple use cases, you can also invoke models directly. This is useful when
you need to make one-off completions without the full features of an agent.

```python images/main/main.py theme={null}
from autonomy import Model, Node, SystemMessage, UserMessage


async def main(node):
  model = Model("claude-sonnet-4")

  # Await the call so that `response` holds the completion, not a coroutine
  response = await model.complete_chat([
    SystemMessage("You are a helpful assistant."),
    UserMessage("Explain gravity in simple terms")
  ])

  print(response)


Node.start(main)
```

### Streaming Responses

You can also stream responses by passing `stream=True` to `complete_chat()`:

```python images/main/main.py theme={null}
from autonomy import Model, Node, SystemMessage, UserMessage


async def main(node):
  model = Model("claude-sonnet-4")

  streaming_response = model.complete_chat([
    SystemMessage("You are a helpful assistant."),
    UserMessage("Explain gravity")
  ], stream=True)

  async for chunk in streaming_response:
    if hasattr(chunk, "choices") and chunk.choices and chunk.choices[0].delta.content:
      content = chunk.choices[0].delta.content
      print(content, end="")


Node.start(main)
```
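
If you need the full text rather than incremental output, the chunks can be joined as they arrive. The sketch below assumes chunks have the `choices[0].delta.content` shape used in the example above. `collect_text`, `fake_chunk`, and `fake_stream` are hypothetical helpers, and the fake stream stands in for a real model call so the snippet runs on its own.

```python
import asyncio
from types import SimpleNamespace


async def collect_text(stream):
  """Join the text deltas of a chunk stream into one string."""
  parts = []
  async for chunk in stream:
    choices = getattr(chunk, "choices", None)
    # Skip chunks that carry no text, such as role or finish markers.
    if choices and choices[0].delta.content:
      parts.append(choices[0].delta.content)
  return "".join(parts)


def fake_chunk(text):
  """Build a stand-in chunk with the choices[0].delta.content shape."""
  delta = SimpleNamespace(content=text)
  return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


async def fake_stream():
  # Stand-in for model.complete_chat(..., stream=True); not a real model call.
  for text in ["Gravity ", None, "pulls."]:
    yield fake_chunk(text)


print(asyncio.run(collect_text(fake_stream())))
```

With a real model, you would pass the streaming response to `collect_text` in place of `fake_stream()`.
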

## Embeddings

Generate embeddings for semantic search and similarity:

```python images/main/main.py theme={null}
from autonomy import Model, Node


async def main(node):
  model = Model("embed-english")

  embeddings = await model.embeddings([
    "Hello world",
    "How are you?"
  ])
  
  print(f"Embedding dimension: {len(embeddings[0])}")


Node.start(main)
```
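
Semantic search typically compares embeddings with cosine similarity. The sketch below assumes each embedding is a plain list of floats, which the example above suggests but does not confirm. It uses tiny made-up vectors in place of real model output. `cosine_similarity` and `rank_by_similarity` are hypothetical helpers, not part of the Autonomy API.

```python
import math


def cosine_similarity(a, b):
  """Return the cosine of the angle between two equal-length vectors."""
  dot = sum(x * y for x, y in zip(a, b))
  norm_a = math.sqrt(sum(x * x for x in a))
  norm_b = math.sqrt(sum(y * y for y in b))
  return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
  """Return candidate indexes ordered from most to least similar to query."""
  scores = [cosine_similarity(query, c) for c in candidates]
  return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Tiny made-up vectors standing in for real embeddings.
query = [1.0, 0.0, 1.0]
candidates = [[0.0, 1.0, 0.0], [1.0, 0.1, 0.9], [0.5, 0.5, 0.5]]

print(rank_by_similarity(query, candidates))
```

Here the second candidate ranks first because it points in nearly the same direction as the query, while the first candidate is orthogonal to it and ranks last.
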
