IaaS vs PaaS
Autonomy is a PaaS that’s 1000x ‘better’ than DIY IaaS with a cloud provider (Azure, GCP, AWS) on multiple dimensions: cost, complexity, staffing, time to market, and implementation risk. Three customers outlined what they would need to do themselves (DIY) with an IaaS if they did not use Autonomy:
- A loan officer assistant product manager told us: “An in-house (DIY) build is a high-risk, high-cost path: it requires an 8-person senior team, ~12–16 weeks just to reach MVP, and ~$850k–$1.15M in development before compliance polish, followed by sizable ongoing ops headcount. The architecture is complex (multi-tenant isolation, KMS/BYOK, Connect integration, OCR fast-paths, Textract async fan-out, LLM brokering, Step Functions/SQS orchestration), creating many failure modes and quota dependencies that threaten our 60-second SLA, especially with scanned, multi-page documents and peak concurrency. Time-to-market is long and uncertain due to parser accuracy work, contracts extraction quality, and security hardening. Maintenance burden is ongoing (model/version drift, quota management, cost tuning, incident response, audit trails) and the system is brittle under workload skew or vendor limit changes, making cost overruns and timeline slips likely. Net: this is difficult to staff, complex to implement, expensive to run, and carries material execution and compliance risk for a core workflow. ‘Concurrency explosion’ is what happens when a seemingly modest real-time workload fans out into orders of magnitude more simultaneous units of work than your platform (and cloud quotas) can actually handle within the SLA. This is a serious problem with the DIY path. In our case, a single minute with 100 live calls × 100 docs/call ⇒ 10,000 docs. If just 40% are scanned (4,000 docs), at ~3 pages/doc that’s 12,000 OCR jobs immediately, plus format detection, conversion, parsing, and at least one LLM call for some docs.
Each step multiplies: Step Functions distributed maps spawn thousands of parallel tasks; Textract runs per page; retries on slow pages double load; visibility timeouts re-enqueue messages; ‘just one more’ validation pass adds yet another sweep. The result is a burst that collides with hard service limits (Textract async concurrency, Bedrock TPM/RPS, Lambda concurrency, ENIs for VPC Lambdas, DynamoDB hot partitions on a single call_id/tenant key, KMS decrypt TPS) and soft limits (container cold starts, connection pools), causing queue pile-ups, head-of-line blocking, stragglers, and cascading retries that amplify the burst. Autoscaling can’t save you fast enough (scale-out latency is seconds to minutes), so the system thrashes: costs spike (you scale everything ‘just in case’), p95 latency blows past 60s, and you start dropping or timing out work, precisely when the business most needs reliability. That’s the concurrency explosion: a multiplication across pages × steps × retries that turns 100 calls into tens of thousands of concurrent operations, making SLA, cost, and stability extremely fragile.”
- An HR recruiting product’s engineering team told us: “To build a platform ourselves with IaaS is a high-risk, high-cost, long-tail undertaking: it requires a 7–9 person senior team for ~5 months just to reach MVP (12–16 weeks) and ~6 months to GA (20–24 weeks), with one-time engineering of ~$0.77–$1.25M, then ongoing headcount of 5–7 people ($110–$200k/month) and variable infra dominated by ASR/LLM that can reach ~$450k–$2.0M/month at 1k concurrent interviews, before adding compliance, support, and margin. Technically it’s complex (real-time ASR/TTS with sub-second turn-taking, LLM orchestration, multi-tenant isolation, observability, and ABAC controls), operationally fragile (vendor rate limits, packet loss, barge-in bugs, transcript drift), and financially volatile (token blowups, noisy-neighbor tenants, storage/egress creep). Key implementation risks include audio access/approvals, achieving consistent latency and accuracy across accents/noise, cost control at scale, and evidence collection for compliance, any of which can trigger overruns, delays, or outright failure to launch. In short: expensive to build, slow to ship, staff-intensive to maintain, and easy to get wrong.”
- A customer of Autonomy was about to build an internal tool to extract new value from archived content in Box. They said that building this system in-house with IaaS would be a long, expensive, and high-risk endeavor. It requires a complex, multi-tenant cloud architecture with GPU orchestration, Step Functions, queues, and distributed state management across AWS services: the kind of system that typically takes a 6- to 8-person team at least three to four months to reach MVP and costs $300,000 to $500,000 to build, plus $30,000 to $100,000 per month to run. Ongoing maintenance would demand specialized DevOps, MLOps, and computer-vision talent to manage scaling, cost control, and model updates. The risks of overruns, cost blowouts, and stalled progress are significant: GPU workloads are expensive and unpredictable, multi-tenant security adds compliance complexity, and system integration between AI, storage, and workflow layers can fail in subtle, cascading ways. For a small team, especially inside a non-profit, this project carries a high likelihood of schedule slippage, budget exhaustion, and operational fragility.
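The fan-out arithmetic in the loan officer assistant quote above can be reproduced in a few lines of Python. This is a minimal sketch, not a capacity model: the class name `Burst` and the `retry_fraction` parameter are illustrative assumptions, while the default figures (100 calls, 100 docs per call, 40% scanned, ~3 pages per scanned doc) come from the quote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Burst:
    """One minute of peak load, using the figures from the customer quote."""

    live_calls: int = 100            # live calls in a single minute
    docs_per_call: int = 100         # documents attached to each call
    scanned_fraction: float = 0.40   # share of documents that need OCR
    pages_per_scanned_doc: int = 3   # approximate pages per scanned document
    retry_fraction: float = 0.0      # ASSUMPTION: share of OCR pages retried once

    def total_docs(self) -> int:
        return self.live_calls * self.docs_per_call

    def scanned_docs(self) -> int:
        return round(self.total_docs() * self.scanned_fraction)

    def ocr_jobs(self) -> int:
        # OCR runs per page, so pages multiply the scanned-document count.
        return self.scanned_docs() * self.pages_per_scanned_doc

    def ocr_jobs_with_retries(self) -> int:
        # Each retried page adds one more job on top of the first attempt.
        return round(self.ocr_jobs() * (1 + self.retry_fraction))


if __name__ == "__main__":
    base = Burst()
    print(base.total_docs(), base.scanned_docs(), base.ocr_jobs())
    # If every slow page were retried once, the OCR load would double.
    print(Burst(retry_fraction=1.0).ocr_jobs_with_retries())
```

Changing any single factor (pages per document, retry share) scales the OCR job count proportionally, which is why the quote describes the burst as a multiplication across pages, steps, and retries.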
How Autonomy works under the covers
The schema below follows the user experience:
- Developer writes their app’s code in Python.
- This code uses the Autonomy Framework, a library built in Rust and Python. It can create Agents, Flows, and fleets of collaborating Agents. It provides all the table-stakes features you would expect from an Agent framework: Agent to Agent (A2A), Streaming, Memory, Knowledge, Evals, etc. It also has features that leverage the Autonomy Computer platform. Technically, someone could build an agent in another framework like LangChain and run it on Autonomy; they just wouldn’t get all the special functionality of Autonomy.
- Our framework is unique because it turns every Agent into a lightweight Actor. Each actor/agent can send and receive messages to collaborate with other Agents. Apps can seamlessly delegate work to millions of parallel agents.
- All message-based communication is transparently mutually authenticated, end-to-end encrypted, and granularly authorized with attribute-based access control policies. These secure channels are network-bound, and they scale to the maximum achievable network throughput in AWS.
- The Autonomy framework and its documentation presume that developers will use coding assistants to build their products. Customers have successfully vibe-coded their Autonomy apps.
- Developer creates an autonomy.yaml file.
- The Autonomy Computer deploys apps into an isolated zone that can contain one or more containers, which hold:
- Agents and application logic.
- Tools like web browsers, command-line tools, MCP servers, etc.
- All agents get cryptographically verifiable identities. The entire class of ‘non-human identity’ bolt-on products is made obsolete by this feature. Each agent has a durable, long-lived internal state, and is multi-tenant by default. All agents get managed memory and vector storage so they can accumulate knowledge.
- Agents can securely send messages (via Ockam) to:
  - Anything
    - Agents
    - Tools and MCP servers
    - Databases
    - Enterprise SaaS
    - Applications
    - …
  - Anywhere
    - On Autonomy Computer
      - Among a customer’s agents
      - Between our customers’ agents
      - Across our customers’ customers’ customers’ agents
      - …
    - Remote Autonomy nodes
      - On the internet
      - In private VPCs
      - On any cloud or data center
      - …
- Granular attribute-based policies control access to every entity in Autonomy. This means that Autonomy customers can easily isolate their customers from one another.
- Every agent automatically gets streaming API endpoints. Agents can also themselves be exposed as MCP tools.
- Developer deploys their app to the Autonomy Computer (the PaaS)
- by typing a single command in our CLI,
- or by pushing to their main branch on GitHub; we provide a simple GitHub workflow.
- Developer can then observe their applications in production - with logs, telemetry, traces, and runtime Evals.
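The combination described above, agents as lightweight actors whose messages are gated by attribute-based policies, can be illustrated with a toy example. This is a minimal sketch using only Python's standard library; it is not the Autonomy Framework API and does not use Ockam. The names `Agent`, `Message`, `same_tenant`, and `send` are hypothetical, and the sketch omits the authentication and encryption the real channels provide.

```python
import asyncio
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Message:
    sender: str
    body: str


class Agent:
    """A toy actor: private state plus an inbox, handled one message at a time."""

    def __init__(self, name: str, attributes: Dict[str, str]) -> None:
        self.name = name
        self.attributes = attributes  # e.g. {"tenant": "acme"}
        self.inbox: "asyncio.Queue[Optional[Message]]" = asyncio.Queue()
        self.received: List[str] = []

    async def run(self) -> None:
        while True:
            msg = await self.inbox.get()
            if msg is None:  # sentinel value: stop the actor
                return
            self.received.append(f"{msg.sender}: {msg.body}")


Policy = Callable[[Agent, Agent], bool]


def same_tenant(sender: Agent, receiver: Agent) -> bool:
    # ASSUMPTION: a single illustrative attribute rule; real policies are richer.
    return sender.attributes.get("tenant") == receiver.attributes.get("tenant")


async def send(sender: Agent, receiver: Agent, body: str,
               policy: Policy = same_tenant) -> bool:
    """Deliver a message only if the policy allows this sender/receiver pair."""
    if not policy(sender, receiver):
        return False
    await receiver.inbox.put(Message(sender.name, body))
    return True


async def demo() -> Tuple[bool, bool, List[str]]:
    intake = Agent("intake", {"tenant": "acme"})
    reviewer = Agent("reviewer", {"tenant": "acme"})
    outsider = Agent("outsider", {"tenant": "globex"})
    agents = [intake, reviewer, outsider]
    tasks = [asyncio.create_task(agent.run()) for agent in agents]

    allowed = await send(intake, reviewer, "please review doc 1")
    denied = await send(outsider, reviewer, "let me in")

    for agent in agents:
        await agent.inbox.put(None)  # ask every actor to stop
    await asyncio.gather(*tasks)
    return allowed, denied, reviewer.received


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the policy check sits in `send`, so a message from another tenant never reaches the receiver's inbox; that placement is a design choice for the illustration, not a statement about where Autonomy enforces its policies.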
Who is Autonomy built for?
- Bottoms-up, developer-led adoption.
- Autonomy’s GTM follows the developer-first playbook.
- Our focus is to make it effortless for developers to go from zero-to-one with an agentic application on Autonomy, and then scale from one-to-millions of agents with no architectural rework. Developer experience will be a hallmark of Autonomy.
- Target Market
- Startups and net-new products in large organizations.
- Both look the same because they are actively choosing infrastructure for the first time, or re-platforming after failed agent experiments. They value speed, simplicity, and the ability to scale without deep infra expertise.
- Technical differentiation.
- Unlike the recent batch of YC companies, Autonomy is not a demo-ware project. We have a 5-year head start and a production-hardened runtime that has already stood up to POC evaluations from large infrastructure teams at Snowflake, Databricks, AWS, Vercel, and others. If a potential customer contacts us, we can go deep into the POCs we did with each of these companies.
- A unique trust and identity fabric built on Ockam. The products from startups that tout “identity for agents” are unnecessary add-ons when a product is built with Autonomy. We kill the need for this entire category.
- Agent to Agent is also built in. A2A and other agent-to-agent add-ons are unnecessary when a product is built with Autonomy. We kill the need for this entire category as well.
- Since there are so many new entrants in the ‘PaaS for Agents’ category, we can encourage bakeoffs by developer influencers. The community should quickly discover that our differentiation is revealed not just in what we show, but in what’s missing: the hours of work developers don’t have to do because they use Autonomy. A partnership with Heavybit further reinforces this differentiated signal.
- Positioning against AWS.
- AWS Bedrock AgentCore is a credible threat for teams locked into all-things AWS. However, its existence validates the PaaS category and creates space for a “developer-friendly” and fast-innovating alternative.
- It’s not a like-for-like comparison, because AgentCore still requires a lot of hand-rolling of other AWS services that are baked into Autonomy.
- Autonomy can be the obvious choice for teams that already use best-of-breed products outside of the AWS product catalog.
- Enterprise teams should use Autonomy too!
- We’ve already sold to a couple of startup-within-the-enterprise teams. We learned that each of them lacks the in-team capability to hand-roll IaaS in AWS. They are comfortable running production workloads in our third-party PaaS because they have a mandate to get to production fast. This means our startup and net-new product focus also covers some large-scale enterprise demand. Over time, as we grow adoption and credibility, we can selectively move up-market. Long term, we believe “AI Platform for Enterprises” market positioning is an option if we launch ‘Autonomy Enterprise’ (features TBD).

