Anthropic and Amazon are going all in: $100B+ for 5GW of compute


Anthropic just announced a major expansion of its partnership with Amazon, and the numbers are staggering. We’re talking about a commitment to spend more than $100 billion on AWS technologies over the next ten years, securing up to 5 gigawatts (GW) of new compute capacity for training and running Claude.

To put that in perspective: 5GW is roughly the combined output of five large nuclear reactors, at about 1GW each. That’s how much power Anthropic thinks its compute will need to stay competitive in the AI arms race.
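As a rough sanity check on that comparison, here is a back-of-the-envelope sketch in Python. The figure of about 1GW of electrical output per large reactor is my own assumption for illustration, not something from the announcement, and real reactors vary.

```python
# Back-of-the-envelope check: how many large reactors does 5 GW correspond to?
COMPUTE_CAPACITY_GW = 5.0         # "up to 5 GW" from the announcement
ASSUMED_REACTOR_OUTPUT_GW = 1.0   # ASSUMPTION: illustrative value, not from the source


def reactors_equivalent(capacity_gw: float, reactor_gw: float) -> float:
    """Return how many reactors of the given output match the capacity."""
    if reactor_gw <= 0:
        raise ValueError("reactor output must be positive")
    return capacity_gw / reactor_gw


equivalent = reactors_equivalent(COMPUTE_CAPACITY_GW, ASSUMED_REACTOR_OUTPUT_GW)
print(f"About {equivalent:.0f} reactors")
```

Change the assumed reactor output and the comparison shifts accordingly, so treat it as an order-of-magnitude picture only.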

The deal includes significant Trainium2 capacity coming online in Q2 2026, and nearly 1GW total of Trainium2 and Trainium3 capacity by the end of this year. They’re also planning to use future generations of Amazon’s custom silicon, from Graviton through Trainium4.

This isn’t just about raw compute, though. The announcement reveals some interesting details about Claude’s growth and the challenges that come with it.

The numbers behind the hype

Anthropic claims its run-rate revenue has now surpassed $30 billion, up from approximately $9 billion at the end of 2025. That’s more than tripling in under a year. For context, that’s not far from what some analysts estimate OpenAI’s run-rate to be, though OpenAI’s numbers are private.
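The multiple behind that claim is easy to check. This is a minimal sketch using only the two run-rate figures quoted above. Since the newer figure is "surpassed $30 billion", the result is a lower bound, and the variable names are just mine.

```python
# Growth multiple implied by the two run-rate figures in the announcement.
RUN_RATE_END_2025_B = 9.0   # approximately $9B at the end of 2025
RUN_RATE_NOW_B = 30.0       # "surpassed $30B", so this is a lower bound


def growth_multiple(start: float, end: float) -> float:
    """Return end / start, i.e. how many times larger the end value is."""
    if start <= 0:
        raise ValueError("start value must be positive")
    return end / start


multiple = growth_multiple(RUN_RATE_END_2025_B, RUN_RATE_NOW_B)
print(f"At least {multiple:.2f}x")  # At least 3.33x
```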

But here’s the thing: that growth is causing problems. Anthropic openly admits that “our unprecedented consumer growth, in particular, has impacted reliability and performance for free, Pro, Max, and Team users, especially during peak hours.” Translation: Claude has been getting sluggish when too many people try to use it at once.

This is a real issue. I’ve seen complaints on Twitter and Reddit about Claude timing out or giving slow responses during US business hours. It’s good that Anthropic is acknowledging it, but it also shows how hard it is to scale AI inference in real time.

More than just chips

The agreement has three main components:

Infrastructure at scale – The $100B+ commitment covers compute for both training and inference, spread across multiple generations of Amazon’s custom silicon. They’re also expanding inference capacity in Asia and Europe to serve Claude’s growing international user base.

Claude Platform on AWS – The full Claude Platform will be available directly within AWS, using the same account, controls, and billing. No additional credentials or contracts needed. For enterprise customers who already live in AWS, this removes a lot of friction. Claude remains the only frontier AI model available on all three major clouds: Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Azure Foundry.

Continued investment – Amazon is investing $5 billion in Anthropic today, with the option for up to an additional $20 billion in the future. This builds on the $8 billion Amazon has already put in. That’s a lot of money, but given the compute costs, it makes sense.
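To keep the investment figures straight, here is a small tally using only the numbers quoted above. The $20 billion is an option rather than a commitment, so the sketch keeps the two totals separate.

```python
# Tally of Amazon's investment in Anthropic, in billions of USD,
# using only the figures quoted in the article.
PRIOR_INVESTMENT_B = 8.0    # already invested before this announcement
NEW_INVESTMENT_B = 5.0      # invested as part of this announcement
OPTIONAL_FUTURE_B = 20.0    # "up to" this much more; an option, not a commitment

invested_so_far = PRIOR_INVESTMENT_B + NEW_INVESTMENT_B
potential_total = invested_so_far + OPTIONAL_FUTURE_B

print(f"Invested so far: ${invested_so_far:.0f}B")           # Invested so far: $13B
print(f"Potential total: up to ${potential_total:.0f}B")     # Potential total: up to $33B
```

Even the upper figure is only about a third of the $100B+ that Anthropic has committed to spend on AWS over the decade.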

The Trainium bet

Anthropic has been all-in on Amazon’s custom Trainium chips since 2023. They currently use over one million Trainium2 chips to train and serve Claude, and they launched Project Rainier, one of the largest compute clusters in the world.

This is a bet that’s paying off. Amazon CEO Andy Jassy claims their custom AI silicon offers “high performance at significantly lower cost.” If that’s true, it gives Anthropic a cost advantage over competitors using NVIDIA GPUs, which are expensive and in short supply.

But there’s a risk here too. Trainium is still relatively new compared to NVIDIA’s CUDA ecosystem. The software stack isn’t as mature, and developers have reported compatibility issues. Anthropic is essentially betting that Amazon can close that gap fast enough to keep Claude competitive.

What this means for Claude users

For the 100,000+ customers running Claude on Amazon Bedrock, this should mean better performance and reliability. Anthropic says the deal will bring “meaningful compute in the next three months” and nearly 1GW in total by the end of 2026.

For free and Pro users, the hope is that peak-hour slowdowns become a thing of the past. But I’m skeptical. Consumer demand is growing fast, and Anthropic’s own numbers show run-rate revenue more than tripling since the end of 2025. If that trend continues, even 5GW might not be enough.

The bigger picture

This deal is a signal to the entire AI industry. The cost of compute is becoming the single biggest barrier to entry for frontier AI models. Anthropic is committing more than $100B over ten years just to AWS infrastructure. OpenAI is reportedly spending similar amounts on Microsoft Azure and its own data centers.

We’re entering an era where only the best-funded companies can play at the frontier. That’s good for incumbents like Anthropic and OpenAI, but it raises questions about competition and innovation.

That said, Anthropic’s diversified hardware strategy is worth noting. They’re using Trainium for training and inference, but they also have workloads spread across other chips, including Google TPUs and NVIDIA GPUs. This gives them flexibility and reduces dependency on any single supplier.

Final thoughts (well, not really)

This is a bold move from Anthropic. The $100B+ commitment shows they’re in it for the long haul, and the focus on infrastructure suggests they understand that AI is as much about engineering as it is about research.

But I’ll be watching to see if the reliability issues actually improve. And I’m curious how long it takes before the next capacity crunch hits. Because at this growth rate, it’s not a question of if, but when.
