Corporate America Is Rationing AI – Because the Token Bill Just Got Insane

Firms from Uber to Amazon now cap per-user token usage and downgrade to cheaper models as frontier AI bills spiral past annual budgets

Jul 2, 2026


Key Takeaways

Uber reportedly exhausted its entire annual AI budget months early because of runaway token billing.
Frontier-model token prices are pushing companies toward mini-models that can cost about 300 times less ($0.05 versus roughly $15 per million tokens).
Amazon’s “tokenmaxx” culture accelerated AI overspending, mirroring cloud computing’s FinOps cycle.

Uber reportedly burned through its entire annual AI budget months early, according to the Wall Street Journal. Not on some moonshot project. On employees using frontier AI models for everyday work. That detail alone captures where enterprise AI adoption stands right now: deep in the hangover phase.

After a frenzied push to “AI-ify everything,” companies are discovering that usage-based token billing — where you pay per unit of AI input and output — has turned shiny productivity tools into runaway expense accounts. The models are genuinely impressive. The invoices are another matter entirely.

How the Math Went Sideways

Flat per-seat pricing gave way to metered token billing, and nobody budgeted for what happened next.

The old deal was simple: pay per seat, use as much as you want. Then providers like Anthropic, OpenAI, and GitHub shifted enterprise accounts to token-based billing. Frontier models now run roughly $5 per million input tokens and $25–$30 per million output tokens, according to independent technical benchmarks. Long agent runs — autonomous AI workflows chaining dozens of tasks together — can rack up hundreds of dollars in a single session.
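To see how a single session reaches that range, here is a rough back-of-the-envelope sketch in Python. It uses the $5 input price and the low end of the $25–$30 output range quoted above. The workload shape is an assumption for illustration, not a measurement: the step count, context size, and output size are hypothetical, and the sketch ignores any caching or volume discounts a provider might offer.

```python
# Prices in USD per million tokens, taken from the figures quoted above.
INPUT_PRICE_PER_M = 5.00
OUTPUT_PRICE_PER_M = 25.00  # low end of the quoted $25-$30 range


def call_cost(input_tokens: int, output_tokens: int,
              in_price: float = INPUT_PRICE_PER_M,
              out_price: float = OUTPUT_PRICE_PER_M) -> float:
    """Cost in USD of one model call under per-token billing."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def agent_run_cost(steps: int, base_context: int, output_per_step: int) -> float:
    """Cost of an agent run in which each step re-sends a growing context.

    Hypothetical workload model: every step sends the base context plus all
    output from earlier steps, then generates output_per_step new tokens.
    """
    total = 0.0
    context = base_context
    for _ in range(steps):
        total += call_cost(context, output_per_step)
        context += output_per_step  # earlier output becomes later input
    return total


if __name__ == "__main__":
    # Assumed session: 200 steps, 100k-token starting context, 2k tokens out per step.
    cost = agent_run_cost(steps=200, base_context=100_000, output_per_step=2_000)
    print(f"Estimated session cost: ${cost:,.2f}")
```

Under those assumed sizes the run sends about 59.8 million input tokens and produces 400,000 output tokens, which works out to roughly $309. Most of that comes from re-sending context, not from the output itself.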

Companies are responding with increasingly deliberate measures:

Switching to “mini-models” costing as little as $0.05 per million tokens versus around $15 per million for frontier models, by one consultant’s estimate
Breaking complex tasks into segments and routing each to the cheapest capable model
Imposing per-user token budgets and usage dashboards
Moving to open-source models hosted on internal infrastructure
Canceling standalone tool licenses, as Microsoft has reportedly done with most direct Claude Code licenses while steering developers toward GitHub Copilot CLI, according to Fortune
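Two of the measures above, routing each task to the cheapest capable model and enforcing per-user token budgets, can be sketched in a few lines of Python. This is a minimal illustration, not any company's actual system. The model names, the capability tiers, and the `TokenBudget` class are hypothetical; only the two prices come from the figures cited in this article.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Model:
    name: str
    price_per_m: float  # USD per million tokens (illustrative flat rate)
    capability: int     # hypothetical tier: higher handles harder tasks


# Prices follow the article's figures; names and tiers are made up.
MODELS = [
    Model("mini", 0.05, capability=1),
    Model("frontier", 15.00, capability=3),
]


def route(required_capability: int, models=MODELS) -> Model:
    """Pick the cheapest model whose tier meets the task's requirement."""
    capable = [m for m in models if m.capability >= required_capability]
    if not capable:
        raise ValueError("no model can handle this task")
    return min(capable, key=lambda m: m.price_per_m)


@dataclass
class TokenBudget:
    """Per-user token cap with a running usage counter."""
    limit: int
    used: dict = field(default_factory=dict)

    def charge(self, user: str, tokens: int) -> bool:
        """Record usage and return True, or refuse and return False if over cap."""
        spent = self.used.get(user, 0)
        if spent + tokens > self.limit:
            return False
        self.used[user] = spent + tokens
        return True


if __name__ == "__main__":
    budget = TokenBudget(limit=1_000_000)
    model = route(required_capability=1)  # an easy task goes to the cheap tier
    allowed = budget.charge("alice", 250_000)
    print(model.name, allowed)
```

In practice the hard part is deciding what counts as "capable" for a given task. The integer tier here stands in for whatever evaluation a team actually trusts.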

Consultant Adrian Balfour told Yahoo Finance: “The big large monolithic model costs $15 per million tokens, but you can reduce that to around five cents with the smaller mini model.” Exploring AI-powered websites can also help teams find capable, lower-cost alternatives for everyday productivity tasks.

The Culture That Created the Monster

Companies that rewarded heavy AI usage with internal leaderboards are now scrambling to contain what they built.

Amazon at one point reportedly encouraged employees to “tokenmaxx,” meaning to maximize token consumption for experimentation and prestige, according to Fortune. Analyst Jack Gold noted, per Yahoo Finance, that token costs in some deployments surpass the monthly cost of employing a human worker. That math tends to get finance teams’ attention quickly.

The pattern mirrors cloud computing’s FinOps cycle, compressed into roughly eighteen months: generous introductory pricing, enthusiastic adoption, metered billing, sticker shock, then frantic optimization. The Economist reports that much of the expense stems not from raw model prices alone, but from deploying AI without clear use cases or guardrails — a reminder that companies are often paying too much without realizing it.

If your Copilot access just got capped or your team was told to stop using the top-tier model for slide decks, you’re living this story. The era of unlimited AI experimentation on the company card is over. What replaces it looks like every other cloud resource: budgeted, governed, and justified before anyone touches it.