OpenAI just became a chip company. After years of heavy reliance on Nvidia GPUs — sourced through cloud providers like Microsoft Azure, at prices set by a market with few real alternatives — the company has unveiled Jalapeño, its first custom processor built with Broadcom. This isn’t a general-purpose GPU. It’s an ASIC (application-specific integrated circuit) designed to do exactly one thing: run large language models after they’ve been trained. Every ChatGPT response, every API call, every agent action — that’s inference work, and it’s where OpenAI burns the most compute. Deployment target: end of 2026.
What Jalapeño Actually Does (And Doesn’t Do)
Jalapeño is a purpose-built inference engine, not a Swiss Army knife — and that specialization is the whole point.
- Custom ASIC optimized solely for LLM inference — not training, not general computing
- Co-designed with Broadcom; integrates Tomahawk-class networking on-chip
- Engineering samples already running internal OpenAI language model workloads at target production frequencies and power levels
- Architecture prioritizes minimizing data movement and pushing real utilization toward theoretical peak
- Built for compatibility with LLMs across the industry, not just OpenAI’s own models, which matters for the AI-powered websites and tools that depend on efficient inference
Hundreds of millions of queries hit OpenAI’s servers daily, and every watt saved across gigawatt-scale data centers translates directly into lower operating costs. GPU scarcity in AI has essentially been the Taylor Swift tickets situation, except the scalpers wear suits and file 10-Ks.
Broadcom CEO Hock Tan told Reuters that Jalapeño performs “on par” with Nvidia’s Blackwell GPUs and Google’s TPUs for relevant workloads. That’s a vendor claim, not an independent benchmark, so treat it accordingly.
“By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access.” (Greg Brockman, OpenAI President, via CNBC)
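To put the watts-to-operating-costs point in rough numbers, here is a minimal back-of-the-envelope sketch. Every input is an assumption chosen for illustration: the 1 GW load, the $0.05 per kWh electricity price, and the 10% power reduction are hypothetical placeholders, not figures from OpenAI, Broadcom, or any disclosed Jalapeño spec.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_energy_cost_usd(power_watts: float, usd_per_kwh: float) -> float:
    """Electricity cost of running a constant load for one year."""
    kwh_per_year = power_watts / 1000 * HOURS_PER_YEAR
    return kwh_per_year * usd_per_kwh


def annual_savings_usd(power_watts: float, usd_per_kwh: float,
                       fraction_saved: float) -> float:
    """Yearly savings if the same work is done with a fraction less power."""
    return annual_energy_cost_usd(power_watts, usd_per_kwh) * fraction_saved


# Hypothetical inputs, for illustration only (not disclosed figures).
facility_watts = 1e9      # assumed 1 GW of constant load
price_per_kwh = 0.05      # assumed electricity price in USD
power_reduction = 0.10    # assumed 10% lower power for the same workload

baseline = annual_energy_cost_usd(facility_watts, price_per_kwh)
saved = annual_savings_usd(facility_watts, price_per_kwh, power_reduction)

print(f"Baseline energy bill: ${baseline:,.0f} per year")
print(f"Saved at 10% lower power: ${saved:,.0f} per year")
```

Under these assumed inputs the baseline bill lands in the hundreds of millions of dollars per year, so even a single-digit efficiency gain is worth tens of millions. The real figure depends on actual power draw, utilization, cooling overhead, and electricity contracts, none of which are public.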
The Nvidia Problem OpenAI Is Trying to Solve
Unlike Google or Amazon, OpenAI doesn’t own its data centers — which makes this silicon bet bolder, not safer.
Google, Amazon, Meta, and Microsoft all build custom AI chips from behind the security of owning their own infrastructure. OpenAI is a tenant. Executing this move without that safety net is more audacious, not less. The Bitcoin mining era proved a relevant principle: once a workload stabilizes, purpose-built silicon crushes general-purpose hardware on economics. OpenAI is betting that LLM inference has reached that inflection point — and analysts tracking the custom-ASIC trend note that inference workloads, unlike training, are predictable enough in structure to reward that kind of specialization.
The caveats, however, are substantial. Process node, core counts, memory specs, peak FLOPS — none have been disclosed. OpenAI promises a detailed technical performance report “in the coming months.” Until independent benchmarks appear, “substantially better performance-per-watt” and parity with Blackwell or TPUs remain vendor claims, not established fact. A press release is not a benchmark.
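For readers who want to check a performance-per-watt claim once independent numbers exist, the comparison itself is simple arithmetic. The sketch below assumes a benchmark reports sustained throughput in tokens per second and average power in watts; the `BenchmarkResult` type and both sets of numbers are hypothetical placeholders, not measurements of Jalapeño, Blackwell, or any TPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """One inference benchmark run (hypothetical structure)."""
    name: str
    tokens_per_second: float  # sustained throughput on a fixed workload
    avg_power_watts: float    # average power drawn during the run

    @property
    def tokens_per_joule(self) -> float:
        # tokens/second divided by joules/second (watts) gives tokens/joule
        return self.tokens_per_second / self.avg_power_watts


def relative_efficiency(candidate: BenchmarkResult,
                        baseline: BenchmarkResult) -> float:
    """How many times more tokens per joule the candidate delivers."""
    return candidate.tokens_per_joule / baseline.tokens_per_joule


# Placeholder numbers for illustration only; not real chip measurements.
chip_a = BenchmarkResult("hypothetical chip A", 12_000.0, 600.0)
chip_b = BenchmarkResult("hypothetical chip B", 10_000.0, 800.0)

ratio = relative_efficiency(chip_a, chip_b)
print(f"{chip_a.name}: {chip_a.tokens_per_joule:.1f} tokens/joule")
print(f"{chip_b.name}: {chip_b.tokens_per_joule:.1f} tokens/joule")
print(f"Relative efficiency: {ratio:.2f}x")
```

The comparison only means something if both runs use the same model, batch size, precision, and latency target, which is exactly the detail a press release tends to leave out.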
If Jalapeño delivers on those claims, the long-term pressure on Nvidia’s pricing power becomes very real. OpenAI is declaring that AI infrastructure is the future bottleneck — silicon and power, not algorithms alone. The receipts arrive in 2026.