OpenAI and Broadcom debut Jalapeño ASIC, claiming up to 50% cheaper ChatGPT inference per token and a 9‑month design‑to‑tape‑out cycle.
On Wednesday, OpenAI and Broadcom announced Jalapeño, a purpose‑built inference ASIC that Broadcom’s CEO says can cut the cost of each ChatGPT token by roughly 50% compared with current‑generation GPUs [1].
| At a glance | |
|---|---|
| Chip name | Jalapeño |
| Claim | ~50% cheaper inference per token vs. current GPUs |
| Design cycle | 9 months from concept to tape‑out |
| Delivery | Engineering samples shipped to OpenAI HQ |
The chip is an application‑specific integrated circuit (ASIC) optimized for the memory‑heavy, low‑precision workloads of large language model (LLM) inference. Broadcom CEO Hock Tan told Bloomberg that early lab tests show performance on par with Nvidia’s Blackwell GPUs and Google’s TPUs at roughly half the cost per token [1]. OpenAI’s own statement is more cautious: it describes the chip’s performance‑per‑watt as “substantially better than current state‑of‑the‑art” and says full technical results will be published in the coming months [1].
The companies call the nine‑month design cycle the fastest ever for a high‑performance ASIC, and OpenAI credits its own models for the speed. OpenAI President Greg Brockman said the company’s AI models accelerated the design process in a way that was “very surprising” [2]. The silicon was fabricated by TSMC and will be integrated with Broadcom’s Tomahawk networking chips and Celestica‑built racks, forming a full‑stack solution that OpenAI can control end‑to‑end.
Current inference workloads run on general‑purpose GPUs, which typically achieve only 60‑70% utilization because the bottleneck is memory traffic, not raw compute [1]. By tailoring the architecture to the specific kernels, memory movement, and serving patterns of transformer models, Jalapeño aims to push utilization closer to the chip’s theoretical peak, a key factor behind the cost‑saving claim. Independent analysts note that the exact baseline chips and test conditions have not been disclosed, leaving the 50% figure unverified outside OpenAI’s own labs [1].
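The link between utilization and cost per token can be sketched with simple arithmetic. The Python snippet below is a minimal illustration, not a model of Jalapeño: the hourly cost, peak throughput, and the 90% target utilization are hypothetical placeholders, and only the 65% starting point comes from the 60‑70% range cited above.

```python
def cost_per_million_tokens(hourly_cost_usd, peak_tokens_per_sec, utilization):
    """Cost in USD to serve one million tokens on a single accelerator."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    # Tokens actually served per hour, after discounting idle capacity.
    effective_tokens_per_hour = peak_tokens_per_sec * utilization * 3600
    return hourly_cost_usd / effective_tokens_per_hour * 1_000_000


# Hypothetical inputs, chosen only for illustration (not from the article).
HOURLY_COST_USD = 4.00        # assumed all-in cost of running one chip for an hour
PEAK_TOKENS_PER_SEC = 10_000  # assumed theoretical peak throughput

# 0.65 is the midpoint of the 60-70% range cited above; 0.90 is an assumed target.
gpu_like = cost_per_million_tokens(HOURLY_COST_USD, PEAK_TOKENS_PER_SEC, 0.65)
asic_like = cost_per_million_tokens(HOURLY_COST_USD, PEAK_TOKENS_PER_SEC, 0.90)
saving = 1 - asic_like / gpu_like

print(f"GPU-like:  ${gpu_like:.4f} per million tokens")
print(f"ASIC-like: ${asic_like:.4f} per million tokens")
print(f"Saving from utilization alone: {saving:.1%}")
```

Under these assumed inputs, raising utilization from 65% to 90% at equal hourly cost and peak throughput lowers cost per token by about 28%. In this sketch, reaching 50% would also require a lower hourly cost or a higher peak throughput, which is why the undisclosed baseline and test conditions matter.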
If the cost advantage holds at scale, OpenAI could reduce its reliance on Nvidia GPUs, its biggest AI‑hardware expense since 2022, and lessen the pressure on its cloud partners. Broadcom, whose shares have risen 10% this year and are up nearly sevenfold since 2022, stands to gain a steady stream of high‑volume ASIC orders. Competitors such as AMD, Cerebras, and AWS (with its Trainium chips) may need to accelerate their own custom‑silicon programs to stay relevant [2].
The Jalapeño debut marks OpenAI’s first foray into owning the hardware stack that powers its flagship services. Whether the promised cost savings translate into lower prices for end users or a competitive edge against GPU‑centric rivals will depend on real‑world deployment data and the pace at which other AI leaders roll out their own ASICs.
AI-assisted synthesis by the TrendWatcher Editorial Desk · sourced from 2 outlets · Jun 25, 2026 · How we report
Jalapeño is a purpose‑built ASIC for LLM inference, intended to increase performance per watt and reduce dependence on GPU hardware.
Broadcom will manufacture the chip and associated server hardware, with Celestica assembling the racks.
OpenAI aims to begin deployment by the end of 2026 and expand over several years.