OpenAI and Broadcom announce Jalapeño, a custom AI chip for large language model inference, with a claimed 50% cost savings and deployment targeted for 2026
OpenAI and Broadcom have announced Jalapeño, a custom AI accelerator designed specifically for large language model (LLM) inference, with the goal of making AI more abundant and affordable [1]. The chip is the result of a nine-month collaboration between the two companies and is expected to deliver higher performance per watt than current state-of-the-art AI accelerators, with a claimed 50% cost savings per inference token compared to current-generation graphics processing units [2].
| At a glance | |
|---|---|
| Companies | OpenAI and Broadcom |
| Product | Jalapeño AI accelerator |
| Claimed cost savings | 50% per inference token |
| Deployment target | End of 2026 |
Jalapeño was developed to meet the need for a purpose-built chip that efficiently handles LLM inference, a workload dominated by memory traffic rather than computation [2]. OpenAI was responsible for the chip architecture and algorithm design, while Broadcom contributed silicon implementation expertise and high-performance networking technology [1]. The result is a chip optimized for the exact memory access patterns, kernel shapes, and serving behaviors of OpenAI's models, which is intended to use resources more efficiently and reduce costs.
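The claim that inference is dominated by memory traffic can be illustrated with a rough roofline-style estimate. The sketch below is not based on any published Jalapeño specification: the layer size, the 2-byte weight format, and the peak compute and bandwidth figures are all hypothetical values chosen only to show the arithmetic.

```python
def arithmetic_intensity(batch, d_in, d_out, bytes_per_value=2):
    """FLOPs per byte moved for one dense layer applied to `batch` token vectors."""
    # One multiply and one add per weight, per token vector in the batch.
    flops = 2 * batch * d_in * d_out
    # Every weight is read once from memory, regardless of batch size.
    weight_bytes = bytes_per_value * d_in * d_out
    # Input activations are read and output activations are written.
    activation_bytes = bytes_per_value * batch * (d_in + d_out)
    return flops / (weight_bytes + activation_bytes)


def is_memory_bound(intensity, peak_flops, mem_bandwidth):
    """True when the workload sits below the accelerator's ridge point."""
    ridge_point = peak_flops / mem_bandwidth  # FLOPs per byte needed to saturate compute
    return intensity < ridge_point


# Hypothetical accelerator, NOT a real product's numbers.
PEAK_FLOPS = 1.0e15      # assumed peak compute, FLOP/s
MEM_BANDWIDTH = 3.0e12   # assumed memory bandwidth, bytes/s

# Hypothetical 8192 x 8192 layer: one token at a time versus a large batch.
single = arithmetic_intensity(batch=1, d_in=8192, d_out=8192)
batched = arithmetic_intensity(batch=1024, d_in=8192, d_out=8192)

print(f"batch=1:    {single:.2f} FLOPs/byte, memory-bound: "
      f"{is_memory_bound(single, PEAK_FLOPS, MEM_BANDWIDTH)}")
print(f"batch=1024: {batched:.2f} FLOPs/byte, memory-bound: "
      f"{is_memory_bound(batched, PEAK_FLOPS, MEM_BANDWIDTH)}")
```

Under these assumptions, generating one token at a time performs roughly one floating-point operation per byte fetched, far below the ridge point of the assumed hardware, so the memory system rather than the arithmetic units sets the pace. Larger batches raise the ratio, which is one reason serving behavior matters to the chip design.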
Jalapeño is positioned against existing AI accelerators, such as Nvidia's GPUs and Google's Tensor Processing Units, as a more efficient and cost-effective option for LLM inference [2]. Its performance is claimed to be on par with those products while delivering substantially better performance per watt [1]. A specification-level comparison between Jalapeño and existing accelerators is not available, so the cost and efficiency figures cannot yet be checked against competing hardware.
If the claims hold, the chip could make AI more abundant and affordable for a wide range of users [1]. Its success will depend on whether it delivers the claimed performance and cost savings, and on adoption by developers and users of OpenAI's services.
Coverage is mostly measured: 103 of 125 reports stay neutral.
AI-assisted synthesis by the TrendWatcher Editorial Desk · sourced from 2 outlets · Jun 25, 2026 · How we report
Jalapeño is a purpose‑built ASIC for LLM inference, intended to increase performance per watt and reduce dependence on GPU hardware.
Broadcom will manufacture the chip and associated server hardware, with Celestica assembling the racks.
OpenAI aims to begin deployment by the end of 2026 and expand over several years.