Meta adds 256 GB of recycled DDR4 to its AI servers through the Vistara CXL ASIC, reaching 1 TB per server and cutting server count by up to 25% amid DDR5 shortages.
Meta’s new “MemServer” combines 768 GB DDR5‑6400 with 256 GB DDR4‑2400, reaching a terabyte of memory per box by bridging the older RAM through a custom CXL 2.0 ASIC called Vistara【1】. The move lets Meta sidestep the global DDR5 shortage and reduces the number of AI inference servers needed by up to 25%【3】.
| At a glance | |
|---|---|
| Server memory mix | 768 GB DDR5 + 256 GB DDR4 |
| Total capacity | 1 TB per MemServer |
| DDR5 peak bandwidth | 614 GB/s |
| DDR4 bandwidth | 76 GB/s |
| Server count reduction | up to 25% |
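The capacity and server-count rows can be related with a quick back-of-envelope check. The sketch below assumes a fleet sized purely by memory capacity, which is a simplification made here and not something the sources state. The function name `servers_needed` and the 96 TB demand figure are hypothetical.

```python
import math

def servers_needed(total_memory_gb: float, per_server_gb: float) -> int:
    """Servers required if the fleet is sized purely by memory capacity (assumption)."""
    return math.ceil(total_memory_gb / per_server_gb)

DDR5_ONLY_GB = 768         # DDR5 capacity per server, from the table
HYBRID_GB = 768 + 256      # DDR5 plus DDR4 per server, from the table

fleet_demand_gb = 96 * 1024  # hypothetical aggregate memory demand: 96 TB

before = servers_needed(fleet_demand_gb, DDR5_ONLY_GB)
after = servers_needed(fleet_demand_gb, HYBRID_GB)
reduction = 1 - after / before

print(before, after, f"{reduction:.0%}")  # prints: 128 96 25%
```

Under this assumption the saving is simply 1 - 768/1024 = 25%, which matches the "up to" figure; workloads limited by compute or bandwidth rather than capacity would see less.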
Meta’s MemServers run on AMD’s Epyc “Turin” CPUs, which officially support DDR5 only. The Vistara ASIC links two 72‑bit DDR4 channels to the host over a PCIe Gen5 x16 link running CXL 2.0/1.1. Each chip supports up to 256 GB of DDR4 and DIMMs of up to 64 GB, though the current design uses 32 GB DIMMs, the largest capacity available for reuse【1】. This configuration delivers a local DDR5 peak bandwidth of 614 GB/s, while the DDR4 tier provides just 76 GB/s at roughly double the idle latency, so on bandwidth alone the slower tier offers only about 12% of what the DDR5 pool does【1】.
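The two bandwidth figures are consistent with the usual peak-bandwidth formula (transfer rate × bytes per transfer × channel count). The sketch below is a rough check, not a statement of Meta's actual topology: the channel counts are assumptions made here (12 DDR5 channels per socket, and 4 DDR4 channels, i.e. two ASICs with two channels each), and `peak_bandwidth_gbs` is a hypothetical helper name.

```python
def peak_bandwidth_gbs(transfer_rate_mts: int, channels: int, bus_bytes: int = 8) -> float:
    """Theoretical peak in GB/s: transfers per second x bytes per transfer x channels.

    A 72-bit ECC channel carries 64 data bits, hence 8 bytes per transfer.
    """
    return transfer_rate_mts * 1e6 * bus_bytes * channels / 1e9

# Channel counts below are assumptions, not figures from the article.
ddr5 = peak_bandwidth_gbs(6400, channels=12)  # DDR5-6400
ddr4 = peak_bandwidth_gbs(2400, channels=4)   # DDR4-2400 behind the CXL bridge
ratio = ddr4 / ddr5

print(f"{ddr5:.1f} {ddr4:.1f} {ratio:.1%}")  # prints: 614.4 76.8 12.5%
```

These are theoretical peaks; sustained throughput over the CXL link would be lower in practice.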
Meta argues that AI workloads often leave large portions of memory idle, so only a small fraction of pages need fast access. By treating the DDR4 pool as a separate NUMA node, the system keeps hot data in DDR5 and relegates cold pages to DDR4, limiting the impact of the lower bandwidth and higher latency【3】. The company reports that this “disaggregated” approach cuts AI inference server counts by up to 25% and reduces job‑restart and fragmentation overhead by 33%【3】.
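The hot/cold split can be illustrated with a toy placement policy. This is a minimal sketch of the general idea, not Meta's policy or the Linux kernel's tiering logic: the `Tier` class, both functions, the page names, the access counts and the 100 ns / 200 ns latencies are all hypothetical, with the slow tier set to double the fast tier's latency to mirror the article's "roughly double".

```python
from dataclasses import dataclass

@dataclass
class Tier:
    name: str
    capacity_pages: int
    latency_ns: float

def place_pages(access_counts: dict, fast: Tier, slow: Tier) -> dict:
    """Put the most-accessed pages in the fast tier until it is full; spill the rest."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    if len(ranked) > fast.capacity_pages + slow.capacity_pages:
        raise ValueError("working set exceeds total capacity")
    return {
        page: (fast.name if i < fast.capacity_pages else slow.name)
        for i, page in enumerate(ranked)
    }

def average_latency_ns(access_counts: dict, placement: dict, fast: Tier, slow: Tier) -> float:
    """Access-weighted mean latency across both tiers."""
    latency = {fast.name: fast.latency_ns, slow.name: slow.latency_ns}
    total = sum(access_counts.values())
    weighted = sum(n * latency[placement[p]] for p, n in access_counts.items())
    return weighted / total

# 3:1 capacity ratio mirrors 768 GB : 256 GB; latencies are illustrative only.
ddr5_tier = Tier("ddr5", capacity_pages=3, latency_ns=100.0)
ddr4_tier = Tier("ddr4", capacity_pages=1, latency_ns=200.0)
accesses = {"a": 50, "b": 30, "c": 15, "d": 5}  # hypothetical skewed access pattern

placement = place_pages(accesses, ddr5_tier, ddr4_tier)
avg = average_latency_ns(accesses, placement, ddr5_tier, ddr4_tier)
print(placement["d"], avg)  # prints: ddr4 105.0
```

With a skewed access pattern, only 5% of accesses land on the slow tier, so the average latency rises by 5% even though that tier is twice as slow. A flatter access distribution would erode this benefit.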
The DDR5 shortage and rising DRAM prices have forced hyperscalers to explore unconventional solutions. Reusing retired DDR4 modules lets Meta avoid the “RAM tax” of buying new memory and reduces electronic waste; the company describes the result as near‑zero‑cost capacity expansion【2】. While commercial CXL products typically bundle controllers with fresh DRAM, Vistara’s design enables reuse of existing DDR4 inventories at scale, a capability that could appeal to other large cloud providers facing similar supply constraints【2】.
Other industry players are also investing in more flexible interconnects. Nvidia offers NVLink, and the emerging Ultra Accelerator Link (UALink) consortium, whose members include AMD, AWS, Google, Microsoft, and Meta, aims to connect accelerators across hardware vendors. Together these efforts suggest a broader shift toward interconnects that can accommodate heterogeneous memory pools【2】.
Meta’s hybrid memory architecture shows that, even for a company with deep pockets, a DDR5 shortage can drive innovative hardware‑software co‑design. Whether the performance trade‑offs remain acceptable outside hyperscale AI workloads will determine if recycled DDR4 via CXL becomes a mainstream solution.
AI-assisted synthesis by the TrendWatcher Editorial Desk · sourced from 3 outlets · Jul 4, 2026
Meta repurposes DDR4 from retiring servers to increase total memory capacity and avoid shortages, using a custom ASIC to bridge the older memory to the newer system.
The DDR5 tier provides a peak bandwidth of 614 GB/s and lower latency, while the DDR4 tier offers about 76 GB/s bandwidth and roughly double the latency.
Vistara implements a CXL 2.0/1.1‑compliant PCIe Gen5 x16 interface that bridges two 72‑bit DDR4 channels, supporting up to 256 GB per chip.
Meta observes that only a small fraction of memory is actively accessed at any time, so the slower DDR4 tier's higher latency and lower bandwidth are unlikely to become bottlenecks.
The added DDR4 tier helps prevent servers from running out of memory, reduces wear on SSDs and DDR5 modules, and lowers overall infrastructure costs.