High Bandwidth Memory Will Stack on AI Chips Starting Around 2026 With HBM4

SK Hynix and Taiwan’s TSMC have established an ‘AI Semiconductor Alliance’. SK Hynix has emerged as a strong player in the high-bandwidth memory (HBM) market thanks to the generative artificial intelligence (AI) boom, while TSMC is the world’s largest semiconductor foundry (contract chip manufacturer).

The strategy is to cement both companies’ lead in the AI semiconductor market by pooling their technology for next-generation AI semiconductor packaging.

Samsung is currently the leader in computer memory chips.

SK hynix Inc. is a South Korean supplier of dynamic random-access memory (DRAM) and flash memory chips.

TSMC is the dominant leader in advanced chip fabrication.

SK Hynix has established a ‘one team’ strategy that includes cooperation with TSMC on the development of HBM4, the sixth generation of High Bandwidth Memory (HBM).

SK Hynix plans to stack HBM4 directly on processors. This will change not only the way logic and memory devices are typically interconnected but also the way they are made, and it may transform the foundry industry.

Currently, HBM stacks integrate 8, 12, or 16 memory dies as well as a logic layer that acts as a hub. HBM stacks sit on an interposer next to CPUs or GPUs and connect to their processors over a 1,024-bit interface. SK Hynix aims to put HBM4 stacks directly on processors, eliminating interposers altogether.
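To put the interface widths in perspective, here is a minimal back-of-the-envelope sketch of per-stack peak bandwidth as bus width times transfer rate. The data rates below are representative figures used only for illustration, not vendor-confirmed specifications; the HBM4 rate in particular is an assumption.

```python
# Illustrative per-stack HBM bandwidth: bus width (bits) / 8 * transfer rate (GT/s).
# Data rates are representative assumptions, not confirmed product specs.

def stack_bandwidth_gbps(bus_width_bits: int, data_rate_gtps: float) -> float:
    """Peak bandwidth of one HBM stack in GB/s."""
    return bus_width_bits / 8 * data_rate_gtps

generations = {
    "HBM3  (1024-bit, ~6.4 GT/s)":          (1024, 6.4),
    "HBM3e (1024-bit, ~9.6 GT/s)":          (1024, 9.6),
    "HBM4  (2048-bit, ~6.4 GT/s assumed)":  (2048, 6.4),
}

for name, (width, rate) in generations.items():
    print(f"{name}: {stack_bandwidth_gbps(width, rate):.1f} GB/s per stack")
```

The point of the wider HBM4 bus is that per-stack bandwidth can roughly double even without pushing pin speeds higher.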

This approach resembles AMD’s 3D V-Cache, which is placed directly on CPU dies. But HBM will offer considerably higher capacity and will be cheaper than V-Cache, albeit slower.

The HBM4 memory will use a 2,048-bit interface to connect to host processors. Routing such a wide interface through an interposer would make HBM4 interposers extremely complex and expensive, which is what makes a direct memory-on-logic connection economically attractive. Placing HBM4 stacks directly on logic chips would somewhat simplify chip designs and cut costs, but it would require a solution to thermal problems.
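A rough way to see the interposer routing problem is to count the data signals that must be carried between memory and processor. The sketch below counts data lines only and ignores command/address, ECC, and power delivery, so the real trace counts are higher; the stack counts are taken from the GPU configurations mentioned later in the article.

```python
# Rough illustration of why a wider HBM interface strains the interposer:
# every stack's data bus must be routed between the memory and the processor.
# Counts cover data lines only (no command/address, ECC, or power) - a simplification.

def data_traces(bus_width_bits: int, stacks: int) -> int:
    """Data traces the interposer must route for a given stack count."""
    return bus_width_bits * stacks

for gen, width in [("HBM3/HBM3e (1024-bit)", 1024), ("HBM4 (2048-bit)", 2048)]:
    for stacks in (6, 8):
        print(f"{gen}, {stacks} stacks: {data_traces(width, stacks):,} data traces")
```

Doubling the bus width doubles the wiring the interposer has to carry, which is the cost pressure that makes stacking memory directly on the logic die look attractive.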

A package that combines the AI chip and the memory would likely need liquid or immersion cooling.

Samsung’s HBM3 (24GB) completed verification with NVIDIA in December 2023.

On the HBM3e front, Micron provided its 8hi (24GB) samples to NVIDIA by the end of July 2023, SK hynix in mid-August, and Samsung in early October.

In 2024, NVIDIA will add the H200, using 6 HBM3e chips, and the B100, using 8 HBM3e chips. In 2024, NVIDIA will integrate its own Arm-based CPUs and GPUs to launch the GH200 and GB200, enhancing its lineup with more specialized and powerful AI solutions. In 2025, NVIDIA will add the X100 chip.
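From the stack counts above, total on-package memory scales directly with the number of HBM3e stacks. The per-stack figures in this sketch (24 GB and roughly 1.2 TB/s for an 8hi HBM3e stack) are assumptions for illustration; shipping products may expose less than the raw total.

```python
# Quick sketch of aggregate HBM capacity and bandwidth from stack counts.
# Per-stack figures are assumptions for 8hi HBM3e, not official product specs.

PER_STACK_CAPACITY_GB = 24      # assumed 8hi HBM3e stack
PER_STACK_BANDWIDTH_TBPS = 1.2  # assumed: ~1024-bit bus at ~9.6 GT/s

configs = {"H200 (6 HBM3e stacks)": 6, "B100 (8 HBM3e stacks)": 8}

for name, stacks in configs.items():
    capacity = stacks * PER_STACK_CAPACITY_GB
    bandwidth = stacks * PER_STACK_BANDWIDTH_TBPS
    print(f"{name}: ~{capacity} GB capacity, ~{bandwidth:.1f} TB/s aggregate bandwidth")
```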

AMD’s 2024 focus is on the MI300 series with HBM3, transitioning to HBM3e for the next-gen MI350. The company is expected to start HBM verification for MI350 in 2H24, with a significant product ramp-up projected for 1Q25.

HBM4 in 2026

HBM4 is expected to launch in 2026, with enhanced specifications and performance tailored to future products from NVIDIA and other cloud service providers (CSPs). Driven by the push toward higher speeds, HBM4 will mark the first use of a 12nm process wafer for its bottommost logic die (base die), to be supplied by foundries. This signifies a collaborative effort between foundries and memory suppliers for each HBM product, reflecting the evolving landscape of high-speed memory technology.

With the push for higher computational performance, HBM4 is set to expand from the current 12-layer (12hi) to 16-layer (16hi) stacks, spurring demand for new hybrid bonding techniques. HBM4 12hi products are set for a 2026 launch, with 16hi models following in 2027.
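The capacity benefit of taller stacks is straightforward: capacity per stack is roughly layer count times per-die density. The 24 Gb per-die figure below is an assumed density for illustration only.

```python
# Simple capacity scaling with stack height, assuming a fixed per-die density.
# The 24 Gb (3 GB) per-die figure is an assumption for illustration.

GBIT_PER_DIE = 24  # assumed DRAM die density in gigabits

for layers in (8, 12, 16):
    capacity_gb = layers * GBIT_PER_DIE / 8
    print(f"{layers}hi stack: {capacity_gb:.0f} GB per stack")
```

Moving from 12hi to 16hi stacks is also what pushes the industry toward hybrid bonding, since thinner dies and finer interconnects are needed to keep the taller stack within package height limits.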

TrendForce notes a significant shift toward customization demand in the HBM4 market. Buyers are initiating custom specifications, moving beyond traditional layouts adjacent to the SoC, and exploring options like stacking HBM directly on top of the SoC. While these possibilities are still being evaluated, TrendForce anticipates a more tailored approach for the future of the HBM industry.
