HBM, a life and death situation?

With the surge in popularity of ChatGPT and the broader boom in generative AI, Nvidia is growing at an unprecedented pace. That growth has not only driven a boom in GPUs, it has also kept the heat on High Bandwidth Memory (HBM), which plays a crucial role.

Following Micron's and SK Hynix's recent statements that their HBM production capacity for this year is fully booked, Micron and Samsung have also introduced new HBM products in hopes of securing a share of this booming market. The former has brought out memory that will be used in Nvidia's H200 and says it will begin sampling a 36 GB 12-Hi HBM3E product in March 2024, while the latter says its newly announced HBM3E 12H improves performance and capacity by more than 50%.

It is clear that competition in HBM is becoming increasingly fierce, and HBM has become a key factor in determining the fate of AI chips. This is why Timothy Prickett Morgan argues that whoever controls HBM controls AI training.

Here is the full text shared by Timothy Prickett Morgan:

What is the most important factor driving the development of Nvidia's data center GPU accelerators in 2024?

Is it the upcoming "Blackwell" B100 architecture? Are we sure that this architecture will provide a leap in performance over the current "Hopper" H100 and its fat-memory sibling, the H200? No.


Is it the company's ability to get millions of H100 and B100 GPU chips back from its foundry partner, TSMC? No, it is not.

Is it Nvidia's AI Enterprise software stack, its CUDA programming model, and its hundreds of libraries? Indeed, at least some of this software (if not all of it) is the de facto standard for AI training and inference. But no, it is not that either.

While all of these are undoubtedly huge advantages, and they are the advantages many competitors are focusing on, the most important factor driving Nvidia's business in 2024 comes down to money. Specifically: Nvidia ended its fiscal 2024 year in January with a little under $26 billion in cash and investments in the bank. If this fiscal year goes as expected, revenue will exceed $100 billion, with more than 50% of it falling through to net profit. Even after paying taxes, funding a huge R&D operation, and covering the company's normal operating expenses, that will add roughly $50 billion to its treasury.

You can do a lot with $75 billion or more, and one of those things is not having to worry too much about the huge sums needed to purchase the HBM stacked DRAM memory that goes into data center-class GPUs. This memory is getting faster, denser (in terms of gigabits per chip), and fatter (in terms of bandwidth and gigabyte capacity) at a fairly good clip, but its rate of improvement has not kept up with the needs of AI accelerators. With Micron Technology joining SK Hynix and Samsung in the ranks of suppliers, the supply of High Bandwidth Memory (HBM) has improved, along with its feeds and speeds. Still, we strongly suspect that supply will not meet demand and that the price of HBM memory will keep rising, which in turn drives up, to a certain extent, the price of the GPU accelerators it goes into.

AMD has $5.78 billion in cash and investments, which does not leave a lot of idle funds. And while Intel has slightly more than $25 billion in the bank, it has foundries to build, which is very expensive indeed (a leading-edge fab currently runs somewhere between $15 billion and $20 billion a pop). So it cannot splurge on HBM memory either.

Another factor working in favor of Nvidia's GPU accelerator business is that during the GenAI boom, customers have been willing to pay almost any price for hundreds, thousands, or even tens of thousands of data center GPUs. We believe the street price of the original "Hopper" H100 announced in March 2022, in the SXM configuration with 80 GB of HBM3 memory running at 3.35 TB/s, is north of $30,000. We do not know what the H100 with 96 GB of memory running at 3.9 TB/s costs, but we can speculate about what Nvidia will charge for the H200 device, which has 141 GB of HBM3E memory running at 4.8 TB/s. The H200 is based on exactly the same "Hopper" GPU as the H100, but the extra memory capacity (up 76.3%) and memory bandwidth (up 43.3%) boost performance to 1.6X to 1.9X that of the H100. Considering that the additional capacity and bandwidth mean fewer GPUs and less power are needed to train a given model on a static dataset, we believe Nvidia could easily charge 1.6X to 1.9X more for the H200 than it did for the original H100.
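To make those percentages concrete, here is a minimal arithmetic sketch in Python that reproduces the H100-to-H200 memory comparison using the figures quoted above; the variable names are ours, and the price lines simply apply the speculative 1.6X to 1.9X multiplier to the assumed $30,000 H100 street price.

```python
# Rough H100-to-H200 memory uplift arithmetic, using the figures quoted above.
h100_capacity_gb, h100_bandwidth_tbps = 80, 3.35     # H100 SXM with HBM3
h200_capacity_gb, h200_bandwidth_tbps = 141, 4.80    # H200 with HBM3E

capacity_uplift = h200_capacity_gb / h100_capacity_gb - 1
bandwidth_uplift = h200_bandwidth_tbps / h100_bandwidth_tbps - 1
print(f"Capacity uplift:  {capacity_uplift:.1%}")    # ~76.3%
print(f"Bandwidth uplift: {bandwidth_uplift:.1%}")   # ~43.3%

# Speculative pricing: what charging in proportion to the claimed 1.6X to 1.9X
# performance gain would mean on a ~$30,000 H100 street price (an assumption).
h100_street_price = 30_000
for perf_gain in (1.6, 1.9):
    print(f"{perf_gain}X performance -> ~${h100_street_price * perf_gain:,.0f}")
```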

Golden Rule: Those who have the gold make the rules.

We are not saying that this will happen when the H200 starts shipping in the second quarter. (We believe Nvidia means calendar quarters here, not its own fiscal quarters.) We are just saying that such a move would be logical. A lot depends on what AMD charges for its "Antares" Instinct MI300X GPU accelerator, which has 192 GB of HBM3 running at 5.3 TB/s. The MI300X has more raw floating-point and integer capability, 36.2% more HBM capacity than Nvidia's H200, and 10.4% more bandwidth than the H200.

You can bet Elon Musk's last dollar that AMD is in no mood to do anything other than charge as much as it can for the MI300X, and there are even suggestions that the company is working on an upgrade to fatter, faster HBM3E memory to stay competitive with Nvidia. The MI300 uses HBM3 in eight-high DRAM stacks, and its memory controllers presumably have enough signaling and bandwidth headroom that faster twelve-high HBM3E stacks could be swapped in. That would mean a 50% increase in capacity and possibly a 25% increase in bandwidth, which is to say 288 GB of HBM3E capacity and around 6.5 TB/s of bandwidth per MI300X.
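As a sanity check on that speculation, here is a small sketch, which is our own arithmetic, that compares the MI300X with the H200 and then applies the assumed 50% capacity and 25% bandwidth bumps from a twelve-high HBM3E swap; "MI350X" is nothing more than a hypothetical label for the upgraded part.

```python
# MI300X as shipped (HBM3, eight-high stacks) versus the H200, per the figures above.
mi300x_gb, mi300x_tbps = 192, 5.3
h200_gb, h200_tbps = 141, 4.8
print(f"MI300X vs H200 capacity:  +{mi300x_gb / h200_gb - 1:.1%}")      # ~+36.2%
print(f"MI300X vs H200 bandwidth: +{mi300x_tbps / h200_tbps - 1:.1%}")  # ~+10.4%

# Hypothetical HBM3E upgrade (call it "MI350X"): assume twelve-high stacks
# add 50% capacity and perhaps 25% bandwidth.
mi350x_gb = mi300x_gb * 1.50        # 288 GB
mi350x_tbps = mi300x_tbps * 1.25    # ~6.6 TB/s (rounded to ~6.5 above)
print(f"Hypothetical MI350X: {mi350x_gb:.0f} GB, ~{mi350x_tbps:.1f} TB/s")
```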

Presumably such a pumped-up chip, which we might call the MI350X, would get considerably more real work out of its peak flops, perhaps a lot more, just as happened when Nvidia jumped from the H100 to the H200.

It is against this backdrop that we want to talk about what is happening in the HBM field. We will start with SK Hynix, which has demonstrated a sixteen-high HBM3E stack that provides 48 GB of capacity and 1.25 TB/s of bandwidth per stack. Plugged into a device like the MI300X, with its eight memory controllers, that would work out to 384 GB of memory and 10 TB/s of aggregate bandwidth.

With numbers like these, you no longer need to use the CPU as an extended memory controller for a lot of workloads...

We have not yet seen SK Hynix formally introduce its sixteen-high HBM3E memory, nor do we know when it will be available. Last August, SK Hynix demonstrated its fifth generation of HBM memory and its first generation of HBM3E memory, which is said to provide 1.15 TB/s of bandwidth per stack. As the HBM roadmap compiled by TrendForce shows, the expectation is for 24 GB and 36 GB capacities, which implies eight-high and twelve-high stacks.

Back in August, Nvidia was clearly set to be a big customer for these chips, and there were rumors that SK Hynix's 24 GB HBM3E memory would be used in the upcoming "Blackwell" B100 GPU accelerator. If so, the six memory controllers on each Blackwell GPU chiplet would yield 144 GB of capacity, and if the B100 package has two GPU chiplets as expected, that would mean a maximum of 288 GB of capacity and 13.8 TB/s of bandwidth. It is hard to say what the yields will be, and perhaps only five of the six stacks will be usable. It is also possible, though we hope not, that the B100 will not look like one GPU to system software but rather two, just as the two-chiplet AMD "Aldebaran" MI250X did, and unlike the MI300X, whose eight smaller GPU chiplets are ganged together to look like a single GPU to system software. We will see what happens there.
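The stack math in the last few paragraphs is just multiplication, but it is easy to lose track of; here is a small helper, ours alone, with the configurations taken from the figures above and the B100 layout still being a rumor, that reproduces the aggregates.

```python
def aggregate(stacks: int, gb_per_stack: float, tbps_per_stack: float) -> tuple[float, float]:
    """Return total capacity (GB) and bandwidth (TB/s) for a device with that many HBM stacks."""
    return stacks * gb_per_stack, stacks * tbps_per_stack

# A device with eight memory controllers (the MI300X layout) fitted with SK Hynix's
# demonstrated sixteen-high HBM3E: 48 GB and 1.25 TB/s per stack.
cap, bw = aggregate(8, 48, 1.25)
print(f"Sixteen-high HBM3E on eight controllers: {cap:.0f} GB, {bw:.1f} TB/s")   # 384 GB, 10.0 TB/s

# The rumored "Blackwell" B100: two GPU chiplets with six stacks each, using the
# first-generation 24 GB HBM3E at 1.15 TB/s per stack.
cap, bw = aggregate(2 * 6, 24, 1.15)
print(f"Speculated two-chiplet B100: {cap:.0f} GB, {bw:.1f} TB/s")               # 288 GB, 13.8 TB/s
```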

Micron Technology is a latecomer to the HBM field, but given tight supply and strong demand, the company is undoubtedly a welcome addition to it. The company said today that it is starting production of its first HBM3E memory, an eight-high stack with 24 GB of capacity, and added that this memory will be part of the H200 GPU. We told you about the Micron HBM3E variant last July; it has a pin speed of 9.2 Gb/s and provides 1.2 TB/s of memory bandwidth per stack. Micron also claims that its HBM3E consumes 30% less power than "competitive products," presumably meaning a strict HBM3E-to-HBM3E comparison.

Micron also said that it has started sampling its twelve-high 36 GB HBM3E variant, which will run at speeds exceeding 1.2 TB/s; it did not say how much faster than 1.2 TB/s that will be.

Also today, Samsung launched its twelve-high HBM3E stack, which is likewise its fifth-generation HBM product, codenamed "Shinebolt."

Shinebolt replaces the "Icebolt" HBM3 memory launched last year. Icebolt stacked DRAM provides 819 GB/s of bandwidth in a twelve-high stack with 24 GB of capacity. Shinebolt HBM3E provides 1.25 TB/s of bandwidth in a 36 GB twelve-high stack, just like SK Hynix's twelve-high HBM3E.
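For what it is worth, that generational jump is easy to check against the "more than 50%" improvement Samsung cites; this is just our back-of-the-envelope ratio using the per-stack figures above.

```python
# Samsung Icebolt (HBM3, twelve-high) versus Shinebolt (HBM3E, twelve-high), per the text.
icebolt_gb, icebolt_gbps = 24, 819          # 24 GB, 819 GB/s per stack
shinebolt_gb, shinebolt_gbps = 36, 1250     # 36 GB, 1.25 TB/s per stack

print(f"Capacity:  +{shinebolt_gb / icebolt_gb - 1:.0%}")      # +50%
print(f"Bandwidth: +{shinebolt_gbps / icebolt_gbps - 1:.0%}")  # ~+53%
```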

Samsung added in its announcement: "When used in AI applications, it is estimated that, compared to HBM3 8H, the average speed of AI training can be increased by 34%, while the number of concurrent users of inference services can be expanded by more than 11.5 times." Samsung noted that this is based on internal simulations, not actual AI benchmarks.

Samsung's Shinebolt HBM3E 12H is sampling now and is expected to be in volume production by the end of June.

These twelve-high and sixteen-high HBM3E stacks are pretty much all we will have until HBM4 arrives in 2026. People might hope for HBM4 to appear in 2025, and there is undoubtedly pressure to pull the roadmap in, but it seems unlikely. The memory interface of HBM4 is expected to double to 2,048 bits. From HBM1 through HBM3E the interface has been 1,024 bits wide, with signaling rates rising from 1 Gb/s on the initial HBM memory designed by AMD and SK Hynix and delivered in 2013 to 9.2 Gb/s today. Doubling the interface width allows twice the bandwidth at a given signaling rate; a lot of memory could hang off that wider interface and deliver a given amount of bandwidth at half the clock speed, with bandwidth climbing again as clocks ramp back up. Alternatively, HBM4 could launch at 9.2 Gb/s per pin from the start, and we would just have to pay the price in watts.
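The relationship between interface width, pin signaling rate, and per-stack bandwidth is simple enough to write down; the sketch below is our own illustration, not vendor data, of why a 2,048-bit HBM4 interface could deliver today's HBM3E bandwidth at half the pin speed, or roughly double it at the same pin speed.

```python
def stack_bandwidth_tbps(interface_bits: int, pin_gbps: float) -> float:
    """Per-stack bandwidth in TB/s: interface width (bits) times pin rate (Gb/s), over 8 bits per byte."""
    return interface_bits * pin_gbps / 8 / 1000

# HBM3E today: a 1,024-bit interface at 9.2 Gb/s per pin.
print(f"1,024-bit @ 9.2 Gb/s: {stack_bandwidth_tbps(1024, 9.2):.2f} TB/s")   # ~1.18

# HBM4 with a 2,048-bit interface: the same bandwidth at half the pin speed,
# or roughly double the bandwidth at the same pin speed (at a cost in watts).
print(f"2,048-bit @ 4.6 Gb/s: {stack_bandwidth_tbps(2048, 4.6):.2f} TB/s")   # ~1.18
print(f"2,048-bit @ 9.2 Gb/s: {stack_bandwidth_tbps(2048, 9.2):.2f} TB/s")   # ~2.36
```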

Micron's roadmap indicates that HBM4 will offer capacities of 36 GB and 64 GB per stack, driving 1.5 TB/s to 2 TB/s, so it looks like a mix of wide-and-slow and wide-and-faster parts that will not fully exploit the doubled interface, at least in terms of bandwidth, when it first ships. It seems that doubling the width will roughly double both capacity and bandwidth. HBM4 is expected to top out at sixteen-high DRAM stacks, and that's it.

In our dream world for 2026, HBM4 would have a 2,048-bit interface, something like 11.6 Gb/s signaling on the pins, twenty-four-high DRAM stacks, and DRAM that is 33.3% denser (4 GB per die instead of 3 GB), which works out to roughly 3.15 TB/s of bandwidth and 96 GB of capacity per stack. Oh, and let's go crazy: assume a GPU complex made up of a dozen chiplets, each with its own HBM4 memory controller. That would provide an aggregate of 37.8 TB/s of memory bandwidth and 1,152 GB of capacity per GPU device.

To put that in perspective: according to Nvidia, a 175-billion-parameter GPT-3 model requires 175 GB of capacity for inference, so the theoretical GPU memory we are talking about could handle inference on a model of roughly 1.15 trillion parameters. For GPT-3 training, 2.5 TB of memory is needed to hold the data corpus; with Hoppers carrying 80 GB of HBM3 each, you would need 32 of them to do the job. But the device we are describing has 14.4X the capacity, so it can hold a correspondingly larger corpus, and its bandwidth is 11.3X higher as well.
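Putting numbers on that thought experiment, the sketch below aggregates the dream device and sizes it against the GPT-3 figures cited above; every HBM4 number in it is the speculative one from the preceding paragraphs, not a product specification.

```python
import math

# Speculative 2026 "dream" device: a dozen GPU chiplets, each with its own
# HBM4 stack at the figures assumed above (twenty-four-high, 96 GB, ~3.15 TB/s).
chiplets = 12
gb_per_stack, tbps_per_stack = 96, 3.15

device_gb = chiplets * gb_per_stack                # 1,152 GB
device_tbps = chiplets * tbps_per_stack            # 37.8 TB/s
print(f"Per device: {device_gb} GB, {device_tbps:.1f} TB/s")

# Versus the 80 GB / 3.35 TB/s H100.
print(f"Capacity:  {device_gb / 80:.1f}X the H100")      # ~14.4X
print(f"Bandwidth: {device_tbps / 3.35:.1f}X the H100")  # ~11.3X

# GPT-3 sizing, per the Nvidia figures cited above: 175 GB of memory for
# 175B-parameter inference (about 1 GB per billion parameters), and 2.5 TB
# of memory to hold the training corpus.
print(f"Inference: ~{device_gb / 1000:.2f} trillion parameters per device")  # ~1.15T
print(f"Training corpus on 80 GB Hoppers: {math.ceil(2_500 / 80)} GPUs")     # 32
```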

Notice that we have not said anything about the flops on these dozen GPU chiplets? In most cases it is very tricky to run anything at more than 80% utilization, especially when the device may be doing different operations at different precisions. What we want is to bring the ratio of flops to bits per second of bandwidth back into balance. We want to build a twelve-cylinder engine with enough fuel injectors to actually feed the beast.

Our guess is that the 80 GB of HBM3 memory on the H100 is about one third of what it ideally should be, and its bandwidth is about one third of ideal as well. This is a way to maximize GPU chip sales and revenue, as Nvidia has clearly proven, but it is not the way to build a balanced compute engine. It is just like Intel putting only half the DRAM memory controllers it should have on its X86 chips and selling us all two sockets of middle-bin parts, which has always been the "correct" answer for general-purpose computing in the data center. We also want more memory capacity and bandwidth.

So, if bandwidth goes up by 11.3X on this conceptual Beast GPU accelerator, compute might only need to go up by 4X over the original H100. On the tensor cores, the H100 is rated at 67 teraflops at FP64 precision and 1.98 petaflops at FP8 precision (without sparsity). That would put this TP100 GPU complex, as we might call it, at 268 teraflops FP64 and 7.92 petaflops FP8, with each GPU chiplet delivering one third the performance of the H100 chip while perhaps being one quarter to one fifth its size, depending on the process technology used. Say it is TSMC 2N or Intel 14A, compared with the TSMC 4N used for the real H100. After all, it is 2026 we are talking about.
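The compute side of that balance is just scaling as well; here is the arithmetic as we read it, with "TP100" being the hypothetical name used above and the 4X compute factor being pure speculation.

```python
# H100 tensor core ratings (without sparsity), as quoted above.
h100_fp64_tflops = 67
h100_fp8_pflops = 1.98

# Hypothetical "TP100": 11.3X the bandwidth of an H100, but only 4X the compute.
compute_scale, chiplets = 4, 12
tp100_fp64_tflops = h100_fp64_tflops * compute_scale     # 268 TF
tp100_fp8_pflops = h100_fp8_pflops * compute_scale       # 7.92 PF
print(f"TP100: {tp100_fp64_tflops} TF FP64, {tp100_fp8_pflops:.2f} PF FP8")

# Each of the dozen chiplets would then deliver about a third of an H100.
print(f"Per chiplet: {tp100_fp64_tflops / chiplets:.1f} TF FP64 "
      f"(~{tp100_fp64_tflops / chiplets / h100_fp64_tflops:.2f} of an H100)")
```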

This is the kind of beast we would want to build. If we had $26 billion in the bank and the prospect of more than $50 billion more on the way, this is what we would do: stuff it with a lot of HBM memory and the compute engines to go with it.

It is difficult to say how much all of this would cost. You can't call up Fry's Electronics and ask what the market price of HBM4 memory will be in 2026. For one thing, Fry's is dead. For another, even now we do not have good visibility into what GPU and other matrix engine makers pay for HBM2e, HBM3, and HBM3E memory. Everyone knows, or thinks they know, that HBM memory and any interposer used to link that memory to the devices are the two main costs of modern AI training and inference engines. (Except, of course, for those who mix on-chip SRAM and plain DRAM.)

On the open market, the biggest, fattest, fastest 256 GB DDR5 memory modules for servers, running at 4.8 GHz, cost about $18,000, or roughly $70 per GB. Skinnier modules that only scale up to 32 GB cost just $35 per GB. On that basis, HBM2e comes in at around $110 per GB, the "more than 3X" premium Nvidia has shown in its own charts, which puts a 96 GB unit at around $10,600. It is hard to say how much the moves to HBM3 and HBM3E add to the "street price" of the device, but if the jump to HBM3 only adds 25%, then on an H100 with 80 GB and a street price of about $30,000, the HBM3 accounts for roughly $8,800 of it. Switching to 96 GB of HBM3E, with the technology premium rising another 25% on top of the extra 16 GB of memory, would raise the memory to a "street price" of around $16,500, and the street price of the 96 GB H100 should then be about $37,700.

It would be interesting to hear what the rumors say about the price of the H200, with its 141 GB of capacity (not 144 GB, for some reason). If this memory price laddering holds (and remember, these are wild estimates), then the 141 GB of HBM3E alone would be worth about $25,000, and at that rate the "street price" of the H200 would be around $41,000. (Note: this is not what we think Nvidia pays for HBM3 and HBM3E memory. It is not a bill-of-materials cost, but rather the share of the price paid by the end user that we attribute to the memory.)
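To show how these "wild estimates" hang together, the sketch below lays out the per-gigabyte price ladder implied above: roughly $110 per GB for HBM2e, with an assumed 25% premium at each step to HBM3 and then to HBM3E. These are rough guesses for illustration, not quoted prices.

```python
# Rough per-GB "street price" ladder implied by the estimates above.
hbm2e_per_gb = 110.0                     # ~3X the $35/GB of skinny DDR5 modules
hbm3_per_gb = hbm2e_per_gb * 1.25        # assume a 25% premium per generation
hbm3e_per_gb = hbm3_per_gb * 1.25        # ~$172/GB

print(f"HBM2e: ~${hbm2e_per_gb:.0f}/GB, HBM3: ~${hbm3_per_gb:.0f}/GB, HBM3E: ~${hbm3e_per_gb:.0f}/GB")
print(f"96 GB of HBM2e:  ~${96 * hbm2e_per_gb:,.0f}")    # ~$10,600 in round numbers
print(f"96 GB of HBM3E:  ~${96 * hbm3e_per_gb:,.0f}")    # ~$16,500
print(f"141 GB of HBM3E: ~${141 * hbm3e_per_gb:,.0f}")   # ~$24,200, call it ~$25,000
```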

We do not think the generational premium is much more than about 25%, because if it were, upgrading the memory to HBM3 and then to HBM3E would push the memory cost up through the rumored street prices of Nvidia's GPUs.

Remember, this is just a thought experiment to illustrate how HBM memory pricing controls the number of GPUs that Nvidia and AMD can put into the field, not the other way around. The memory tail is wagging the GPU dog. Memory capacity and bandwidth are increasingly the story with the H200, and if Nvidia charged only a nominal fee for the additional memory and its extra speed, not only would the real efficiency of the device go up, so would its price/performance. But if Nvidia simply prices these beefier H100s and H200s so that the performance gains and the memory gains cancel out, then fewer devices may be needed, but more money will be required for each of them.

To be honest, we do not know what Nvidia will do, nor do we know what AMD will do once the MI300 gets its HBM3E upgrade. With Micron entering the field, the number of HBM suppliers has grown by 50%, and SK Hynix and Samsung are doubling their production. Those are big numbers, but compared with GPUs and the demand for GPUs, and the memory that goes onto them, which has arguably grown by more than 3X, it is not enough. This is not an environment in which prices come down. In this environment, suppliers raise the prices of more advanced compute engines and their memory, and the HBM supply keeps getting stretched as thin as possible.

This is why, as long as the Nvidia platform remains the preferred choice, the one who can afford to pay top dollar for HBM memory (that is, Nvidia co-founder and CEO Jensen Huang) gets to set the pace and the price of AI training.

In other words, for GPUs and HBM alike, it is a do-or-die situation.
