NVIDIA's Latest Chip Roadmap

It is widely known that, with the surge in generative AI, Nvidia is dominating the data center field, and that dominance is showing up in its financial results. According to the company's published figures for the second quarter ending July 30, 2023, Nvidia's revenue was $13.51 billion, up 88% from the previous quarter and up 101% from the same period last year.

Many of the expectations for Nvidia, however, are based on its current chips and hardware. Some analysts estimate that, once enterprise AI and its DGX Cloud products are included, Nvidia's data center market could be at least three times the size of its gaming market, and perhaps as much as 4.5 times.


UBS analyst Timothy Arcuri has said that Nvidia's current revenue from DGX Cloud is about $1 billion, but after talking with customers, he believes the business could eventually bring in as much as $10 billion per year. His reasoning is that Nvidia can keep adding products to DGX Cloud, including pre-trained models and access to H100 GPUs. These leading GPUs, he said, are still "very difficult" to obtain, can scale up and down as needed, and integrate "basically seamlessly" with existing cloud or on-premises infrastructure.

Against this backdrop, Nvidia recently revealed a product roadmap that includes new parts such as the H200, B100, X100, B40, X40, GB200, GX200, GB200NVL, and GX200NVL, a roadmap that matters greatly for the company's future.

Data Center Roadmap

According to the roadmap disclosed by ServeTheHome, a major change is that Nvidia now separates its Arm-based products from its x86-based products, with Arm taking the lead. For reference, ordinary customers cannot even buy NVIDIA Grace or Grace Hopper today, so showing them in the 2023-2025 roadmap stack is a notable detail. Here is the roadmap NVIDIA presented:

On the Arm side, Nvidia plans to launch the GH200NVL in 2024, followed quickly by the GB200NVL later in 2024, and then the GX200NVL in 2025. We have already seen an x86 NVL part in the NVIDIA H100 NVL, but these upcoming parts are Arm-based solutions. Non-NVL versions are also planned.

When NVIDIA announced the new dual-configuration Hopper model with 144GB of HBM3e (which may eventually become the GH200NVL), we covered the GH200 with 141GB/144GB of memory (non-NVL). Compared with the current generation, the dual configuration is said to deliver 3.5 times the memory capacity and 3 times the bandwidth, in a product combining 144 Arm Neoverse cores, 8 petaflops of AI performance, and 282GB of the latest HBM3e memory. The GB200 will become the next-generation accelerator in 2024, and the GX200 will take over in 2025.
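The 3.5x/3x claims line up if the comparison point is the 80GB H100 SXM; here is a quick back-of-the-envelope check, with the baseline figures being our assumption rather than something stated in the roadmap:

```python
# Quick check of the claimed ratios. Assumptions (ours, not the
# article's): the "current generation" baseline is the 80GB H100 SXM
# (~3.35 TB/s of HBM3 bandwidth), and the dual configuration pairs two
# 141GB GH200 superchips with ~10 TB/s of combined HBM3e bandwidth.

dual_capacity_gb = 2 * 141      # = 282GB, matching the announced figure
dual_bandwidth_tb_s = 10.0      # combined HBM3e bandwidth (assumed)

h100_capacity_gb = 80           # assumed baseline: H100 SXM
h100_bandwidth_tb_s = 3.35      # assumed baseline: H100 SXM HBM3

print(f"capacity:  {dual_capacity_gb / h100_capacity_gb:.1f}x")       # ~3.5x
print(f"bandwidth: {dual_bandwidth_tb_s / h100_bandwidth_tb_s:.1f}x") # ~3.0x
```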

Aiming at the x86 market, Nvidia is expected to launch the H200 in 2024, an update to the Hopper architecture with more memory. The B100 and B40 are parts on the next-generation architecture, followed by the X100 and X40 in 2025. Given that the B40 and X40 are positioned in the "enterprise" track, and that the current L40S is a PCIe card, these may be PCIe cards as well.

On the networking side, both InfiniBand and Ethernet are expected to move from 400Gbps to 800Gbps in 2024, then reach 1.6Tbps in 2025. Given that we studied Broadcom Tomahawk 4 switches at the beginning of 2023 and saw partners' 800G Broadcom Tomahawk 5 switches this year, NVIDIA's Ethernet portfolio feels significantly behind. Broadcom's 800G parts from 2022-2023 appear to align with NVIDIA's 2024 upgrade: NVIDIA announced Spectrum-4 in mid-2023, while Tomahawk 5 was announced about 21-22 months ago. In the industry, there is usually a significant gap between a chip's announcement and volume production.

In InfiniBand, by contrast, NVIDIA is competing only with itself. Notably, we do not see an NVSwitch/NVLink roadmap in this roadmap.

Other AI hardware companies should be intimidated by NVIDIA's enterprise AI roadmap. For AI training and inference, it means an updated Hopper in 2024, a transition to the Blackwell generation later in 2024, and yet another architecture in 2025.

On the CPU side, the update cadence has recently accelerated, with an escalating core-count battle on x86: the core count of Intel's top Xeon, for example, is expected to grow more than tenfold between the second quarter of 2021 and the second quarter of 2024. NVIDIA seems to be keeping pace in the data center. For AI startups building chips, the pace of NVIDIA's new roadmap makes this a race.

For Intel, AMD, and perhaps Cerebras, the targets will keep moving as NVIDIA sells large, high-margin chips. NVIDIA is also placing Arm-based solutions in its top tier, so it can capture high margins not only on GPUs/accelerators but on CPUs as well.

The one interesting laggard seems to be Ethernet, which is strange.

Precise Supply Chain Control

According to SemiAnalysis, one important reason NVIDIA stands out in the AI chip market is not only its positioning in hardware and software but also its control over the supply chain. NVIDIA has repeatedly demonstrated that it can creatively increase supply during shortages, and it is willing to commit to non-cancellable orders and even pay in advance to secure huge supply. Nvidia currently has $11.15 billion in purchase commitments, capacity obligations, and inventory obligations, and has additionally signed prepaid supply agreements worth $3.81 billion. On this measure alone, no other supplier can match it, and those that cannot will be left out of the AI frenzy now underway.

Since Nvidia's founding, Jensen Huang has actively cultivated its supply chain to support the company's enormous growth ambitions. Huang once recounted an early meeting with TSMC founder Morris Chang:

"In 1997, when Zhang Zhongmou and I met, Nvidia's revenue that year was $27 million. We had 100 people, and then we met. You may not believe this, but Zhang Zhongmou used to make sales calls. You used to visit customers regularly, right? You would come in to visit customers, and I would explain to Zhang Zhongmou what Nvidia did, you know, I would explain how big our chip size needs to be, and it will become bigger and bigger every year. You would regularly come back to Nvidia and let me tell this story again to make sure I need so many wafers, next year, we started to cooperate with TSMC. Nvidia did it, I think it was 127 million, and from then on, we grew nearly 100% every year, until now."

Morris Chang initially did not believe that Nvidia needed so many wafers, but Huang persisted and rode the explosive growth of the gaming industry at the time. Nvidia succeeded through bold supply commitments, and those bets usually paid off. It did have to write down billions of dollars of inventory from time to time, but on net it still benefited from over-ordering.

If something works, why change it?

Most recently, Nvidia has once again locked up most of the HBM supply from SK Hynix, Samsung, and Micron; HBM is another core component that GPUs and AI chips are all chasing. Nvidia has placed very large orders with all three HBM suppliers and is crowding out the supply of everyone except Broadcom/Google.

In addition, Nvidia has bought up most of TSMC's CoWoS supply. It did not stop there: it also went out and secured capacity at Amkor.

Nvidia also buys up many of the downstream components needed for HGX boards and servers, such as retimers, DSPs, and optics. Suppliers who resist Nvidia's demands get the carrot-and-stick treatment: on the one hand, they can receive seemingly unimaginable order volumes from Nvidia; on the other, they face being designed out of Nvidia's existing supply chain. Nvidia only resorts to commitments and non-cancellable orders when a supplier is critical and cannot be designed out or multi-sourced.

Every supplier seems to think it is the AI winner, partly because Nvidia has placed large orders with each of them and each believes it has won most of the business. In fact, Nvidia's growth has outpaced even their imaginations.

Returning to the market dynamics above: although Nvidia aims to exceed $70 billion in data center sales next year, only Google has enough upstream capacity for a meaningful volume of its own units, at a scale of more than 1 million. Even after AMD's latest capacity adjustments, its total AI capacity remains very modest, topping out at several hundred thousand units.

Shrewd Business Plan

As is well known, Nvidia is leveraging the huge demand for GPUs to upsell and cross-sell to customers. Several supply chain sources told SemiAnalysis that Nvidia prioritizes allocation based on a variety of factors, including but not limited to: whether a customer is multi-sourcing, whether it plans to build its own AI chips, and whether it is buying Nvidia's DGX systems, NICs, switches, and optics.

SemiAnalysis points out that infrastructure providers such as CoreWeave, Equinix, Oracle, Applied Digital, Lambda Labs, Omniva, Foundry, Crusoe Cloud, and Cirrascale receive allocations much closer to their potential demand than large technology companies like Amazon do.

According to SemiAnalysis, Nvidia's bundling has in fact been very successful. Although it was previously a small-scale optical transceiver supplier, its transceiver business doubled in the first quarter and is expected to ship more than $1 billion worth next year, far outpacing the growth of its GPU or networking chip businesses.

Moreover, these strategies are well thought out. For example, the only way today to achieve 3.2T networking with reliable RDMA/RoCE on Nvidia systems is to use Nvidia's own NICs, mainly because Intel, AMD, and Broadcom are uncompetitive here and still stuck at 200G.
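For context, the 3.2T figure falls out of simple arithmetic if each of the eight GPUs in an HGX node gets its own 400Gbps NIC; a minimal sketch, with the one-NIC-per-GPU layout being our assumption rather than something stated in the article:

```python
# Where 3.2T comes from, assuming (ours) the common HGX layout of one
# 400Gbps ConnectX-7 NIC per GPU on an 8-GPU baseboard.

GPUS_PER_NODE = 8
NVIDIA_NIC_GBPS = 400      # ConnectX-7
COMPETITOR_NIC_GBPS = 200  # where Intel/AMD/Broadcom top out, per the article

nvidia_total = GPUS_PER_NODE * NVIDIA_NIC_GBPS
print(f"NVIDIA node:   {nvidia_total / 1000} Tbps")                          # 3.2 Tbps
print(f"200G NIC node: {GPUS_PER_NODE * COMPETITOR_NIC_GBPS / 1000} Tbps")   # 1.6 Tbps
```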

In SemiAnalysis's view, Nvidia is also using the opportunity to manage its supply chain so that lead times on its 400G InfiniBand NICs are significantly shorter than on its 400G Ethernet NICs. Remember, the two NICs (ConnectX-7) share the same chip and board design; the difference comes down to Nvidia's SKU configuration, not an actual supply chain bottleneck. This pushes companies toward Nvidia's more expensive InfiniBand switches instead of standard Ethernet switches. Nvidia makes an exception if you buy its Spectrum-X Ethernet stack with a Bluefield-3 DPU in NIC mode.

And it does not end there; just look at the supply chain's craze for L40 and L40S GPUs.

SemiAnalysis reports that Nvidia is pushing L40S sales to original equipment manufacturers, who are under pressure to buy more L40S units in order to win better H100 allocations. This is the same game Nvidia plays in the PC space, where notebook makers and AIB partners must buy large volumes of G106/G107 (mid-range and low-end GPUs) to get good allocations of the scarcer, higher-margin G102/G104 (high-end and flagship GPUs).

Many people in the Taiwanese supply chain believe the L40S is better than the A100 because it has higher FLOPS. To be clear, these GPUs are not well suited to LLM inference: their memory bandwidth is less than half of the A100's, and there is no NVLink. Running LLMs on them with a good total cost of ownership is therefore nearly impossible except for very small models, since high batch sizes yield unacceptable tokens/second/user, making the theoretical FLOPS useless for LLMs in practice (see the sketch at the end of this section).

SemiAnalysis says OEMs are also under pressure to support Nvidia's MGX modular server design platform. MGX removes all the hard work of designing a server, but it also commoditizes the result, creating more competition and driving down OEM margins. Companies like Dell, HPE, and Lenovo are visibly resistant to MGX, but lower-cost, largely Taiwan-based players like Supermicro, Quanta, Asus, Gigabyte, Pegatron, and ASRock are rushing to fill the gap and commoditize low-cost "enterprise AI." The OEMs/ODMs that play along with the L40S and MGX games also receive better allocations of Nvidia's mainline GPU products.

Nvidia faces a pincer attack from chip makers on one side and system makers' in-house silicon on the other, but these moves seem to let it rest easy in the short term. It will remain the most successful "shovel seller" of the AI era.
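To make the bandwidth argument concrete, here is a minimal roofline sketch of low-batch LLM decoding, where memory bandwidth rather than FLOPS sets the ceiling. The spec figures are approximate public numbers, and the one-weight-pass-per-token simplification is ours:

```python
# Rough roofline estimate of low-batch LLM decode speed, which is
# memory-bandwidth-bound: generating one token requires streaming all
# model weights through the GPU once, so
#   tokens/sec ~= memory bandwidth / model size in bytes.

def decode_tokens_per_sec(params_b: float, bytes_per_param: float,
                          bandwidth_gb_s: float) -> float:
    """Upper-bound decode rate for a single stream, ignoring KV cache."""
    model_size_gb = params_b * bytes_per_param
    return bandwidth_gb_s / model_size_gb

GPUS = {"A100 80GB (HBM2e)": 2039, "L40S (GDDR6)": 864}  # GB/s, approximate

for name, bw in GPUS.items():
    tps = decode_tokens_per_sec(params_b=13, bytes_per_param=2, bandwidth_gb_s=bw)
    print(f"{name}: ~{tps:.0f} tokens/s for a 13B fp16 model")
# A100 80GB: ~78 tokens/s; L40S: ~33 tokens/s. The A100's ~2.4x
# bandwidth advantage translates directly into decode speed, and the
# L40S's 48GB capacity and lack of NVLink rule out larger models entirely.
```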
