In September 2011, on the last day of the Intel Developer Forum (IDF), Intel's Chief Technology Officer Justin Rattner spent roughly one minute of his hour-long presentation introducing a revolutionary technology: HMC, the Hybrid Memory Cube.
This technology, co-developed by Micron and Intel, got only a passing mention, but its significance rivals that of a processor architecture iteration: it represented another revolution in the memory industry, promising to solve, once and for all, the bandwidth problems that had long plagued DDR3.
In fact, as early as August, before IDF began, Micron fellow and chief technologist Thomas Pawlowski had introduced HMC in detail at Hot Chips. Although he did not disclose the collaboration with Intel at the time, he described HMC as an innovation in three-dimensional integrated circuits that goes beyond the processor-memory die stacking demonstrated by companies like Samsung: a completely new memory-processor interface architecture.
For Micron, HMC is the most powerful weapon to counter Samsung and Hynix, the two major Korean manufacturers.
Memory Revolution
When introducing HMC, Pawlowski questioned how backward the DRAM standards of the day had become. In his view, to keep increasing bandwidth while cutting power consumption and latency for multi-core processors, direct control of memory had to give way to some form of memory abstraction. DRAM manufacturers had always relied on an industry standards body (such as JEDEC) to agree on roughly 80 parameters that specify a DRAM device, inevitably producing a "lowest common denominator" solution.
The implication was that Micron no longer intended to sit through slow negotiations. With memory bandwidth so tight, it would develop a completely new high-bandwidth standard, break free of JEDEC's framework, and build its own camp, with Micron naturally as the leader.
In the new HMC architecture Pawlowski described, the processor communicates with memory over high-speed SerDes data links that connect to a local logic controller die at the bottom of the DRAM stack. In the prototype demonstrated at IDF, four DRAM dies are connected to the logic die with through-silicon vias (TSVs), and a stack of up to eight DRAM dies was also described. Notably, the processor in the prototype is not integrated into the stack, which avoids die-size mismatch and thermal issues.
HMC is essentially a complete DRAM module that can be mounted on a multi-chip module (MCM) or a 2.5D passive interposer, bringing it closer to the CPU. Micron also introduced a "far memory" configuration, in which some HMCs connect directly to the host while others connect to neighboring HMCs over serial links, forming a network of memory cubes.

Even from today's perspective, HMC looks advanced, and Pawlowski was clearly proud of it. He stated that HMC needs no complex memory scheduler, only a thin arbiter in front of shallow queues. HMC eliminated the heavy standardization burden at the architectural level: timing constraints no longer need to be standardized, only the high-speed SerDes interface and the physical dimensions do. That part of the specification can be adapted to the application through a customized logic IC, while the high-capacity DRAM dies remain the same across many applications.
As for the latency concerns many people raised, Pawlowski said that although HMC's serial links add slightly to system latency, overall latency actually drops significantly: its DRAM cycle time (tRC) is lower by design, and lower queue latency plus higher bank availability further shorten system latency.
He also showed specific data for the first-generation HMC prototype. Working with Intel, Micron built a 27 mm x 27 mm prototype that combines 1 Gb 50 nm DRAM arrays with a 90 nm prototype logic die. Each cube carries four 40 GB/s (gigabytes per second) links, for a total throughput of 160 GB/s per cube. The DRAM cube's total capacity is 512 MB, and the resulting design improves energy efficiency (measured in pJ/bit) by roughly a factor of three over the then-upcoming DDR4.
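The headline bandwidth figure is easy to sanity-check from the stated link count and link speed. A minimal sketch, assuming the numbers exactly as quoted (four links at 40 GB/s each):

```python
# Sanity check of the first-generation HMC prototype figures quoted above.
links_per_cube = 4
gbytes_per_link = 40          # GB/s per serial link, as stated by Micron
total_bandwidth = links_per_cube * gbytes_per_link
print(f"Aggregate bandwidth per cube: {total_bandwidth} GB/s")  # 160 GB/s
```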
HMC solved traditional DRAM's bandwidth problem and briefly became everyone's new favorite, but at its core it is an assembly of the steadily maturing through-silicon via (TSV) technology, and the credit cannot go entirely to Micron and Intel.
What is TSV? TSV, short for Through-Silicon Via, is a three-dimensional stacked packaging technology: multiple dies (or wafers) are stacked vertically, then holes are etched through them and filled with metal to form electrical connections between the layers. Compared with traditional wire-bonded multi-chip packaging, TSV greatly reduces the amount of wiring in a design and shortens interconnect length, improving speed while reducing power consumption and package volume.
As early as 1999, Japan's Association of Super-Advanced Electronics Technologies (ASET) began funding a TSV-based 3D IC project, "High-Density Electronic System Integration Technology Development," making it one of the earliest institutions to study 3D integrated circuits. In 2004, Japan's Elpida began developing TSVs on its own, and in 2006 it demonstrated a DRAM architecture stacking eight 128 Mb dies using TSV technology.
The flash memory industry took the lead in commercializing 3D stacking. Toshiba launched a NAND flash memory chip with 8 stacked dies in April 2007, while Hynix launched a NAND flash memory chip with 24 stacked dies in September of the same year.
The DRAM industry followed a little later. Elpida launched the first TSV-based DRAM in September 2009, a stacked package of eight 1 Gb DDR3 SDRAM dies. In March 2011, SK Hynix launched a 16 GB DDR3 module using TSV technology (40 nm class), and in September of the same year Samsung launched a TSV-based 3D-stacked 32 GB DDR3 module (30 nm class).

HMC, which integrated the latest TSV technology, not only won the 2011 Best New Technology Award from The Linley Group (publisher of the Microprocessor Report) but also attracted the interest of many technology companies. Samsung, Open-Silicon, ARM, HP, Microsoft, Altera, Xilinx, and others joined Micron to form the Hybrid Memory Cube Consortium (HMCC). Micron was gearing up for a far more thorough memory technology revolution.
JEDEC's Counterattack
As mentioned earlier, Micron's chief technologist Pawlowski criticized the old memory standards, and especially the JEDEC organization, as if it were an unforgivable villain whose very existence had kept memory technology from improving.
So, what is JEDEC?
The JEDEC Solid State Technology Association is a standards body for the solid-state and semiconductor industry. Its history dates back to 1958, when the Joint Electron Device Engineering Council (JEDEC) was jointly established by the Electronic Industries Association (EIA) and the National Electrical Manufacturers Association (NEMA), with the main responsibility of setting unified semiconductor standards. In 1999, JEDEC became an independent industry association under its current name, and it continues to this day.
As an industry association, JEDEC formulated packaging standards for DRAM components and for memory modules in the late 1980s. "The standards set by JC-42 and its subcommittees are the reason we can upgrade PC memory so easily," said Mark Bird, a JEDEC volunteer since the 1970s. "We standardized the configurations of the various components and SIMMs, the sockets they plug into, and the function of each device."
Although DRAM manufacturers cannot do without JEDEC's standards, JEDEC is fundamentally non-mandatory. Its first principle is openness: all standards are open and voluntary, favoring no country or region and discriminating against none. With nearly 300 member companies, it also follows a one-company-one-vote, two-thirds-majority system, which reduces the risk of the standards process being captured by any single company or bloc.
Neither Micron nor Samsung nor SK Hynix can dictate how JEDEC standards are set. Even with so few DRAM manufacturers left, control over the standards does not rest in the hands of the three giants; only when everyone genuinely accepts a proposal does it become a formal standard.
Here lies the problem. The industry was still moving forward under JEDEC's standards, yet Micron wanted to strike out on its own and even build its own alliance. This sounds a bit like something Apple would do, as with the FireWire interface, the early Thunderbolt interface, and the Lightning interface: the products were good, but they went nowhere outside Apple's own ecosystem.
If Micron's HMC technology had been advanced enough to lead JEDEC by four or five years, that would have been fine: Micron could have turned a tidy profit the way Apple does and gone toe-to-toe with the Korean manufacturers. Unfortunately, the technology led by only a year or two, perhaps even less.

The same year Micron announced HMC, JEDEC published JESD229, the Wide I/O standard. A 3D IC memory interface standard, it was designed specifically to address the DRAM bandwidth problem. The basic idea is to use a very large number of pins, each running relatively slowly but at lower power.
In January 2012, the standard was formally approved, specifying four 128-bit channels connected at a single data rate to DRAM running at 200 MHz, for a total bandwidth of 100 Gb/s. Although that still cannot match HMC's bandwidth, it showed that JEDEC standards are not invariably stagnant and useless.
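The round-number 100 Gb/s follows directly from the channel parameters quoted above. A quick reconstruction (four 128-bit channels, 200 MHz, one transfer per clock at single data rate):

```python
# Reproduce the Wide I/O (JESD229) peak bandwidth from its channel parameters.
channels = 4
bits_per_channel = 128
clock_mhz = 200               # single data rate: one transfer per clock edge
total_gbit_s = channels * bits_per_channel * clock_mhz / 1000
print(f"Peak bandwidth: {total_gbit_s} Gb/s")  # 102.4 Gb/s, i.e. roughly 100 Gb/s
```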
Of course, had Wide I/O been the only rival, things would still have been fine: HMC's concept was advanced enough that even at a high price, there would always be some bandwidth-hungry products willing to pay for it, and the outlook remained bright.
In 2013, however, another contender emerged: AMD and Hynix announced their jointly developed HBM, which uses 128-bit wide channels, up to eight of which are stacked to form a 1024-bit interface, for a total bandwidth between 128 GB/s and 256 GB/s. The DRAM stack is four to eight dies high, and each channel's memory controller is timed and controlled independently.
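The 128-256 GB/s range follows from the interface width: a 1024-bit bus yields 128 GB/s at 1 Gbit/s per pin and 256 GB/s at 2 Gbit/s per pin. A sketch of that arithmetic (the per-pin rates are inferred from the quoted totals, not stated in the text):

```python
# Relate HBM's 1024-bit interface to the quoted 128-256 GB/s bandwidth range.
interface_bits = 8 * 128      # eight 128-bit channels -> 1024-bit interface
for pin_rate_gbps in (1.0, 2.0):          # assumed Gbit/s per pin
    gbytes_s = interface_bits * pin_rate_gbps / 8
    print(f"{pin_rate_gbps} Gb/s per pin -> {gbytes_s:.0f} GB/s")
```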
In terms of cost and bandwidth, HBM seems to be a moderate choice, neither as cheap as Wide I/O nor as high in bandwidth as HMC. However, the moderate HBM has established its position through GPUs, with both AMD and Nvidia choosing HBM as the memory for their graphics cards.
The fatal blow to Micron's HMC was that, not long after HBM launched, it was adopted as industry standard JESD235. On one side stood a large organization containing the industry's main technology companies; on the other, a small circle Micron had pulled together by itself. The contest was decided before it formally began.
The End of HMC
In April 2013, the HMC 1.0 specification was officially released. It defines full-duplex differential serial links of 16 lanes or 8 lanes (half-width), with each lane's SerDes running at 10, 12.5, or 15 Gbit/s. Each HMC package is called a cube, and up to eight cubes can be joined into a network over cube-to-cube links, with some cubes using their links as pass-throughs.
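The raw per-link bandwidth these lane counts imply can be sketched as follows (a back-of-the-envelope figure per direction for a full-width 16-lane link, before any protocol overhead):

```python
# Raw per-direction bandwidth of a full-width (16-lane) HMC 1.0 link.
lanes = 16
for lane_rate_gbps in (10, 12.5, 15):     # Gbit/s per lane, per the spec
    gbytes_per_dir = lanes * lane_rate_gbps / 8
    print(f"{lane_rate_gbps} Gb/s lanes -> {gbytes_per_dir} GB/s each way")
```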
Of course, when HMC 1.0 was released, Micron was still brimming with confidence. Robert Feurle, Micron's vice president of DRAM marketing, said, "This milestone marks the tearing down of the memory wall. This industry agreement will help drive the fastest possible adoption of HMC technology, which we believe will thoroughly improve computing systems and, ultimately, consumer applications."

At DesignCon 2014 in January 2014, Micron's chief technologist Pawlowski claimed that JEDEC had made no new effort beyond DDR4: "HMC needs only a SerDes (serializer/deserializer) interface with a simple instruction set, without all the details. The future trend is for HMC to replace DDR and become the new DRAM standard."
Was reality really as Micron described it?
Of course not. HMC's seemingly formidable bandwidth came at a steep cost. Since the first specification in 2013, the only products to truly adopt HMC technology have been the Square Kilometre Array (SKA) astronomy project, Fujitsu's PRIMEHPC FX100 supercomputer, Juniper's high-performance network routers and data-center switches, and Intel's Xeon Phi coprocessor.
And don't get too excited about seeing Intel on that list. According to Micron, although the Xeon Phi's memory solution uses the same underlying technology as HMC, it was specifically optimized for integration into Intel's Knights Landing platform, with no plans for standardization and no availability to other customers. In other words, Intel did not fully adopt HMC; it created its own variant.
Moreover, never mind ordinary consumers: even NVIDIA's and AMD's professional accelerator cards had nothing to do with HMC. HBM was already expensive enough, and HMC cost even more. Although Micron never disclosed specific figures, the price was clearly a burden most manufacturers could not bear. Memory bandwidth matters, but an excessive cost only drives customers away.
It is worth noting that although Samsung and Hynix also joined the HMCC, they were never its main promoters and never mass-produced HMC products at scale; after 2016, both companies focused on HBM. Apart from a few close partners willing to back Micron, most HMCC members were there merely to participate.
By 2018, HMC had long since lost the glory of 2011; to say nobody came knocking would be no exaggeration. Artificial intelligence began its rise that year, making high bandwidth the memory industry's focus, but the market behind it had been all but swept up by HBM, and Hynix and Samsung, the standard's main promoters, became the big winners.
Jim Handy, chief analyst at Objective Analysis, warned Micron in a January 2018 media interview: "Intel, too, will move from its HMC variant to HBM in the future. Given that the two are not very different, if Micron has to pivot, the loss will not be too great."
Fortunately, Micron did not stubbornly stay the course. In August 2018 it announced that it was officially abandoning HMC to pursue competitive high-performance memory technology, namely HBM. By then, however, everyone else was already preparing for HBM2E, and Micron could only chase slowly from behind, whether to win the meat or merely the soup.
In March 2020, Micron's HBM2, the second generation of HBM, arrived late, and its latest mass-produced HBM stopped at HBM2E, clearly trailing the two Korean firms. The market faithfully reflected the gap: according to the latest TrendForce data, SK Hynix holds 50% of the global HBM market, ranking first; Samsung follows closely with 40%; and Micron ranks third with only 10%.

Interestingly, Micron seems not to have given up on HMC entirely.
In March 2020, Steve Pawlowski, Micron's vice president of advanced computing solutions, noted that Micron had been one of the earliest and strongest backers of HMC technology. The current focus, he said, is on how the architecture can meet the high-bandwidth memory requirements of specific use cases, including artificial intelligence (AI); in fact, AI did not exist when HMC was first conceived. "How do we get the best cost-effectiveness in low power and high bandwidth while giving our customers more cost-effective packaging solutions?" he said.
Pawlowski added that Micron continues to explore HMC's potential through a "pathfinding program" rather than following the original specification roadmap. From a performance standpoint HMC is an excellent solution, but customers also want larger capacity. Emerging AI workloads, however, care more about bandwidth, and that is where the HMC architecture's potential lies.
"HMC still seems to have vitality, and its architecture may be suitable for applications that did not exist when it was first conceived," Pawlowski said. "HMC is an excellent example of a technology that is ahead of its time, and it needs to build an ecosystem to be widely adopted. My intuition is that HMC-style architecture belongs to this camp."
Micron, Left Far Behind
Now, at the start of 2024, HBM has been hot for a full year. SK Hynix, Samsung, and Micron are all targeting the next generation, HBM3E and even HBM4, each striving to keep its technology in the lead, Micron especially. To improve its passive position in the HBM market, Micron chose to skip the fourth generation, HBM3, and jump straight to the fifth.
In September 2023, Micron announced the launch of HBM3 Gen2 (i.e., HBM3E), and later stated that it plans to start mass delivery of HBM3 Gen2 memory in early 2024. At the same time, it revealed that Nvidia is one of the main customers. Micron's President and CEO, Sanjay Mehrotra, also said in the company's financial report conference call: "The launch of our HBM3 Gen2 product series has aroused strong interest and enthusiasm from customers."
But for Micron, catching up technically is only the first step; what matters more is having a say in the standards. In January 2022, JEDEC released the latest HBM3 standard, and its main contributor was Micron's old rival and one of HBM's creators: SK Hynix. The now widely used name HBM3E also comes from SK Hynix.
What does being a standard contributor buy you? It means the HBM3E that SK Hynix launches can boldly claim backward compatibility: without any design or structural changes, the product drops into devices already built for HBM3. Whether Nvidia or AMD, customers can easily upgrade existing products to meet the needs of more customers.
According to Business Korea, Nvidia has signed a priority supply agreement with SK Hynix for HBM3E, to be used in the new-generation B100 compute cards. Micron and Samsung have both provided HBM3E samples to Nvidia and will sign formal contracts once verification testing is complete, but some industry insiders predict SK Hynix will still be the first to win an HBM3E supply contract, and the largest share of supply.

Earlier we discussed how the memory giants have always dreamed of one thing: breaking free of the traditional semiconductor cycle and living a steadier life. HMC was once that dream for Micron, replacing the old standard with a new one and the open ecosystem with a closed one, in the hope of leading DRAM technology. Instead it fell into a vicious circle: HMC was more expensive, so customers hesitated; low volumes kept costs high and prices higher, driving away still more potential customers.
At present, HBM is a better entry point. It has achieved a delicate balance between the market and profits of new DRAM. SK Hynix is the company that has gone the farthest among the three giants. Considering that the performance of future AI chips is greatly influenced by the placement and packaging methods of HBM, SK Hynix is very likely to become the first manufacturer to break out of the cycle.
Micron's chief technologist Pawlowski sharply criticized the outdated memory standards at Hot Chips 2011, but he could never have imagined that the seemingly advanced HMC would ultimately be defeated by HBM, a JEDEC standard. Micron wasted six or seven years, and in the end the sweet fruit was picked by the Korean firms, a rather poignant outcome.