nVidia "AI-Blackwell": Chips, Products, Naming, Hardware

Posted by Voodoo2-SLi@reddit | hardware

The hardware data for Blackwell-based AI products is often reported very inaccurately or even incorrectly, as nVidia does not make a clear distinction between chip and product and sometimes provides contradictory information. The following tables are intended to clarify (to the best of our knowledge) what the hardware of the actual Blackwell chips and the AI products based on them looks like.

 

| |Class|Naming|Hardware|max. TDP|Notes|
|:--|:--:|:--:|:--:|:--:|:--:|
|GB102|chip|-|4 GPC, 80 SM, 4096-bit HBM3e, PCIe 6.0|-|104 billion transistors at ~800mm² die-size on TSMC's 4nm manufacturing|
|GB100|dual-chip (2x GB102)|"Blackwell"|8 GPC, 160 SM, 8192-bit HBM3e, PCIe 6.0, ≤192 GB, die-to-die interconnect|1200W|2x 104 billion transistors at 2x ~800mm² die-size on TSMC's 4nm manufacturing|
|GB100U|chip variant (2x GB102)|"Blackwell Ultra"|8 GPC, 160 SM, 8192-bit HBM3e, PCIe 6.0, ≤288 GB, die-to-die interconnect|1400W|2x 104 billion transistors at 2x ~800mm² die-size on TSMC's 4nm manufacturing|
|B100|product (1x GB100)|-|unknown|700W|SXM module for "HGX" platform|
|B200|product (1x GB100)|-|unknown|1000W|SXM module for "HGX" platform|
|B300|product (1x GB100U)|-|unknown|1200W|SXM module for "HGX" platform|
|GB200|product (2x GB100)|"Blackwell Superchip"|16 GPC, 288 SM, 2x 8192-bit HBM3e, PCIe 5.0, 2x 192 GB|2700W|nVidia's own server module with 2x GB100, 1x "Grace" CPU & NVLink Switch|
|GB300|product (2x GB100U)|"Blackwell Ultra Superchip"|16 GPC, 320 SM, 2x 8192-bit HBM3e, PCIe 6.0, 2x 288 GB|>3000W|nVidia's own server module with 2x GB100U, 1x "Grace" CPU & NVLink Switch|
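
To make the chip/product distinction explicit, here is a minimal Python sketch that encodes the table above and derives how many GB102 dies each product actually contains. The data structures and names are purely our own illustration; all figures are taken from the table.

```python
# Illustrative sketch of the chip -> product hierarchy from the table above.
# All figures come from the table; the structure itself is our own invention.

GB102_DIES_PER_PACKAGE = 2  # GB100 / GB100U are dual-chip packages of 2x GB102

# product name -> (underlying package, number of packages, max. TDP in watts, form factor)
PRODUCTS = {
    "B100":  ("GB100",  1,  700, 'SXM module for "HGX"'),
    "B200":  ("GB100",  1, 1000, 'SXM module for "HGX"'),
    "B300":  ("GB100U", 1, 1200, 'SXM module for "HGX"'),
    "GB200": ("GB100",  2, 2700, 'server module with "Grace" CPU & NVLink Switch'),
    "GB300": ("GB100U", 2, 3000, 'server module with "Grace" CPU & NVLink Switch'),  # actually >3000W
}

for name, (package, count, tdp, form) in PRODUCTS.items():
    dies = count * GB102_DIES_PER_PACKAGE
    print(f"{name}: {count}x {package} = {dies}x GB102 dies, ~{tdp}W, {form}")
```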

 

"GB100U" is a self-invented, completely unofficial code name used purely to distinguish the Ultra variant of the GB100 chip. Technically speaking, this is not correct, because nVidia has not released any new chips for Blackwell Ultra, only new products consisting of existing chips.

Uncertain points:
- GB100 has never been officially confirmed, but there is at least a clear indication that this code name exists.
- GB102 exists as a code name only in the rumor mill; so far, there has been no mention of it by nVidia.
- Whether 160 SM is really the maximum hardware for GB100 is currently known only to nVidia.

 

| |Chip| |Dual-chip| |Product|
|:--|:--:|:--:|:--:|:--:|:--:|
|Blackwell|GB102, 80 SM, 4096-bit HBM3e, 104 billion transistors, ~800mm² die-size| |GB100 (2x GB102), 160 SM, 8192-bit HBM3e, 208 billion transistors, ~1600mm² die-size| |GB200 (4x GB102), 288 SM, 16384-bit HBM3e, 416 billion transistors, ~3200mm² die-size (+ "Grace" CPU & NVLink Switch)|
|Blackwell Ultra|GB102, 80 SM, 4096-bit HBM3e, 104 billion transistors, ~800mm² die-size| |GB100U (2x GB102), 160 SM, 8192-bit HBM3e, 208 billion transistors, ~1600mm² die-size| |GB300 (4x GB102), 320 SM, 16384-bit HBM3e, 416 billion transistors, ~3200mm² die-size (+ "Grace" CPU & NVLink Switch)|
|nVidia naming|-| |"one GPU"| |"Superchip"|
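
As a sanity check of the scaling in the table above, here is a short, purely illustrative calculation starting from the GB102 base die. The only figure that does not scale linearly is GB200's SM count, which sits below the theoretical 4x 80 = 320 SM.

```python
# Illustrative arithmetic check of the table above (GB102 base-die figures per the table).
GB102 = {"SM": 80, "bus_bits": 4096, "transistors_bn": 104, "die_mm2": 800}

def scale(dies: int) -> dict:
    """Naively scale the GB102 figures by the number of dies."""
    return {k: v * dies for k, v in GB102.items()}

print("GB100 (2x GB102):", scale(2))   # 160 SM, 8192-bit, 208 bn transistors, ~1600 mm²
print("GB300 (4x GB102):", scale(4))   # 320 SM, 16384-bit, 416 bn transistors, ~3200 mm²

# GB200 is the exception: the table lists 288 SM, i.e. below the theoretical
# maximum of 4 x 80 = 320 SM, so apparently not all SMs are enabled there.
print("GB200 SM deficit:", scale(4)["SM"] - 288)  # -> 32
```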

 

Unfortunately, nVidia itself sometimes only provides data for a single GPU even when it is nominally referring to the GB200/GB300 "superchips". For example, the "Blackwell Architecture Technical Brief" (PDF) lists 15/20 petaFLOPS of FP4 compute and 8 TB/s of memory bandwidth for "GB300". According to nVidia's own blog, however, these are clearly the specifications of a single GB100 GPU. The (correct) figures for GB300 with two GB100 GPUs are also noted there: 30/40 petaFLOPS of FP4 compute. Wherever only "15/20 petaFLOPS" is listed for GB300, that figure has been copied, incorrectly, from nVidia's own PDF.
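
A quick, purely illustrative check of the numbers makes the mix-up obvious: the Technical Brief's figure is exactly half of the superchip figure, i.e. the per-GPU value.

```python
# Per-GPU FP4 figures as quoted from nVidia's blog (petaFLOPS), both quoted values.
per_gpu_fp4 = (15, 20)
gpus_per_gb300 = 2  # the GB300 superchip carries two Blackwell Ultra GPUs

superchip_fp4 = tuple(v * gpus_per_gb300 for v in per_gpu_fp4)
print(superchip_fp4)  # (30, 40) -> matches the 30/40 petaFLOPS given for GB300

# The Technical Brief lists 15/20 petaFLOPS for "GB300", which is the
# per-GPU value, not the superchip value.
```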

 

Source: 3DCenter.org