Template:NvidiaDgxAccelerators
Appearance
Comparison of accelerators used in DGX:[1][2][3]
Model | Architecture | Socket | FP32 CUDA cores |
FP64 cores (excl. tensor) |
Mixed INT32/FP32 cores |
INT32 cores |
Boost clock |
Memory clock |
Memory bus width |
Memory bandwidth |
VRAM | Single precision (FP32) |
Double precision (FP64) |
INT8 (non-tensor) |
INT8 dense tensor |
INT32 | FP4 dense tensor |
FP16 | FP16 dense tensor |
bfloat16 dense tensor |
TensorFloat-32 (TF32) dense tensor |
FP64 dense tensor |
Interconnect (NVLink) |
GPU | L1 Cache | L2 Cache | TDP | Die size | Transistor count |
Process | Launched |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
B200 | Blackwell | SXM6 | N/A | N/A | N/A | N/A | N/A | 8 Gbit/s HBM3e | 8192-bit | 8 TB/sec | 192 GB HBM3e | N/A | N/A | N/A | 4.5 POPS | N/A | 9 PFLOPS | N/A | 2.25 PFLOPS | 2.25 PFLOPS | 1.2 PFLOPS | 40 TFLOPS | 1.8 TB/sec | GB100 | N/A | N/A | 1000 W | N/A | 208 B | TSMC 4NP | Q4 2024 (expected) |
B100 | Blackwell | SXM6 | N/A | N/A | N/A | N/A | N/A | 8 Gbit/s HBM3e | 8192-bit | 8 TB/sec | 192 GB HBM3e | N/A | N/A | N/A | 3.5 POPS | N/A | 7 PFLOPS | N/A | 1.98 PFLOPS | 1.98 PFLOPS | 989 TFLOPS | 30 TFLOPS | 1.8 TB/sec | GB100 | N/A | N/A | 700 W | N/A | 208 B | TSMC 4NP | |
H200 | Hopper | SXM5 | 16896 | 4608 | 16896 | N/A | 1980 MHz | 6.3 Gbit/s HBM3e | 6144-bit | 4.8 TB/sec | 141 GB HBM3e | 67 TFLOPS | 34 TFLOPS | N/A | 1.98 POPS | N/A | N/A | N/A | 990 TFLOPS | 990 TFLOPS | 495 TFLOPS | 67 TFLOPS | 900 GB/sec | GH100 | 25344 KB (192 KB × 132) | 51200 KB | 1000 W | 814 mm2 | 80 B | TSMC 4N | Q3 2023 |
H100 | Hopper | SXM5 | 16896 | 4608 | 16896 | N/A | 1980 MHz | 5.2 Gbit/s HBM3 | 5120-bit | 3.35 TB/sec | 80 GB HBM3 | 67 TFLOPS | 34 TFLOPS | N/A | 1.98 POPS | N/A | N/A | N/A | 990 TFLOPS | 990 TFLOPS | 495 TFLOPS | 67 TFLOPS | 900 GB/sec | GH100 | 25344 KB (192 KB × 132) | 51200 KB | 700 W | 814 mm2 | 80 B | TSMC 4N | Q3 2022 |
A100 80GB | Ampere | SXM4 | 6912 | 3456 | 6912 | N/A | 1410 MHz | 3.2 Gbit/s HBM2e | 5120-bit | 1.52 TB/sec | 80 GB HBM2e | 19.5 TFLOPS | 9.7 TFLOPS | N/A | 624 TOPS | 19.5 TOPS | N/A | 78 TFLOPS | 312 TFLOPS | 312 TFLOPS | 156 TFLOPS | 19.5 TFLOPS | 600 GB/sec | GA100 | 20736 KB (192 KB × 108) | 40960 KB | 400 W | 826 mm2 | 54.2 B | TSMC N7 | Q1 2020 |
A100 40GB | Ampere | SXM4 | 6912 | 3456 | 6912 | N/A | 1410 MHz | 2.4 Gbit/s HBM2 | 5120-bit | 1.52 TB/sec | 40 GB HBM2 | 19.5 TFLOPS | 9.7 TFLOPS | N/A | 624 TOPS | 19.5 TOPS | N/A | 78 TFLOPS | 312 TFLOPS | 312 TFLOPS | 156 TFLOPS | 19.5 TFLOPS | 600 GB/sec | GA100 | 20736 KB (192 KB × 108) | 40960 KB | 400 W | 826 mm2 | 54.2 B | TSMC N7 | |
V100 32GB | Volta | SXM3 | 5120 | 2560 | N/A | 5120 | 1530 MHz | 1.75 Gbit/s HBM2 | 4096-bit | 900 GB/sec | 32 GB HBM2 | 15.7 TFLOPS | 7.8 TFLOPS | 62 TOPS | N/A | 15.7 TOPS | N/A | 31.4 TFLOPS | 125 TFLOPS | N/A | N/A | N/A | 300 GB/sec | GV100 | 10240 KB (128 KB × 80) | 6144 KB | 350 W | 815 mm2 | 21.1 B | TSMC 12FFN | Q3 2017 |
V100 16GB | Volta | SXM2 | 5120 | 2560 | N/A | 5120 | 1530 MHz | 1.75 Gbit/s HBM2 | 4096-bit | 900 GB/sec | 16 GB HBM2 | 15.7 TFLOPS | 7.8 TFLOPS | 62 TOPS | N/A | 15.7 TOPS | N/A | 31.4 TFLOPS | 125 TFLOPS | N/A | N/A | N/A | 300 GB/sec | GV100 | 10240 KB (128 KB × 80) | 6144 KB | 300 W | 815 mm2 | 21.1 B | TSMC 12FFN | |
P100 | Pascal | SXM/SXM2 | N/A | 1792 | 3584 | N/A | 1480 MHz | 1.4 Gbit/s HBM2 | 4096-bit | 720 GB/sec | 16 GB HBM2 | 10.6 TFLOPS | 5.3 TFLOPS | N/A | N/A | N/A | N/A | 21.2 TFLOPS | N/A | N/A | N/A | N/A | 160 GB/sec | GP100 | 1344 KB (24 KB × 56) | 4096 KB | 300 W | 610 mm2 | 15.3 B | TSMC 16FF+ | Q2 2016 |
Template documentation
This template's documentation is missing, inadequate, or does not accurately describe its functionality or the parameters in its code. Please help to expand and improve it. |
- ^ Smith, Ryan (March 22, 2022). "NVIDIA Hopper GPU Architecture and H100 Accelerator Announced: Working Smarter and Harder". AnandTech.
- ^ Smith, Ryan (May 14, 2020). "NVIDIA Ampere Unleashed: NVIDIA Announces New GPU Architecture, A100 GPU, and Accelerator". AnandTech.
- ^ "NVIDIA Tesla V100 tested: near unbelievable GPU power". TweakTown. September 17, 2017.