NVIDIA H200 NVL Graphic Card
900-21010-0040-000

NVIDIA H200 GPU
Supercharging AI and HPC workloads.

The GPU for Generative AI and HPC
The NVIDIA H200 GPU supercharges generative AI and high-performance computing (HPC) workloads with game-changing performance and memory capabilities. As the first GPU with HBM3E, the H200’s larger and faster memory fuels the acceleration of generative AI and large language models (LLMs) while advancing scientific computing for HPC workloads.
Experience Next-Level Performance
1.9X Faster Llama2 70B Inference
1.6X Faster GPT-3 175B Inference
110X Faster High-Performance Computing
Higher Performance With Larger, Faster Memory
Based on the NVIDIA Hopper™ architecture, the NVIDIA H200 is the first GPU to offer 141 gigabytes (GB) of HBM3e memory at 4.8 terabytes per second (TB/s). That's nearly double the capacity of the NVIDIA H100 GPU with 1.4X more memory bandwidth. The H200's larger and faster memory accelerates generative AI and LLMs, while advancing scientific computing for HPC workloads with better energy efficiency and lower total cost of ownership.
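As a quick sanity check on those multipliers, the ratios follow directly from the published figures. This is an illustrative back-of-envelope calculation; the H100 SXM baseline of 80 GB and 3.35 TB/s is taken from NVIDIA's H100 datasheet, not from this page:

```python
# Back-of-envelope check of the H200-vs-H100 memory claims.
h100 = {"capacity_gb": 80, "bandwidth_tbps": 3.35}   # H100 SXM datasheet figures
h200 = {"capacity_gb": 141, "bandwidth_tbps": 4.8}   # H200 figures from this page

capacity_ratio = h200["capacity_gb"] / h100["capacity_gb"]          # "nearly double"
bandwidth_ratio = h200["bandwidth_tbps"] / h100["bandwidth_tbps"]   # "1.4X more"

print(f"Capacity:  {capacity_ratio:.2f}x")   # 1.76x
print(f"Bandwidth: {bandwidth_ratio:.2f}x")  # 1.43x
```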
Unlock Insights With High-Performance LLM Inference
In the ever-evolving landscape of AI, businesses rely on LLMs to address a diverse range of inference needs. An AI inference accelerator must deliver the highest throughput at the lowest TCO when deployed at scale for a massive user base.
The H200 boosts inference speed by up to 2X compared to H100 GPUs when handling LLMs like Llama2.
Explore NVIDIA’s AI Inference Platform >
Preliminary specifications. May be subject to change.
Llama2 13B: ISL 128, OSL 2K | Throughput | H100 SXM 1x GPU BS 64 | H200 SXM 1x GPU BS 128
GPT-3 175B: ISL 80, OSL 200 | x8 H100 SXM GPUs BS 64 | x8 H200 SXM GPUs BS 128
Llama2 70B: ISL 2K, OSL 128 | Throughput | H100 SXM 1x GPU BS 8 | H200 SXM 1x GPU BS 32.
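Those batch sizes (BS 8 on H100 vs. BS 32 on H200) hint at where much of the speedup comes from: once the model weights are resident, the remaining HBM holds each request's attention KV cache, so more free memory means more concurrent sequences. The following is a rough, illustrative estimate only. The layer counts and head sizes are Llama2 70B's published architecture; the FP8-weight assumption and memory budgets are this sketch's assumptions, and real runtimes reserve additional memory for activations and workspace, so achievable batch sizes are lower than this headroom suggests:

```python
# Rough, illustrative estimate of why more HBM allows larger inference batches.
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128   # Llama2 70B (grouped-query attention)
BYTES_FP16 = 2
seq_len = 2048 + 128                      # ISL 2K + OSL 128, as in the footnote above

# K and V caches, per sequence, across all layers (FP16):
kv_bytes_per_seq = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_FP16 * seq_len
print(f"KV cache per sequence: {kv_bytes_per_seq / 2**30:.2f} GiB")  # ~0.66 GiB

weights_gb = 70                           # assume FP8 weights, ~1 byte per parameter
for name, hbm_gb in [("H100 (80 GB)", 80), ("H200 (141 GB)", 141)]:
    free = (hbm_gb - weights_gb) * 2**30  # ignores activations and runtime overhead
    print(f"{name}: room for ~{int(free // kv_bytes_per_seq)} sequences of KV cache")
```

The point is the relative headroom, not the exact batch sizes NVIDIA used: the H200 leaves several times more memory free for KV caches after the weights are loaded.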
Supercharge High-Performance Computing
Memory bandwidth is crucial for HPC applications, as it enables faster data transfer and reduces complex processing bottlenecks. For memory-intensive HPC applications like simulations, scientific research, and artificial intelligence, the H200's higher memory bandwidth ensures that data can be accessed and manipulated efficiently, delivering up to 110X faster time to results compared to CPUs.
Learn More About High-Performance Computing >
Preliminary specifications. May be subject to change.
HPC MILC- dataset NERSC Apex Medium | HGX H200 4-GPU | dual Sapphire Rapids 8480
HPC Apps- CP2K: dataset H2O-32-RI-dRPA-96points | GROMACS: dataset STMV | ICON: dataset r2b5 | MILC: dataset NERSC Apex Medium | Chroma: dataset HMC Medium | Quantum Espresso: dataset AUSURF112 | 1x H100 SXM | 1x H200 SXM.
Reduce Energy and TCO
With the introduction of the H200, energy efficiency and TCO reach new levels. This cutting-edge technology offers unparalleled performance within the same power profile as the H100. AI factories and supercomputing systems that are not only faster but also more eco-friendly deliver an economic edge that propels the AI and scientific communities forward.
Learn More About Sustainable Computing >
Preliminary specifications. May be subject to change.
Llama2 70B: ISL 2K, OSL 128 | Throughput | H100 SXM 1x GPU BS 8 | H200 SXM 1x GPU BS 32
Accelerating AI for Mainstream Enterprise Servers With H200 NVL
NVIDIA H200 NVL is ideal for lower-power, air-cooled enterprise rack designs that require flexible configurations, delivering acceleration for every AI and HPC workload regardless of size. With up to four GPUs connected by NVIDIA NVLink™ and a 1.5x memory increase, large language model (LLM) inference can be accelerated up to 1.7x, and HPC applications achieve up to 1.3x more performance over the H100 NVL.

Enterprise-Ready: AI Software Streamlines Development and Deployment
NVIDIA H200 NVL comes with a five-year NVIDIA AI Enterprise subscription, which simplifies the way you build an enterprise AI-ready platform. The H200 accelerates AI development and deployment for production-ready generative AI solutions, including computer vision, speech AI, retrieval-augmented generation (RAG), and more. NVIDIA AI Enterprise includes NVIDIA NIM™, a set of easy-to-use microservices designed to speed up enterprise generative AI deployment. Together, these deployments have enterprise-grade security, manageability, stability, and support, resulting in performance-optimized AI solutions that deliver faster business value and actionable insights.
Activate Your NVIDIA AI Enterprise License >
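NIM microservices expose an OpenAI-compatible REST API, so existing client code can point at a deployment with little change. Below is a minimal sketch assuming a locally hosted NIM endpoint; the URL and model name are placeholders for whatever your deployment actually serves:

```python
# Minimal sketch of calling a self-hosted NIM LLM microservice.
# Host, port, and model name are placeholders, not fixed NIM values.
import requests

NIM_URL = "http://localhost:8000/v1/chat/completions"  # assumed local deployment
payload = {
    "model": "meta/llama3-8b-instruct",   # placeholder model name
    "messages": [{"role": "user", "content": "Summarize our Q3 support tickets."}],
    "max_tokens": 256,
}
resp = requests.post(NIM_URL, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```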

Graphics memory
Memory size: 141 GB | Memory type: HBM3e
Graphics memory (VRAM) is dedicated memory located on the graphics card itself, built from chips with higher throughput and multiple data transfers per clock than standard system RAM. The result is much faster buffering of the data the GPU computes and exchanges with the processor. While most consumer cards use GDDR memory, the H200 uses stacked HBM3e, which delivers far higher bandwidth.
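To make the throughput idea concrete, here is a minimal sketch of measuring effective device-memory bandwidth with a large on-GPU copy, assuming PyTorch with CUDA. A purpose-built benchmark (e.g., a STREAM variant) would do this more carefully, and results will land well below the 4.8 TB/s peak:

```python
# Rough device-memory bandwidth probe (sketch; assumes PyTorch with CUDA).
# A device-to-device copy reads and writes each byte once, so effective
# bandwidth is ~2 * bytes / elapsed time.
import torch

assert torch.cuda.is_available()
n = 1 << 30                                   # 1 GiB of uint8
src = torch.empty(n, dtype=torch.uint8, device="cuda")
dst = torch.empty_like(src)

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
dst.copy_(src)                                # warm-up
start.record()
for _ in range(10):
    dst.copy_(src)
end.record()
torch.cuda.synchronize()
seconds = start.elapsed_time(end) / 1e3       # elapsed_time returns milliseconds
print(f"~{2 * n * 10 / seconds / 1e12:.2f} TB/s effective")
```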
Specifications
NVIDIA H200 GPU
| Specification | H200 SXM | H200 NVL |
|---|---|---|
| FP64 Performance | 34 TFLOPS | 30 TFLOPS |
| FP64 Tensor Core | 67 TFLOPS | 60 TFLOPS |
| FP32 Performance | 67 TFLOPS | 60 TFLOPS |
| TF32 Tensor Core | 989 TFLOPS | 835 TFLOPS |
| BFLOAT16 Tensor Core | 1,979 TFLOPS | 1,671 TFLOPS |
| FP16 Tensor Core | 1,979 TFLOPS | 1,671 TFLOPS |
| FP8 Tensor Core | 3,958 TFLOPS | 3,341 TFLOPS |
| INT8 Tensor Core | 3,958 TOPS | 3,341 TOPS |
| GPU Memory | 141 GB | 141 GB |
| GPU Memory Bandwidth | 4.8 TB/s | 4.8 TB/s |
| Decoders | 7 × NVDEC, 7 × JPEG | 7 × NVDEC, 7 × JPEG |
| Confidential Computing | Supported | Supported |
| Max TDP | Up to 700 W (configurable) | Up to 600 W (configurable) |
| Multi-Instance GPU (MIG) | Up to 7 MIGs @ 18 GB each | Up to 7 MIGs @ 16.5 GB each |
| Form Factor | SXM | PCIe, dual-slot air-cooled |
| Interconnect | NVIDIA NVLink™: 900 GB/s; PCIe Gen5: 128 GB/s | 2- or 4-way NVLink bridge: 900 GB/s per GPU; PCIe Gen5: 128 GB/s |
| Server Options | NVIDIA HGX™ H200 partner systems and NVIDIA-Certified™ systems with 4 or 8 GPUs | NVIDIA MGX™ H200 NVL partner systems and NVIDIA-Certified™ systems with up to 8 GPUs |
| NVIDIA AI Enterprise | Add-on | Included |
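The Multi-Instance GPU row above means a single H200 can be partitioned into up to seven fully isolated instances, each with its own memory slice. As a hedged illustration, the instances can be enumerated with the nvidia-ml-py bindings, assuming MIG mode is enabled and instances were already created (for example via nvidia-smi):

```python
# List MIG instances on GPU 0 and their memory (sketch; requires nvidia-ml-py,
# MIG mode enabled, and instances already created on the device).
import pynvml

pynvml.nvmlInit()
gpu = pynvml.nvmlDeviceGetHandleByIndex(0)
count = pynvml.nvmlDeviceGetMaxMigDeviceCount(gpu)   # up to 7 on H200
for i in range(count):
    try:
        mig = pynvml.nvmlDeviceGetMigDeviceHandleByIndex(gpu, i)
    except pynvml.NVMLError:
        continue                                     # this slot has no instance
    mem = pynvml.nvmlDeviceGetMemoryInfo(mig)
    print(f"MIG {i}: {mem.total / 2**30:.1f} GiB total")
pynvml.nvmlShutdown()
```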
Resources
Datasheet >
H200 NVL Product Brief >
Data Center Product Performance >



