NVIDIA

H100 NVL 188GB

Name: H100 NVL 188GB
Brand: NVIDIA

Data CenterHopperNVLINKCUDA

188GB

VRAM

7.8kGB/s

Bandwidth

2kTFLOPS

FP16 Compute

4kTOPS

INT8 Inference

H100 NVL 188GBCategory AvgAMD Instinct MI325X 256GB

Specifications

Compute

FP161979 TFLOPS

INT83958 TOPS

ArchitectureHopper

Memory

VRAM188 GB

Bandwidth7800 GB/s

General

FamilyData Center

SegmentData Center

InterconnectNVLINK

Compute PlatformCUDA

Architecture

Hopper

Hopper is NVIDIA's datacenter-focused architecture succeeding Ampere. Built on TSMC 4N, it introduces the Transformer Engine with automatic FP8/FP16 mixed-precision training, HBM3/HBM3e memory, and NVLink 4.0 for multi-GPU scaling. The H100 flagship delivers up to 3x the AI training performance of A100.

AI Relevance

The Transformer Engine automatically manages FP8 precision for optimal training speed without accuracy loss. With up to 141 GB HBM3e (H200), Hopper GPUs can hold the largest open-weight models entirely in GPU memory, making them the workhorse of AI datacenters.

Process: TSMC 4NPlatform: CUDATensor Cores: Gen 4Precisions: FP64, FP32, TF32, FP16, BF16, FP8, INT8

Recommendations by Workload

Agentic Coding

Devstral 2 123B Instruct

This model is still usable for agentic-coding, but it is not the most specialized pick. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, lm-studio.

Decode 84.2 tok/s · 45K ctx · llama.cpp

133.2 GB / 188.0 GB VRAM

Chat

Mistral Small 4 119B

This model is a direct match for chat. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, lm-studio.

Decode 253.2 tok/s · 16K ctx · llama.cpp

93.1 GB / 188.0 GB VRAM

Coding

Devstral 2 123B Instruct

This model is a direct match for coding. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, lm-studio.

Decode 84.2 tok/s · 26K ctx · llama.cpp

113.9 GB / 188.0 GB VRAM

RAG

Command A 111B

This model is a direct match for rag. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, ollama, lm-studio.

Decode 93.3 tok/s · 49K ctx · llama.cpp

122.1 GB / 188.0 GB VRAM

Reasoning

Devstral 2 123B Instruct

This model is a direct match for reasoning. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, lm-studio.

Decode 84.2 tok/s · 26K ctx · llama.cpp

113.9 GB / 188.0 GB VRAM

Full Model Compatibility

H100 NVL 188GB

Specifications

Hopper

Recommendations by Workload

Full Model Compatibility

Models you could run with an upgrade

Upgrade from H100 NVL 188GB

Upgrade options

H100 NVL 188GB

Specifications

Hopper

Recommendations by Workload

Full Model Compatibility

Models you could run with an upgrade

Upgrade from H100 NVL 188GB

Upgrade options