NVIDIA

NVIDIA B200 180GB

Name: NVIDIA B200 180GB
Brand: NVIDIA
Price: 30000 USD

Blackwell DatacenterDatacenterBlackwellNVLINKCUDA

180GB

VRAM

8kGB/s

Bandwidth

2.3kTFLOPS

FP16 Compute

4.5kTOPS

INT8 Inference

$30,000 MSRP

NVIDIA B200 180GBCategory AvgAMD Instinct MI325X 256GB

Specifications

Compute

FP162250 TFLOPS

INT84500 TOPS

ArchitectureBlackwell

Memory

VRAM180 GB

Bandwidth8000 GB/s

General

FamilyBlackwell Datacenter

SegmentDatacenter

InterconnectNVLINK

Compute PlatformCUDA

MSRP$30,000

Architecture

Blackwell

Blackwell is NVIDIA's fifth-generation RTX architecture, built on TSMC's 4NP process. It introduces 5th-generation Tensor Cores with native FP4 precision support, enabling double the inference throughput per watt compared to Ada Lovelace's FP8 operations. Key innovations include the Neural Rendering Pipeline for AI-driven shading and the debut of GDDR7 memory in consumer GPUs.

AI Relevance

FP4 Tensor Cores deliver the highest tokens-per-watt efficiency in any consumer architecture. Native FP4 quantization means models can run at lower precision with minimal quality loss, effectively doubling the effective VRAM for model weights.

Process: TSMC 4NPPlatform: CUDATensor Cores: Gen 5Precisions: FP32, FP16, BF16, FP8, FP4, INT8, INT4

Recommendations by Workload

Agentic Coding

Devstral 2 123B Instruct

This model is still usable for agentic-coding, but it is not the most specialized pick. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, lm-studio.

Decode 89.6 tok/s · 44K ctx · llama.cpp

132.4 GB / 180.0 GB VRAM

Chat

Mistral Small 4 119B

This model is a direct match for chat. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, lm-studio.

Decode 269.3 tok/s · 16K ctx · llama.cpp

92.3 GB / 180.0 GB VRAM

Coding

Devstral 2 123B Instruct

This model is a direct match for coding. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, lm-studio.

Decode 89.6 tok/s · 25K ctx · llama.cpp

113.1 GB / 180.0 GB VRAM

RAG

Command A 111B

This model is a direct match for rag. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, ollama, lm-studio.

Decode 99.2 tok/s · 47K ctx · llama.cpp

121.3 GB / 180.0 GB VRAM

Reasoning

Devstral 2 123B Instruct

This model is a direct match for reasoning. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, lm-studio.

Decode 89.6 tok/s · 25K ctx · llama.cpp

113.1 GB / 180.0 GB VRAM

Full Model Compatibility

NVIDIA B200 180GB

Specifications

Blackwell

Recommendations by Workload

Full Model Compatibility

Models you could run with an upgrade

Upgrade from NVIDIA B200 180GB

Upgrade options

NVIDIA B200 180GB

Specifications

Blackwell

Recommendations by Workload

Full Model Compatibility

Models you could run with an upgrade

Upgrade from NVIDIA B200 180GB

Upgrade options