Calculator Models Hardware Compare

Product

Calculator
Compare
Tier List

Browse

Models
Hardware
Docs

About

Why It Works
What's New
Legal Notice
Privacy Policy

All estimates are approximations based on mathematical models and public specifications. Actual performance may vary. Do not make purchasing decisions based solely on these estimates.

Data sourced from Hugging Face, Ollama, and official model documentation. Model names and logos are trademarks of their respective owners.

© 2026 Will It Run AI — Fase Consulting Ibiza, S.L. (NIF: B57969656)

Home/Hardware/GPUs/AMD Instinct MI350X 288GB

AMD

AMD Instinct MI350X 288GB

InstinctDatacenterCDNA 4OAMROCm

288GB

VRAM

8kGB/s

Bandwidth

2.3kTFLOPS

FP16 Compute

4.6kTOPS

INT8 Inference

$8,000 MSRP

AMD Instinct MI350X 288GBCategory Avg

Specifications

Compute

FP162300 TFLOPS

INT84600 TOPS

ArchitectureCDNA 4

Memory

VRAM288 GB

Bandwidth8000 GB/s

General

FamilyInstinct

SegmentDatacenter

InterconnectOAM

Compute PlatformROCM

MSRP$8,000

Architecture

CDNA 4

CDNA 4 powers the next-generation Instinct MI325X and MI350X accelerators. Built on TSMC 3nm with up to 288 GB HBM3e memory and native FP4 support for maximum inference density.

AI Relevance

With up to 288 GB HBM3e and FP4 support, CDNA 4 targets the highest-density AI inference deployments. Directly competes with NVIDIA Blackwell B200 for large-scale model serving.

Process: TSMC 3nmPlatform: ROCMPrecisions: FP64, FP32, TF32, FP16, BF16, FP8, FP4, INT8

Recommendations by Workload

Agentic Coding

C

Devstral 2 123B Instruct

This model is still usable for agentic-coding, but it is not the most specialized pick. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, lm-studio.

Decode 77.8 tok/s · 64K ctx · llama.cpp

143.2 GB / 288.0 GB VRAM

Chat

B

Qwen 3 235B A22B

This model is a direct match for chat. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, lm-studio.

Decode 108.7 tok/s · 13K ctx · llama.cpp

174.8 GB / 288.0 GB VRAM

Coding

C

Devstral 2 123B Instruct

This model is a direct match for coding. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, lm-studio.

Decode 77.8 tok/s · 37K ctx · llama.cpp

123.9 GB / 288.0 GB VRAM

RAG

C

Command A 111B

This model is a direct match for rag. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, ollama, lm-studio.

Decode 86.3 tok/s · 70K ctx · llama.cpp

132.1 GB / 288.0 GB VRAM

Reasoning

B

Qwen 3 235B A22B

This model is a direct match for reasoning. It belongs to a current frontier family for local AI. It fits natively with comfortable headroom. Known channels: huggingface, lm-studio.

Decode 108.7 tok/s · 26K ctx · llama.cpp

176.5 GB / 288.0 GB VRAM

Full Model Compatibility

Qwen 3 235B A22B

235B176.5 GB109 tok/s26K ctx

Llama 4 Maverick 17B 128E

400B276.4 GB72 tok/s17K ctx

141B121.8 GB130 tok/s38K ctx

Devstral 2 123B Instruct

123B123.9 GB78 tok/s37K ctx

111B114.8 GB86 tok/s40K ctx

Qwen3.5 122B A10B

122B108.5 GB91 tok/s42K ctx

Mistral Small 4 119B

119B103.3 GB234 tok/s45K ctx

72B84.9 GB133 tok/s54K ctx

Qwen 2.5 VL 72B

72B84.9 GB133 tok/s33K ctx

70B83.3 GB137 tok/s55K ctx

Qwen3-Coder-Next

80B79.3 GB363 tok/s58K ctx

Qwen3.5 35B A3B

35B56.5 GB274 tok/s82K ctx

Qwen3.5 35B A3B

35B56.5 GB274 tok/s82K ctx

Qwen 2.5 Coder 32B

32B54.2 GB299 tok/s85K ctx

27B50.4 GB355 tok/s91K ctx

Qwen3-Coder 30B A3B Instruct

30.5B49.1 GB812 tok/s94K ctx

27B50.4 GB355 tok/s91K ctx

Qwen3-VL 30B A3B Instruct

30B48.8 GB840 tok/s94K ctx

Devstral Small 2 24B Instruct

24B48.1 GB399 tok/s96K ctx

Devstral Small 1.1

24B48.1 GB399 tok/s96K ctx

Codestral 2 25.08

22B46.6 GB435 tok/s99K ctx

9B36.6 GB1064 tok/s126K ctx

Qwen3.5 9B Uncensored HauhauCS Aggressive

9B36.6 GB1064 tok/s126K ctx

Meta Llama 3.1 8B Instruct

8B35.8 GB1197 tok/s129K ctx

llava llama 3 8b v1 1

8B35.8 GB1197 tok/s129K ctx

9B36.6 GB1064 tok/s126K ctx

Llama 2 7B Chat

7B35.1 GB1368 tok/s131K ctx

DeepSeek R1 0528 Qwen3 8B

8B35.8 GB1197 tok/s129K ctx

Mistral 7B Instruct v0.2

7B35.1 GB1368 tok/s131K ctx

Meta Llama 3 8B Instruct

8B35.8 GB1197 tok/s129K ctx

Mistral 7B Instruct v0.3

7B35.1 GB1368 tok/s131K ctx

Llama 3.2 3B Instruct

3B32.7 GB2758 tok/s141K ctx

4B32.9 GB2393 tok/s140K ctx

2B32.1 GB3739 tok/s143K ctx

2B31.7 GB4787 tok/s145K ctx

Qwen2.5 3B Instruct

3B32.3 GB3191 tok/s143K ctx

4B32.9 GB2393 tok/s140K ctx

Llama 3.2 1B Instruct Q8 0

1B31.3 GB6132 tok/s147K ctx

Qwen2.5 1.5B Instruct

1.5B31.4 GB5840 tok/s147K ctx

Gemmasutra Mini 2B v1

2B31.7 GB4787 tok/s145K ctx

SmolVLM 500M Instruct

0.5B30.9 GB6132 tok/s149K ctx

TinyLlama 1.1B Chat v1.0

1.1B31.2 GB5840 tok/s148K ctx

embeddinggemma 300M

0.3B30.7 GB6132 tok/s150K ctx

Qwen3-Coder 480B A35B Instruct

480B328.0 GB50 tok/s14K ctx

Qwen3.5 397B A17B

397B333.9 GB21 tok/s14K ctx

DeepSeek R1 671B

671B444.8 GB42 tok/s10K ctx

744B489.8 GB38 tok/s9K ctx

1000B644.7 GB29 tok/s7K ctx

Mistral Large 3

675B447.9 GB41 tok/s10K ctx

DeepSeek V3 671B

671B444.8 GB42 tok/s10K ctx

Just out of reach

Models you could run with an upgrade

High-quality models that need a bit more memory

DeepSeek R1 671B

671BTier 5Needs ~450.6 GB

744BTier 5Needs ~496.0 GB

1000BTier 5Needs ~649.7 GB

Mistral Large 3

675BTier 5Needs ~454.3 GB

DeepSeek V3 671B

671BTier 5Needs ~450.6 GB

Compare this GPU