Will It Run AI


Browse AI Models

84 models available

Alibaba · Qwen 2.5 Math 72B
72B · 4K ctx · 40.3 GB · frontier
dense · Legacy

> [!WARNING]
> 🚨 Qwen2.5-Math mainly supports solving English and Chinese math problems through CoT and TIR. We do not recommend using this series of models for other tasks.

Unsloth · DeepSeek R1 Distill Llama 8B
8B · 0K ctx · 4.5 GB
dense · Legacy

 

Unsloth · DeepSeek R1 Distill Qwen 1.5B
1.5B · 0K ctx · 0.8 GB
dense · Legacy

 

Cohere · Command R+ 104B
104B · 131K ctx · 58.2 GB · current
dense · Legacy

Command R+ is Cohere's most capable open-weight model for enterprise RAG workloads. Offers superior long-context reasoning, multi-step tool use, and grounded generation with citations across 10 languages.

DeepSeek · DeepSeek Coder V2 236B
236B (21B active) · 131K ctx · 132.2 GB · current
moe · Legacy

We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with an additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-V2, while maintaining comparable performance in general language tasks.

DeepSeek · DeepSeek R1 Distill 32B
32B · 33K ctx · 17.9 GB · frontier
dense · Legacy

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable reasoning performance. Through RL, numerous powerful and interesting reasoning behaviors emerged naturally in DeepSeek-R1-Zero. However, it encounters challenges such as endless repetition, poor readability, and language mixing.

Meta · Llama 4 Scout 17B 16E
109B (17B active) · 10.5M ctx · 61 GB · frontier
moe · Legacy

Llama 4 Scout is Meta's efficient Mixture-of-Experts model with 17B active parameters across 16 experts. Supports a 10M token context window and natively handles text, images, and video inputs.

Alibaba · Qwen 3 30B A3B
30.5B (3.3B active) · 131K ctx · 17.1 GB · frontier
moe · Legacy

We introduce Qwen3-30B-A3B-Instruct-2507, an updated version of the Qwen3-30B-A3B non-thinking mode featuring several key enhancements.

BigCode · StarCoder 15B
15B · 8K ctx · 8.4 GB · legacy
dense · Legacy

StarCoder 15B is BigCode's flagship code generation model trained on 1 trillion tokens from The Stack. Supports 80+ programming languages with 8K context and strong code completion capabilities.

lmstudio-community · DeepSeek R1 0528 Qwen3 8B
8B · 0K ctx · 4.5 GB
dense · Legacy

 

Unsloth · DeepSeek R1 Distill Qwen 14B
14B · 0K ctx · 7.8 GB
dense · Legacy

 

DeepSeek · DeepSeek V2.5 236B
236B (21B active) · 131K ctx · 132.2 GB · current
moe · Legacy

DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous versions. For more details, please visit the DeepSeek-V2 page.

Microsoft · Phi-4-reasoning-plus 14B
14.7B · 33K ctx · 8.2 GB · frontier
dense · Legacy

> [!IMPORTANT]
> To fully take advantage of the model's capabilities, inference must use `temperature=0.8`, `top_k=50`, `top_p=0.95`, and `do_sample=True`. For more complex queries, set `max_new_tokens=32768` to allow for longer chain-of-thought (CoT).
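
These settings map directly onto a standard Hugging Face `transformers` generation call. A minimal sketch, assuming the text-generation pipeline and an illustrative checkpoint id of `microsoft/Phi-4-reasoning-plus`:

```python
# Sketch only: the recommended sampling settings for Phi-4-reasoning-plus.
# Assumes transformers is installed and the checkpoint id below is available.
from transformers import pipeline

generator = pipeline("text-generation", model="microsoft/Phi-4-reasoning-plus")

messages = [{"role": "user", "content": "Prove that the sum of two even integers is even."}]

output = generator(
    messages,
    do_sample=True,        # sampling must be enabled for these settings to take effect
    temperature=0.8,
    top_k=50,
    top_p=0.95,
    max_new_tokens=32768,  # leave headroom for a long chain-of-thought
)
print(output[0]["generated_text"])
```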

Alibaba · Qwen 3 32B
32B · 131K ctx · 17.9 GB · frontier
dense · Legacy

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support.

MaziyarPanahi · DeepSeek R1 0528 Qwen3 8B
8B · 0K ctx · 4.5 GB
dense · Legacy

 

Mistral · Magistral Small 2507
24B · 131K ctx · 13.4 GB · legacy
dense · Legacy

Built on Mistral Small 3.1 (2503) with added reasoning capabilities, via SFT on Magistral Medium traces followed by RL, it is a small, efficient reasoning model with 24B parameters.

Mistral · Mixtral 8x7B
47B (13B active) · 33K ctx · 26.3 GB · current
moe · Legacy

```python
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
```
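
The imports above come from the Mixtral model card. A minimal sketch of how they are typically combined, assuming the v1 instruct tokenizer matches Mixtral 8x7B and using an illustrative prompt:

```python
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

# Assumption: the v1 tokenizer is the right one for Mixtral 8x7B Instruct.
tokenizer = MistralTokenizer.v1()

request = ChatCompletionRequest(
    messages=[UserMessage(content="Summarize mixture-of-experts routing in one sentence.")]
)
tokenized = tokenizer.encode_chat_completion(request)

print(len(tokenized.tokens))  # token ids ready to pass to the model
print(tokenized.text)         # the rendered [INST] ... [/INST] prompt
```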

OpenAI · GPT-OSS 20B
21B (3.6B active) · 128K ctx · 11.8 GB · frontier
moe · Legacy

GPT-OSS 20B is OpenAI's first open-weight model, a 21B-parameter mixture-of-experts model with 3.6B active parameters per token. Features configurable reasoning effort (low/medium/high), full chain-of-thought visibility, and agentic capabilities including function calling. Runs on devices with 16GB of memory using MXFP4 quantization.
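
A minimal sketch of selecting the reasoning effort, assuming the model is exposed through an OpenAI-compatible endpoint and that effort is chosen via the system prompt; the base URL and model id below are placeholders:

```python
from openai import OpenAI

# Assumption: gpt-oss-20b is served locally behind an OpenAI-compatible API
# (for example via Ollama or vLLM); the base URL and model id are placeholders.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

response = client.chat.completions.create(
    model="gpt-oss:20b",  # placeholder; use whatever id your server registers
    messages=[
        {"role": "system", "content": "Reasoning: high"},  # low / medium / high
        {"role": "user", "content": "Outline the steps to safely parse untrusted JSON."},
    ],
)
print(response.choices[0].message.content)
```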

Mistral · Mistral Small 3.2 24B
24B · 131K ctx · 13.4 GB · current
vision · Legacy

Mistral-Small-3.2-24B-Instruct-2506 is a minor update of Mistral-Small-3.1-24B-Instruct-2503.

Alibaba · Qwen 2.5 32B
32B · 131K ctx · 17.9 GB · current
dense · Legacy

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings a number of improvements over Qwen2.

BigCode · StarCoder 7B
7B · 8K ctx · 3.9 GB · legacy
dense · Legacy

StarCoder 7B is BigCode's code generation model trained on The Stack v1. Supports over 80 programming languages with fill-in-the-middle capability and 8K context window.

Bartowski · cognitivecomputations Dolphin3.0 R1 Mistral 24B
24B · 0K ctx · 13.4 GB
dense · Legacy

 

Google · Gemma 2 27B
27B · 8K ctx · 15.1 GB · current
dense · Legacy

Gemma 2 27B is Google's largest Gemma 2 model, offering state-of-the-art performance among open models of similar size. Built on Gemini technology with strong reasoning, code, and multilingual capabilities.

01.AI · Yi 1.5 34B
34B · 4K ctx · 19 GB · current
dense · Legacy


Page 2 of 4