OLMo 2 13B is AI2's fully open research model with transparent training data and methodology. Designed for reproducible research with competitive performance on reasoning and general knowledge tasks.
> [!Warning]
> 🚨 Qwen2.5-Math mainly supports solving English and Chinese math problems through CoT (chain-of-thought) and TIR (tool-integrated reasoning). We do not recommend using this series of models for other tasks.
SmolLM3 is a fully open 3B-parameter language model with dual-mode reasoning, 128K context via YARN extrapolation, and native support for 6 languages. Pretrained on 11.2T tokens with a staged curriculum of web, code, math, and reasoning data. Post-trained with 140B reasoning tokens and Anchored Preference Optimization.
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable performance on reasoning. Through RL, DeepSeek-R1-Zero naturally develops numerous powerful and interesting reasoning behaviors. However, it encounters challenges such as endless repetition, poor readability, and language mixing.
Falcon-40B-Instruct is a 40B-parameter causal decoder-only model built by TII, based on Falcon-40B and finetuned on a mixture of Baize chat data. It is made available under the Apache 2.0 license.
InternLM has open-sourced a 7-billion-parameter base model tailored for practical scenarios. The model has the following characteristics:
- It leverages trillions of high-quality tokens for training to establish a powerful knowledge base.
- It provides a versatile toolset for users to flexibly build their own workflows.
MPT-30B Instruct is MosaicML's large instruction-tuned model offering strong reasoning and generation quality. Features 8K context with ALiBi encoding and efficient inference optimizations.
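The ALiBi encoding mentioned above replaces learned positional embeddings with a fixed, distance-proportional penalty on attention logits, which is what lets the model extrapolate beyond its training context. A minimal sketch of the published ALiBi recipe (the helper names here are illustrative, not MosaicML's API):

```python
import math

def alibi_slopes(n_heads: int) -> list[float]:
    # Standard ALiBi slope schedule for a power-of-two head count:
    # head k (1-indexed) gets slope 2^(-8k / n_heads), a geometric sequence.
    start = 2 ** (-8 / n_heads)
    return [start ** (k + 1) for k in range(n_heads)]

def alibi_bias(seq_len: int, slope: float) -> list[list[float]]:
    # Per-head bias added to causal attention logits before softmax:
    # -slope * (i - j) for query position i attending to key position j <= i.
    # Nearby tokens are penalized less than distant ones.
    return [[-slope * (i - j) for j in range(i + 1)] for i in range(seq_len)]

# With 8 heads, slopes run 1/2, 1/4, ..., 1/256; the bias for the
# steepest head over a 3-token sequence is a lower-triangular ramp.
print(alibi_slopes(8))
print(alibi_bias(3, alibi_slopes(8)[0]))
```

Because the penalty grows linearly with distance rather than being tied to trained position indices, a model like MPT-30B can be served at context lengths longer than those seen in training with graceful degradation.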
Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers significant advancements in reasoning, instruction-following, agent capabilities, and multilingual support.