Browse AI Models

283 models available

Mradermacher HelpingAI 15B i1

15B · 0K ctx · 8.4 GB

dense · Legacy

Legraphista internlm2 math plus 7b IMat

7B · 0K ctx · 3.9 GB

dense · Legacy

DeepSeek DeepSeek R1 Distill 8B

8B · 33K ctx · 4.5 GB · frontier

dense · Legacy

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing.

TII Falcon 40B Instruct

40B · 8K ctx · 22.4 GB · legacy

dense · Legacy

Falcon-40B-Instruct is a 40B-parameter causal decoder-only model built by TII, based on Falcon-40B and finetuned on a mixture of Baize. It is made available under the Apache 2.0 license.

InternLM InternLM 7B

7B · 8K ctx · 3.9 GB · legacy

dense · Legacy

InternLM has open-sourced a 7 billion parameter base model tailored for practical scenarios. The model has the following characteristics:

- It leverages trillions of high-quality tokens for training to establish a powerful knowledge base.
- It provides a versatile toolset for users to flexibly build their own workflows.

MosaicML MPT-30B-Instruct

30B · 8K ctx · 16.8 GB · legacy

dense · Legacy

MPT-30B Instruct is MosaicML's large instruction-tuned model offering strong reasoning and generation quality. Features 8K context with ALiBi encoding and efficient inference optimizations.

Alibaba Qwen 3 8B

8B · 131K ctx · 4.5 GB · frontier

dense · Legacy

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

Mradermacher CodeNinja 1.0 OpenChat 7B i1

7B · 0K ctx · 3.9 GB

dense · Legacy

Mradermacher HelpingAI2 9B i1

9B · 0K ctx · 5 GB

dense · Legacy

Mradermacher HelpingAI2.5 5B i1

5B · 0K ctx · 2.8 GB

dense · Legacy

Mradermacher internlm2 5 1 8b chat i1

8B · 0K ctx · 4.5 GB

dense · Legacy

Mradermacher internlm2 math plus 20b i1

20B · 0K ctx · 11.2 GB

dense · Legacy

Mradermacher HelpingAI 9B 200k i1

9B · 0K ctx · 5 GB

dense · Legacy

Mradermacher internlm2 5 7b chat i1

7B · 0K ctx · 3.9 GB

dense · Legacy

Mradermacher HelpingAI 3B hindi i1

3B · 0K ctx · 1.7 GB

dense · Legacy

Mradermacher internlm3 8b instruct abliterated i1

8B · 0K ctx · 4.5 GB

dense · Legacy

Mradermacher HelpingAI2 6B i1

6B · 0K ctx · 3.4 GB

dense · Legacy

RichardErkhov OpenSafetyLab MD Judge v0 2 internlm2 7b

7B · 0K ctx · 3.9 GB

dense · Legacy

Mradermacher AI21 Jamba2 3B

3B · 0K ctx · 1.7 GB

dense · Legacy

Mradermacher MD Judge v0 2 internlm2 7b i1

7B · 0K ctx · 3.9 GB

dense · Legacy

Mradermacher HelpingAI2.5 10B i1

10B · 0K ctx · 5.6 GB

dense · Legacy

Baichuan Baichuan 7B

7B · 8K ctx · 3.9 GB · legacy

dense · Legacy

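Baichuan-7B is an open-source, large-scale pre-trained model developed by Baichuan Intelligent Technology. Built on the Transformer architecture, it is a 7-billion-parameter model trained on approximately 1.2 trillion tokens; it supports both Chinese and English and has a context window length of 4096. It achieves the best results among models of the same size on the standard authoritative Chinese and English benchmarks (C-EVAL/MMLU).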

DevStral AI DevStral 7B

7B · 8K ctx · 3.9 GB · legacy

dense · Legacy

Devstral 7B is Mistral AI's specialized coding model optimized for software development tasks. Features strong code generation, completion, and understanding across multiple programming languages.

Zhipu GLM-4 9B

9B · 128K ctx · 5 GB · current

dense · Legacy

2024/11/25: We recommend using glm-4-9b-chat-hf with `transformers>=4.46.0` to reduce compatibility issues caused by subsequent transformers upgrades.
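
The `transformers>=4.46.0` recommendation in the GLM-4 note can be checked before loading the model. The following is a minimal sketch only: the helper names `parse_version` and `meets_minimum` are hypothetical, and it assumes plain dotted version strings, treating any non-numeric component (such as `dev0`) as 0. For real projects, a dedicated version-parsing library is likely a better choice.

```python
def parse_version(version):
    """Turn a dotted version string into a tuple of ints.

    Assumption: only the leading digits of each component matter;
    components with no leading digits (e.g. "dev0") count as 0.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def meets_minimum(installed, minimum="4.46.0"):
    """Return True if `installed` is at least `minimum`."""
    a = parse_version(installed)
    b = parse_version(minimum)
    # Pad the shorter tuple with zeros so "4.46" compares equal to "4.46.0".
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b
```

In practice the installed version string would come from `transformers.__version__`, for example `meets_minimum(transformers.__version__)`.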
