Browse AI Models

3 models available


NVIDIA Nemotron 70B

70B · 131K ctx · 39.2 GB · current

dense · Legacy

Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA to improve the helpfulness of LLM-generated responses to user queries.

NVIDIA Nemotron Nano 8B

8B · 131K ctx · 4.5 GB · active

dense · Legacy

Nemotron Nano 8B is NVIDIA's reasoning model derived from Llama 3.1 8B Instruct, post-trained so that reasoning can be switched on or off. With reasoning enabled, it achieves 95.4% on MATH-500 and 54.1% on GPQA Diamond. It fits on a single RTX GPU for local deployment.

NVIDIA Nemotron Mini 4B

4B · 4K ctx · 2.2 GB · current

dense · Legacy

Nemotron-Mini-4B-Instruct is a small language model (SLM) for roleplay, retrieval-augmented generation (RAG) question answering, and function calling in English. It is optimized through distillation, pruning, and quantization for speed and on-device deployment. It is a fine-tuned version of nvidia/Minitron-4B-Base, which was pruned and distilled from Nemotron-4 15B using NVIDIA's LLM compression technique. It supports a context length of 4,096 tokens and is ready for commercial use.
