Managing models

AI models are the heart of the application. In this chapter we'll see how to download new models, understand the differences between them and choose the one best suited to your needs.

Local models vs Hub

The model management interface has two tabs:

  • Local Models: models already downloaded on your computer, ready to use
  • Hub Search: online catalog of models available for download

To access model management, click the Manage Models button in the right sidebar.

Managing local models

Installed models

The Local Models tab lists all the models installed on your computer. For each model you'll see:

  • Name and version: for example "llama3.2:3b"
  • Size: how much disk space it takes
  • Date: when it was downloaded
  • Category: Chat, Code, Reasoning, Multimodal

You can sort the list by name, size, date or category using the buttons at the top.
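If you prefer to script an inventory, the same list is exposed by the Ollama server the application talks to. A minimal sketch, assuming a standard Ollama instance on its default port (localhost:11434) and the third-party requests package:

    # List local models via the Ollama REST API and sort them by size,
    # mirroring the "Size" sort in the UI. Assumes a default Ollama
    # server on localhost:11434 (pip install requests).
    import requests

    resp = requests.get("http://localhost:11434/api/tags")
    resp.raise_for_status()
    models = resp.json()["models"]

    for m in sorted(models, key=lambda m: m["size"], reverse=True):
        print(f'{m["name"]:<30} {m["size"] / 1e9:6.1f} GB  {m["modified_at"][:10]}')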

Downloading new models

Switch to the Hub Search tab to search for models in the Ollama catalog:

Model search in Hub

You can filter by category:

  • Chat: generic models for conversation
  • Code: specialized in programming
  • Reasoning: optimized for logical reasoning
  • Multimodal: capable of analyzing images too

Find an interesting model and click Download. You'll see the download progress in real time:

Model download in progress
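Under the hood this is a streaming pull from the Ollama server. A hedged sketch of the same operation from a script, under the same localhost:11434 assumption as above:

    # Pull a model and print its progress as it streams in.
    # Assumes a standard Ollama server on localhost:11434.
    import json
    import requests

    with requests.post(
        "http://localhost:11434/api/pull",
        json={"name": "llama3.2:3b", "stream": True},
        stream=True,
    ) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            event = json.loads(line)
            total, done = event.get("total"), event.get("completed")
            if total and done:
                print(f'\r{event["status"]}: {100 * done / total:5.1f}%', end="")
            else:
                print(f'\n{event["status"]}')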

Before downloading

Models can be very large: a 70 billion parameter model requires over 40 GB of disk space.

Also, large models require more RAM and a GPU with enough memory to run at acceptable speed. See the "Which model to choose" section to understand what your hardware can handle.
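As a rule of thumb, disk size is roughly the parameter count times the bytes per parameter of the quantization (most Hub models ship 4-bit quantized, about half a byte per parameter), plus some overhead. A back-of-the-envelope sketch; the numbers are approximations, not exact sizes for any specific model:

    # Rough size estimate: parameters x bytes-per-parameter + ~10% overhead.
    # 4 bits per parameter matches common Hub quantizations; real models
    # vary, so treat the result as an order-of-magnitude guide only.
    def estimated_size_gb(params_billion: float, bits_per_param: float = 4.0) -> float:
        bytes_total = params_billion * 1e9 * bits_per_param / 8
        return bytes_total * 1.1 / 1e9

    for size in (3, 7, 13, 70):
        print(f"{size:>3}b model: ~{estimated_size_gb(size):.0f} GB")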

Understanding model names

Model names follow a precise pattern:

name:variant

For example: llama3.2:3b, qwen2.5:7b-instruct, codellama:13b

The number after the colon typically indicates size:

  • 1b-3b: lightweight models, fast, suitable for less powerful computers
  • 7b-8b: good compromise between quality and speed
  • 13b-14b: more accurate responses, need 16 GB of RAM
  • 70b and above: maximum quality, require powerful hardware
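If you ever need to handle these names in a script, splitting on the colon is enough. A small illustrative sketch; note that the variant can carry suffixes like -instruct as well as a parameter count:

    # Split an Ollama-style "name:variant" model name. Illustrative only.
    def parse_model_name(full_name: str) -> tuple[str, str]:
        name, _, variant = full_name.partition(":")
        return name, variant or "latest"  # Ollama defaults to ":latest"

    for n in ("llama3.2:3b", "qwen2.5:7b-instruct", "codellama"):
        print(parse_model_name(n))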

Which model to choose

The choice depends on your hardware and what you want to do:

For computers with 8 GB of RAM

Model          Recommended use
llama3.2:3b    General conversation, fast
qwen2.5:3b     Good for texts in multiple languages
phi3:3.8b      Reasoning and logic

For computers with 16 GB of RAM

Model          Recommended use
llama3.1:8b    General use, excellent responses
qwen2.5:7b     Multilingual, including Italian
mistral:7b     Fast and reliable
codellama:7b   Programming

For computers with GPU (8+ GB VRAM)

Model              Recommended use
llama3.3:70b       Maximum quality
qwen2.5-coder:32b  Advanced programming
command-r:35b      Research and document analysis

Start small

If you don't know which to choose, start with llama3.2:3b to test that everything works, then move to larger models if your hardware allows.
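Once a model is downloaded, you can check that it actually responds. A quick smoke-test sketch, again assuming the default Ollama endpoint on localhost:11434:

    # Smoke test: ask the freshly downloaded model for a one-line reply.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.2:3b",
              "prompt": "Say hello in one sentence.",
              "stream": False},
    )
    resp.raise_for_status()
    print(resp.json()["response"])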

Removing a model

To free up disk space, you can remove models you no longer use. In the Local Models tab, click the trash icon next to the model you want to delete.

Re-downloading is always possible

Removing a model isn't irreversible: you can always re-download it from the Hub if you need it.
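The same cleanup can be scripted. A sketch equivalent to clicking the trash icon, under the usual localhost:11434 assumption:

    # Remove a local model via the Ollama REST API.
    import requests

    resp = requests.delete(
        "http://localhost:11434/api/delete",
        json={"name": "llama3.2:3b"},
    )
    resp.raise_for_status()  # 404 here means the model wasn't installed
    print("Removed.")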

Models and MCP

Not all models support MCP (Model Context Protocol), the feature that allows AI to use external tools. If you plan to use MCP, choose models that support "function calling":

  • llama3.1, llama3.2, llama3.3
  • qwen2.5 (all variants)
  • mistral, mistral-nemo
  • command-r, command-r-plus

Older models like llama2 or codellama don't support MCP. We'll explore this topic further in the dedicated chapter.
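A quick way to probe whether a model handles function calling is to send it a tool definition and see whether it emits a tool call. A hedged sketch against Ollama's chat endpoint; the get_weather tool is made up for the example:

    # Probe a model for function-calling support via /api/chat.
    # Models without tool support typically return an error or plain text.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "llama3.1:8b",
            "messages": [{"role": "user", "content": "What's the weather in Rome?"}],
            "tools": [{
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Get the current weather for a city",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }],
            "stream": False,
        },
    )
    resp.raise_for_status()
    print(resp.json()["message"].get("tool_calls", "no tool call returned"))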