Managing models

AI models are the heart of the application. In this chapter we'll see how to download new models, understand the differences between them and choose the one best suited to your needs.

Local models vs Hub

The model management interface has two tabs:

  • Local Models: models already downloaded on your computer, ready to use
  • Hub Search: online catalog of models available for download

To access model management, click the Manage Models button in the right sidebar.

Managing local models

Installed models

The Local Models tab lists all the models installed on your computer. For each model you'll see:

  • Name and version: for example "llama3.2:3b"
  • Size: how much disk space it takes
  • Date: when it was downloaded
  • Category: Chat, Code, Reasoning, Multimodal

You can sort the list by name, size, date or category using the buttons at the top.
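If you prefer to script an inventory, the same list is exposed by the Ollama server the application talks to. A minimal sketch, assuming a standard Ollama instance on its default port (localhost:11434) and the third-party requests package:

    # List local models via the Ollama REST API and sort them by size,
    # mirroring the "Size" sort in the UI. Assumes a default Ollama
    # server on localhost:11434 (pip install requests).
    import requests

    resp = requests.get("http://localhost:11434/api/tags")
    resp.raise_for_status()
    models = resp.json()["models"]

    for m in sorted(models, key=lambda m: m["size"], reverse=True):
        print(f'{m["name"]:<30} {m["size"] / 1e9:6.1f} GB  {m["modified_at"][:10]}')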

Downloading new models

Switch to the Hub Search tab to search for models in the Ollama catalog:

Model search in Hub

You can filter by category:

  • Chat: generic models for conversation
  • Code: specialized in programming
  • Reasoning: optimized for logical reasoning
  • Multimodal: capable of analyzing images too

Find an interesting model and click Download. You'll see the download progress in real time:

Model download in progress
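Under the hood this is a streaming pull from the Ollama server. A hedged sketch of the same operation from a script, under the same localhost:11434 assumption as above:

    # Pull a model and print its progress as it streams in.
    # Assumes a standard Ollama server on localhost:11434.
    import json
    import requests

    with requests.post(
        "http://localhost:11434/api/pull",
        json={"name": "llama3.2:3b", "stream": True},
        stream=True,
    ) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            event = json.loads(line)
            total, done = event.get("total"), event.get("completed")
            if total and done:
                print(f'\r{event["status"]}: {100 * done / total:5.1f}%', end="")
            else:
                print(f'\n{event["status"]}')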

Before downloading

Models can be very large: a 70 billion parameter model requires over 40 GB of disk space.

Also, large models require more RAM and a GPU with enough memory to run at acceptable speed. See the "Which model to choose" section to understand what your hardware can handle.
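As a rule of thumb, disk size is roughly the parameter count times the bytes per parameter of the quantization (most Hub models ship 4-bit quantized, about half a byte per parameter), plus some overhead. A back-of-the-envelope sketch; the numbers are approximations, not exact sizes for any specific model:

    # Rough size estimate: parameters x bytes-per-parameter + ~10% overhead.
    # 4 bits per parameter matches common Hub quantizations; real models
    # vary, so treat the result as an order-of-magnitude guide only.
    def estimated_size_gb(params_billion: float, bits_per_param: float = 4.0) -> float:
        bytes_total = params_billion * 1e9 * bits_per_param / 8
        return bytes_total * 1.1 / 1e9

    for size in (3, 7, 13, 70):
        print(f"{size:>3}b model: ~{estimated_size_gb(size):.0f} GB")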

Understanding model names

Model names follow a precise pattern:

name:variant

For example: llama3.2:3b, qwen2.5:7b-instruct, codellama:13b

The number after the colon typically indicates size:

  • 1b-3b: lightweight models, fast, suitable for less powerful computers
  • 7b-8b: good compromise between quality and speed
  • 13b-14b: more accurate responses, need 16 GB of RAM
  • 70b and above: maximum quality, require powerful hardware
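If you ever need to handle these names in a script, splitting on the colon is enough. A small illustrative sketch; note that the variant can carry suffixes like -instruct as well as a parameter count:

    # Split an Ollama-style "name:variant" model name. Illustrative only.
    def parse_model_name(full_name: str) -> tuple[str, str]:
        name, _, variant = full_name.partition(":")
        return name, variant or "latest"  # Ollama defaults to ":latest"

    for n in ("llama3.2:3b", "qwen2.5:7b-instruct", "codellama"):
        print(parse_model_name(n))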

Which model to choose

The choice depends on your hardware and what you want to do:

For computers with 8 GB of RAM

Model          Recommended use
llama3.2:3b    General conversation, fast
qwen2.5:3b     Good for texts in multiple languages
phi3:3.8b      Reasoning and logic

For computers with 16 GB of RAM

Model          Recommended use
llama3.1:8b    General use, excellent responses
qwen2.5:7b     Multilingual, including Italian
mistral:7b     Fast and reliable
codellama:7b   Programming

For computers with GPU (8+ GB VRAM)

Model              Recommended use
llama3.3:70b       Maximum quality
qwen2.5-coder:32b  Advanced programming
command-r:35b      Research and document analysis

Start small

If you don't know which to choose, start with llama3.2:3b to test that everything works, then move to larger models if your hardware allows.
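Once a model is downloaded, you can check that it actually responds. A quick smoke-test sketch, again assuming the default Ollama endpoint on localhost:11434:

    # Smoke test: ask the freshly downloaded model for a one-line reply.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.2:3b",
              "prompt": "Say hello in one sentence.",
              "stream": False},
    )
    resp.raise_for_status()
    print(resp.json()["response"])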

Removing a model

To free up disk space, you can remove models you no longer use. In the Local Models tab, click the trash icon next to the model you want to delete.

Re-downloading is always possible

Removing a model isn't irreversible: you can always re-download it from the Hub if you need it.
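The same cleanup can be scripted. A sketch equivalent to clicking the trash icon, under the usual localhost:11434 assumption:

    # Remove a local model via the Ollama REST API.
    import requests

    resp = requests.delete(
        "http://localhost:11434/api/delete",
        json={"name": "llama3.2:3b"},
    )
    resp.raise_for_status()  # 404 here means the model wasn't installed
    print("Removed.")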

Models and MCP

Not all models support MCP (Model Context Protocol), the feature that allows AI to use external tools. If you plan to use MCP, choose models that support "function calling":

  • llama3.1, llama3.2, llama3.3
  • qwen2.5 (all variants)
  • mistral, mistral-nemo
  • command-r, command-r-plus

Older models like llama2 or codellama don't support MCP. We'll explore this topic further in the dedicated chapter.
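A quick way to probe whether a model handles function calling is to send it a tool definition and see whether it emits a tool call. A hedged sketch against Ollama's chat endpoint; the get_weather tool is made up for the example:

    # Probe a model for function-calling support via /api/chat.
    # Models without tool support typically return an error or plain text.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "llama3.1:8b",
            "messages": [{"role": "user", "content": "What's the weather in Rome?"}],
            "tools": [{
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Get the current weather for a city",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }],
            "stream": False,
        },
    )
    resp.raise_for_status()
    print(resp.json()["message"].get("tool_calls", "no tool call returned"))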