Managing models¶
AI models are the heart of the application. In this chapter we'll see how to download new models, understand the differences between them and choose the one best suited to your needs.
Local models vs Hub¶
The model management interface has two tabs:
- Local Models: models already downloaded on your computer, ready to use
- Hub Search: online catalog of models available for download
To access model management, click the Manage Models button in the right sidebar.
Installed models¶
The Local Models tab lists every model installed on your computer. For each model you'll see:
- Name and version: for example "llama3.2:3b"
- Size: how much disk space it takes
- Date: when it was downloaded
- Category: Chat, Code, Reasoning, Multimodal
You can sort the list by name, size, date or category using the buttons at the top.
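If you ever want the same list outside the interface, Ollama itself exposes it over a local REST API. Here is a minimal Python sketch, assuming Ollama is running on its default port 11434 and the `requests` package is installed (the app's own implementation may differ):

```python
import requests

# Ask the local Ollama server which models are installed.
# GET /api/tags returns one entry per model with its name, size in
# bytes and modification date (the same data the Local Models tab shows).
response = requests.get("http://localhost:11434/api/tags")
response.raise_for_status()
models = response.json()["models"]

# Sort by size, largest first, like the "Size" sort button.
for model in sorted(models, key=lambda m: m["size"], reverse=True):
    size_gb = model["size"] / 1e9
    print(f"{model['name']:<32} {size_gb:6.1f} GB  {model['modified_at']}")
```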
Downloading new models¶
Switch to the Hub Search tab to search the Ollama catalog for new models.
You can filter by category:
- Chat: generic models for conversation
- Code: specialized in programming
- Reasoning: optimized for logical reasoning
- Multimodal: capable of analyzing images too
Find an interesting model and click Download. You'll see the download progress in real time.
Before downloading
Models can be very large: a 70 billion parameter model requires over 40 GB of disk space.
Also, large models require more RAM and a GPU with enough memory to run at acceptable speed. See the "Which model to choose" section to understand what your hardware can handle.
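Downloads can also be scripted against Ollama's `POST /api/pull` endpoint, which streams progress as one JSON object per line. A sketch in Python (the model name is just an example):

```python
import json
import requests

MODEL = "llama3.2:3b"  # example model; substitute the one you chose

# POST /api/pull streams one JSON object per line while the layers
# download; "total" and "completed" let us compute a percentage.
with requests.post(
    "http://localhost:11434/api/pull",
    json={"name": MODEL},
    stream=True,
) as response:
    response.raise_for_status()
    for line in response.iter_lines():
        if not line:
            continue
        update = json.loads(line)
        status = update.get("status", "")
        if "total" in update and "completed" in update:
            pct = update["completed"] / update["total"] * 100
            print(f"\r{status}: {pct:5.1f}%", end="")
        else:
            print(f"\n{status}", end="")
print("\ndone")
```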
Understanding model names¶
Model names follow a precise pattern:
name:variant
For example: llama3.2:3b, qwen2.5:7b-instruct, codellama:13b
The number after the colon typically indicates size:

- 1b-3b: lightweight models, fast, suitable for less powerful computers
- 7b-8b: good compromise between quality and speed
- 13b-14b: more accurate responses, need 16 GB of RAM
- 70b and above: maximum quality, require powerful hardware
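As a rough rule of thumb, with the 4-bit quantization Ollama applies to most models by default, a model needs about half a byte of memory per parameter, plus some overhead for context and runtime buffers. A back-of-the-envelope estimate in Python (the 1.5 GB overhead figure is an assumption, not a measurement):

```python
def approx_memory_gb(params_billions: float, bytes_per_param: float = 0.5,
                     overhead_gb: float = 1.5) -> float:
    """Very rough memory estimate for a quantized model.

    bytes_per_param = 0.5 assumes 4-bit quantization; overhead_gb is
    a ballpark allowance for the context cache and runtime buffers.
    """
    return params_billions * bytes_per_param + overhead_gb

for size in (3, 7, 13, 70):
    print(f"{size:>3}b -> ~{approx_memory_gb(size):4.1f} GB")
# 3b -> ~3.0 GB, 7b -> ~5.0 GB, 13b -> ~8.0 GB, 70b -> ~36.5 GB
```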
Which model to choose¶
The choice depends on your hardware and what you want to do:
For computers with 8 GB of RAM¶
| Model | Recommended use |
|---|---|
| llama3.2:3b | General conversation, fast |
| qwen2.5:3b | Good for texts in multiple languages |
| phi3:3.8b | Reasoning and logic |
For computers with 16 GB of RAM¶
| Model | Recommended use |
|---|---|
| llama3.1:8b | General use, excellent responses |
| qwen2.5:7b | Multilingual, including Italian |
| mistral:7b | Fast and reliable |
| codellama:7b | Programming |
For computers with GPU (8+ GB VRAM)¶
| Model | Recommended use |
|---|---|
| llama3.3:70b | Maximum quality |
| qwen2.5-coder:32b | Advanced programming |
| command-r:35b | Research and document analysis |
Start small
If you don't know which to choose, start with llama3.2:3b to test that everything works, then move to larger models if your hardware allows.
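After the download finishes, a single request to Ollama's `/api/generate` endpoint is a quick way to check that the model actually responds. A sketch, assuming llama3.2:3b is installed:

```python
import requests

# Send one non-streaming prompt to verify the model loads and replies.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.2:3b",  # example: the "start small" model
        "prompt": "Reply with the single word: ok",
        "stream": False,         # wait for the full reply as one JSON object
    },
)
response.raise_for_status()
print(response.json()["response"])
```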
Removing a model¶
To free up disk space, you can remove models you no longer use. In the local models tab, click the trash icon next to the model you want to delete.
Re-downloading is always possible
Removing a model isn't permanent: you can always re-download it from the Hub if you need it again.
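If you prefer to script cleanup, the same operation is a DELETE on Ollama's `/api/delete` endpoint. A sketch (the model name is an example):

```python
import requests

# DELETE /api/delete removes the model's files from disk.
response = requests.delete(
    "http://localhost:11434/api/delete",
    json={"name": "llama3.2:3b"},  # example: the model to remove
)
response.raise_for_status()
print("removed")
```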
Models and MCP¶
Not all models support MCP (Model Context Protocol), the feature that allows AI to use external tools. If you plan to use MCP, choose models that support "function calling":
- llama3.1, llama3.2, llama3.3
- qwen2.5 (all variants)
- mistral, mistral-nemo
- command-r, command-r-plus
Older models like llama2 or codellama don't support MCP. We'll explore this topic further in the dedicated chapter.
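For a taste of what function calling looks like at the API level, here is a minimal sketch against Ollama's `/api/chat` endpoint with a `tools` definition. The weather tool is purely illustrative; MCP builds on this same mechanism:

```python
import requests

# One illustrative tool definition, in the JSON-schema style that
# Ollama's chat API expects for function calling.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for illustration only
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}]

response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3.1:8b",  # must be a model with function calling support
        "messages": [{"role": "user", "content": "What's the weather in Rome?"}],
        "tools": tools,
        "stream": False,
    },
)
response.raise_for_status()

# A tool-capable model answers with a structured tool call instead of text.
message = response.json()["message"]
for call in message.get("tool_calls", []):
    print(call["function"]["name"], call["function"]["arguments"])
```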


