Why
Running LLMs locally can be useful for a number of reasons:
- Privacy: You don’t have to send your data to a server
- Speed: You don’t have to wait for the server to respond
- Faster than searching: Asking the model directly can be quicker than digging through search results
- Offline: You can use it without an internet connection
- Different models: You can use models that are not available online
Why not
Running LLMs locally can be resource intensive. Without a capable GPU, larger models in particular can be slow.
Installation
Download it from https://ollama.com/download/
On Linux, you can install it with:
curl -fsSL https://ollama.com/install.sh | sh
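To confirm the install worked (assuming ollama ended up on your PATH), you can ask the CLI for its version:
ollama --version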
Models
You can get your models from the Ollama library
Pull a model with:
ollama pull <name>
I recommend:
- phi3, a lightweight model
- llama3.1, a state-of-the-art model from Meta
- llama2-uncensored, an uncensored Llama 2 model by George Sung and Jarrad Hope
Note that you’d need a beefier system for the larger models.
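For example, to grab the lightweight phi3 model from the list above and confirm it shows up locally:
ollama pull phi3   # download the model
ollama list        # show models available on this machine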
Usage
(Pull and) run the model with:
ollama run <name> [prompt]
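For example, assuming phi3 has been pulled, you can pass a one-off prompt or drop into an interactive chat (type /bye to exit):
ollama run phi3 "Why is the sky blue?"
ollama run phi3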
For more options:
ollama --help
API
If you want the Ollama API to be accessible to other systems on your network, add the following to the [Service] section of the config file /etc/systemd/system/ollama.service:
Environment="OLLAMA_HOST=0.0.0.0:11434"
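After editing the unit file, systemd needs to pick up the change, so reload and restart the service (standard systemd procedure):
sudo systemctl daemon-reload
sudo systemctl restart ollama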
You can check out the API documentation here.
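As a quick sanity check, here is a minimal request against the generate endpoint, assuming the server is listening on the default port 11434 and phi3 has been pulled:
curl http://localhost:11434/api/generate -d '{
  "model": "phi3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'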
UIs
You can use a UI for a more user-friendly experience.
Uninstall
sudo systemctl stop ollama
sudo systemctl disable ollama
sudo rm /etc/systemd/system/ollama.service
sudo rm $(which ollama)
sudo rm -r /usr/share/ollama
sudo userdel ollama
sudo groupdel ollama