
Small Language Models

Open Accountant runs on small language models (SLMs) — AI models compact enough to run on your laptop. No cloud API. No subscriptions. No data leaving your machine.

  • Privacy. Your bank transactions never touch a third-party server. The model runs locally.
  • Speed. No network round-trip. Responses come as fast as your hardware allows.
  • Cost. Zero API fees. Run as many queries as you want for free.
  • Offline. Works without an internet connection. Categorize transactions on a plane.

SLMs are language models with roughly 1–13 billion parameters — small enough to run on consumer hardware but capable enough for structured tasks like transaction categorization, spending analysis, and financial summaries.

Larger models (70B+) need server-grade GPUs. SLMs run on a MacBook or a desktop with 8–32 GB of RAM.
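The RAM figures above follow a simple rule of thumb: a 4-bit quantized model needs roughly half a byte per parameter for weights, plus overhead for the KV cache and runtime. A rough sketch (illustrative estimate only, not Ollama's exact memory accounting; the constants are assumptions):

```python
# Rough memory estimate for a locally run quantized model.
# bits_per_param ~4.5 approximates common Q4 quantizations;
# overhead_gb covers KV cache and runtime (both are assumptions).
def estimated_ram_gb(params_billion: float,
                     bits_per_param: float = 4.5,
                     overhead_gb: float = 1.5) -> float:
    weights_gb = params_billion * 1e9 * bits_per_param / 8 / 1e9
    return round(weights_gb + overhead_gb, 1)

print(estimated_ram_gb(3))   # ~3.2 GB -> fits comfortably in 8 GB
print(estimated_ram_gb(8))   # ~6.0 GB -> wants a 16 GB machine
```

This is why 3B models land in the 8 GB tier and 8B models in the 16 GB tier.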

| Model | Size | RAM Needed | Quality | Best For |
| --- | --- | --- | --- | --- |
| llama3.2:3b | 2.0 GB | 8 GB | Good | Categorization, quick queries |
| llama3.2:1b | 1.3 GB | 8 GB | Fair | Low-resource machines, simple tasks |
| llama3.1:8b | 4.7 GB | 16 GB | Great | General financial analysis |
| phi-4-mini | 2.2 GB | 8 GB | Good | Structured reasoning on low RAM |
| mistral | 4.1 GB | 16 GB | Great | Balanced quality and speed |
| gemma3:4b | 3.3 GB | 8 GB | Good | Summarization, categorization |
| qwen2.5:7b | 4.4 GB | 16 GB | Great | Multilingual financial data |
| deepseek-r1:8b | 4.9 GB | 16 GB | Great | Complex reasoning, tax analysis |
| RAM | What You Can Run |
| --- | --- |
| 8 GB | 1B–4B models (llama3.2:3b, phi-4-mini, gemma3:4b) |
| 16 GB | 7B–8B models (llama3.1:8b, mistral, qwen2.5:7b) |
| 32 GB+ | 13B+ models, multiple models loaded simultaneously |

Apple Silicon Macs (M1/M2/M3/M4) get significantly better performance due to unified memory and Metal GPU acceleration.

Install Ollama and pull a model. On macOS:

```sh
brew install ollama
ollama pull llama3.2
ollama serve
```

On Linux:

```sh
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.2
ollama serve
```

Verify the model is running:

```sh
ollama list
```
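You can also confirm the model answers end to end by hitting Ollama's REST API directly. A minimal sketch, assuming the default port and the `llama3.2` model pulled above (the `ask` helper is illustrative, not part of Open Accountant):

```python
# Sketch: query the local Ollama server via its /api/generate endpoint.
# Assumes Ollama is serving on the default port; no API key is needed.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def build_generate_request(model: str, prompt: str) -> bytes:
    """Request body for /api/generate; stream=False returns one JSON object."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def ask(model: str, prompt: str) -> str:
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=build_generate_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running server):
# ask("llama3.2", "In one word, categorize this charge: NETFLIX.COM $15.49")
```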

Set your default model in .env:

```sh
DEFAULT_MODEL=ollama:llama3.2
```

Or switch models interactively:

```sh
/model ollama:llama3.2
/model ollama:llama3.1:8b
/model ollama:mistral
```

No API key needed. Ollama runs on http://localhost:11434 by default. Override with:

```sh
OLLAMA_BASE_URL=http://localhost:11434
```
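The override logic is simple: use `OLLAMA_BASE_URL` if set, otherwise fall back to the documented default. A minimal sketch of that resolution (the `resolve_base_url` helper is hypothetical, not Open Accountant's actual code):

```python
# Sketch: resolve the Ollama endpoint from the environment, falling
# back to the documented default. Trailing slashes are stripped so
# paths like "/api/generate" can be appended cleanly.
import os

def resolve_base_url(env=None) -> str:
    env = os.environ if env is None else env
    return env.get("OLLAMA_BASE_URL", "http://localhost:11434").rstrip("/")

print(resolve_base_url({}))  # -> http://localhost:11434
print(resolve_base_url({"OLLAMA_BASE_URL": "http://192.168.1.5:11434/"}))
```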
| Task | Recommended Model | Why |
| --- | --- | --- |
| Transaction categorization | llama3.2:3b | Fast, accurate enough for category matching |
| Spending review | llama3.1:8b | Needs more nuance for actionable insights |
| Tax preparation | llama3.1:8b or larger | Tax rules require stronger reasoning |
| Subscription audit | llama3.2:3b | Pattern matching; doesn't need a large model |
| Complex financial reasoning | deepseek-r1:8b | Built for step-by-step reasoning |
| Budget planning | mistral or qwen2.5:7b | Good balance of speed and quality |
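Categorization works well on small models because the task can be constrained to a fixed label set. A sketch of that pattern (the prompt wording and parser are illustrative assumptions, not Open Accountant's actual prompts):

```python
# Sketch: constrain a small model to a closed category list, then map
# its free-text reply back onto a known category.
CATEGORIES = ["Groceries", "Entertainment", "Transport", "Utilities", "Other"]

def categorization_prompt(description: str, amount: float) -> str:
    """Ask for exactly one label from the closed set."""
    return (
        "Categorize this bank transaction. Reply with exactly one of: "
        + ", ".join(CATEGORIES) + ".\n"
        + f"Transaction: {description} ${amount:.2f}\nCategory:"
    )

def parse_category(reply: str) -> str:
    """Match the model's reply to a known category; fall back to 'Other'."""
    lowered = reply.strip().lower()
    for cat in CATEGORIES:
        if cat.lower() in lowered:
            return cat
    return "Other"

print(parse_category("Entertainment"))      # -> Entertainment
print(parse_category("hard to say..."))     # -> Other
```

Constraining the output this way is what lets a 3B model stay reliable: it only has to pick a label, not compose free-form analysis.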

Start with llama3.2:3b. It handles most daily tasks well. Move to an 8B model if you need deeper analysis or find quality lacking.

For tasks requiring maximum accuracy (year-end tax prep, investment analysis), consider using a cloud provider like claude-sonnet-4-20250514 or gpt-4o via the LLM Providers configuration.