# Small Language Models
## Why Small Language Models

Open Accountant runs on small language models (SLMs) — AI models compact enough to run on your laptop. No cloud API. No subscriptions. No data leaving your machine.
- **Privacy.** Your bank transactions never touch a third-party server. The model runs locally.
- **Speed.** No network round-trip. Responses come as fast as your hardware allows.
- **Cost.** Zero API fees. Run as many queries as you want for free.
- **Offline.** Works without an internet connection. Categorize transactions on a plane.
## What’s a Small Language Model

SLMs are language models with roughly 1–13 billion parameters — small enough to run on consumer hardware but capable enough for structured tasks like transaction categorization, spending analysis, and financial summaries.
Larger models (70B+) need server-grade GPUs. SLMs run on a MacBook or a desktop with 8–32 GB of RAM.
## Supported Models

| Model | Size | RAM Needed | Quality | Best For |
|---|---|---|---|---|
| llama3.2:3b | 2.0 GB | 8 GB | Good | Categorization, quick queries |
| llama3.2:1b | 1.3 GB | 8 GB | Fair | Low-resource machines, simple tasks |
| llama3.1:8b | 4.7 GB | 16 GB | Great | General financial analysis |
| phi-4-mini | 2.2 GB | 8 GB | Good | Structured reasoning on low RAM |
| mistral | 4.1 GB | 16 GB | Great | Balanced quality and speed |
| gemma3:4b | 3.3 GB | 8 GB | Good | Summarization, categorization |
| qwen2.5:7b | 4.4 GB | 16 GB | Great | Multilingual financial data |
| deepseek-r1:8b | 4.9 GB | 16 GB | Great | Complex reasoning, tax analysis |
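A rough sizing rule falls out of the table: these 4-bit-quantized builds weigh roughly 0.6 GB per billion parameters. A minimal sketch of that rule of thumb (the 0.6 factor is an approximation read off the sizes above, not an Ollama guarantee):

```sh
#!/bin/sh
# Estimate the download size of a 4-bit-quantized model from its
# parameter count. The 0.6 GB-per-billion factor is approximate:
# llama3.1:8b is 4.7 GB (~0.59 GB/B), llama3.2:3b is 2.0 GB (~0.67 GB/B).
estimate_size_gb() {
  awk -v b="$1" 'BEGIN { printf "%.1f", b * 0.6 }'
}

echo "8B model: ~$(estimate_size_gb 8) GB"   # ~4.8 GB vs. 4.7 GB actual
echo "3B model: ~$(estimate_size_gb 3) GB"   # ~1.8 GB vs. 2.0 GB actual
```

Plan for the file size plus a few GB of working memory, which is why the table pairs the ~4.7 GB models with 16 GB of RAM rather than 8.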
## Hardware Requirements

| RAM | What You Can Run |
|---|---|
| 8 GB | 1B–4B models (llama3.2:3b, phi-4-mini, gemma3:4b) |
| 16 GB | 7B–8B models (llama3.1:8b, mistral, qwen2.5:7b) |
| 32 GB+ | 13B+ models, multiple models loaded simultaneously |
Apple Silicon Macs (M1/M2/M3/M4) get significantly better performance due to unified memory and Metal GPU acceleration.
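To see which tier your machine falls in, you can query installed RAM from the shell. A small sketch, using the macOS `hw.memsize` sysctl and Linux `/proc/meminfo` (the `tier_for_ram` helper is illustrative, not part of Open Accountant):

```sh
#!/bin/sh
# Map installed RAM to the model tiers in the table above.
tier_for_ram() {
  if [ "$1" -ge 32 ]; then echo "13B+ models, multiple models loaded"
  elif [ "$1" -ge 16 ]; then echo "7B-8B models (llama3.1:8b, mistral)"
  else echo "1B-4B models (llama3.2:3b, gemma3:4b)"
  fi
}

# Detect total RAM in GB: sysctl on macOS, /proc/meminfo on Linux.
# MemTotal excludes kernel-reserved memory, so round to the nearest GB.
if [ "$(uname)" = "Darwin" ]; then
  ram_gb=$(( $(sysctl -n hw.memsize) / 1073741824 ))
else
  ram_gb=$(( ( $(awk '/MemTotal/ { print $2 }' /proc/meminfo) + 524288 ) / 1048576 ))
fi

echo "${ram_gb} GB RAM: $(tier_for_ram "$ram_gb")"
```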
## Getting Started

Install Ollama and pull a model:

```sh
brew install ollama
ollama pull llama3.2
ollama serve
```

On Linux:

```sh
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.2
ollama serve
```

Verify the model is running:

```sh
ollama list
```
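The `ollama serve` daemon also exposes a REST API on port 11434, which is useful for smoke-testing a model outside the app. A sketch (the `/api/generate` endpoint and `stream` flag are standard Ollama API; the `payload` helper and the categorization prompt are just illustrations):

```sh
#!/bin/sh
# Build a JSON request for Ollama's /api/generate endpoint.
# "stream": false returns one JSON object instead of a token stream.
payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

payload llama3.2 "Categorize this transaction: STARBUCKS #1234 6.40"
echo
# With the daemon running, send it:
#   payload llama3.2 "Categorize: ..." | curl -s http://localhost:11434/api/generate -d @-
```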
## Configuring Open Accountant

Set your default model in `.env`:

```sh
DEFAULT_MODEL=ollama:llama3.2
```

Or switch models interactively:

```
/model ollama:llama3.2
/model ollama:llama3.1:8b
/model ollama:mistral
```

No API key needed. Ollama runs on http://localhost:11434 by default. Override with:

```sh
OLLAMA_BASE_URL=http://localhost:11434
```
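One way to confirm which endpoint an override resolves to is Ollama's `/api/tags` route, which lists pulled models. A small sketch (`ollama_url` is a hypothetical helper for illustration, not part of Open Accountant):

```sh
#!/bin/sh
# Resolve the Ollama endpoint, honoring OLLAMA_BASE_URL when set.
ollama_url() {
  echo "${OLLAMA_BASE_URL:-http://localhost:11434}$1"
}

ollama_url /api/tags
# With the daemon running, list pulled models as JSON:
#   curl -s "$(ollama_url /api/tags)"
```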
## Which Model Should I Use?

| Task | Recommended Model | Why |
|---|---|---|
| Transaction categorization | llama3.2:3b | Fast, accurate enough for category matching |
| Spending review | llama3.1:8b | Needs more nuance for actionable insights |
| Tax preparation | llama3.1:8b or larger | Tax rules require stronger reasoning |
| Subscription audit | llama3.2:3b | Pattern matching, doesn’t need large model |
| Complex financial reasoning | deepseek-r1:8b | Built for step-by-step reasoning |
| Budget planning | mistral or qwen2.5:7b | Good balance of speed and quality |
Start with llama3.2:3b. It handles most daily tasks well. Move to an 8B model if you need deeper analysis or find quality lacking.
For tasks requiring maximum accuracy (year-end tax prep, investment analysis), consider a cloud model such as claude-sonnet-4-20250514 or gpt-4o via the LLM Providers configuration.
## See Also

- Transformers.js Local Inference — Run ONNX models in-process with zero configuration, no Ollama required