
Small Language Models

Open Accountant runs on small language models (SLMs) — AI models compact enough to run on your laptop. No cloud API. No subscriptions. No data leaving your machine.

  • Privacy. Your bank transactions never touch a third-party server. The model runs locally.
  • Speed. No network round-trip. Responses come as fast as your hardware allows.
  • Cost. Zero API fees. Run as many queries as you want for free.
  • Offline. Works without an internet connection. Categorize transactions on a plane.

SLMs are language models with roughly 1–13 billion parameters — small enough to run on consumer hardware but capable enough for structured tasks like transaction categorization, spending analysis, and financial summaries.

Larger models (70B+) need server-grade GPUs. SLMs run on a MacBook or a desktop with 8–32 GB of RAM.
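The RAM figures above follow a simple rule of thumb: a 4-bit quantized model needs roughly half a byte per parameter for weights, plus overhead for the KV cache and runtime. A rough sketch (illustrative estimate only, not Ollama's exact memory accounting; the constants are assumptions):

```python
# Rough memory estimate for a locally run quantized model.
# bits_per_param ~4.5 approximates common Q4 quantizations;
# overhead_gb covers KV cache and runtime (both are assumptions).
def estimated_ram_gb(params_billion: float,
                     bits_per_param: float = 4.5,
                     overhead_gb: float = 1.5) -> float:
    weights_gb = params_billion * 1e9 * bits_per_param / 8 / 1e9
    return round(weights_gb + overhead_gb, 1)

print(estimated_ram_gb(3))   # ~3.2 GB -> fits comfortably in 8 GB
print(estimated_ram_gb(8))   # ~6.0 GB -> wants a 16 GB machine
```

This is why 3B models land in the 8 GB tier and 8B models in the 16 GB tier.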

| Model | Size | RAM Needed | Quality | Best For |
| --- | --- | --- | --- | --- |
| llama3.2:3b | 2.0 GB | 8 GB | Good | Categorization, quick queries |
| llama3.2:1b | 1.3 GB | 8 GB | Fair | Low-resource machines, simple tasks |
| llama3.1:8b | 4.7 GB | 16 GB | Great | General financial analysis |
| phi-4-mini | 2.2 GB | 8 GB | Good | Structured reasoning on low RAM |
| mistral | 4.1 GB | 16 GB | Great | Balanced quality and speed |
| gemma3:4b | 3.3 GB | 8 GB | Good | Summarization, categorization |
| qwen2.5:7b | 4.4 GB | 16 GB | Great | Multilingual financial data |
| deepseek-r1:8b | 4.9 GB | 16 GB | Great | Complex reasoning, tax analysis |
| RAM | What You Can Run |
| --- | --- |
| 8 GB | 1B–4B models (llama3.2:3b, phi-4-mini, gemma3:4b) |
| 16 GB | 7B–8B models (llama3.1:8b, mistral, qwen2.5:7b) |
| 32 GB+ | 13B+ models, multiple models loaded simultaneously |

Apple Silicon Macs (M1/M2/M3/M4) get significantly better performance due to unified memory and Metal GPU acceleration.

Install Ollama and pull a model. On macOS:

```sh
brew install ollama
ollama pull llama3.2
ollama serve
```

On Linux:

```sh
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.2
ollama serve
```

Verify the model is running:

```sh
ollama list
```
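You can also confirm the model answers end to end by hitting Ollama's REST API directly. A minimal sketch, assuming the default port and the `llama3.2` model pulled above (the `ask` helper is illustrative, not part of Open Accountant):

```python
# Sketch: query the local Ollama server via its /api/generate endpoint.
# Assumes Ollama is serving on the default port; no API key is needed.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def build_generate_request(model: str, prompt: str) -> bytes:
    """Request body for /api/generate; stream=False returns one JSON object."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def ask(model: str, prompt: str) -> str:
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=build_generate_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running server):
# ask("llama3.2", "In one word, categorize this charge: NETFLIX.COM $15.49")
```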

Set your default model in .env:

```sh
DEFAULT_MODEL=ollama:llama3.2
```

Or switch models interactively:

```sh
/model ollama:llama3.2
/model ollama:llama3.1:8b
/model ollama:mistral
```

No API key needed. Ollama runs on http://localhost:11434 by default. Override with:

```sh
OLLAMA_BASE_URL=http://localhost:11434
```
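The override logic is simple: use `OLLAMA_BASE_URL` if set, otherwise fall back to the documented default. A minimal sketch of that resolution (the `resolve_base_url` helper is hypothetical, not Open Accountant's actual code):

```python
# Sketch: resolve the Ollama endpoint from the environment, falling
# back to the documented default. Trailing slashes are stripped so
# paths like "/api/generate" can be appended cleanly.
import os

def resolve_base_url(env=None) -> str:
    env = os.environ if env is None else env
    return env.get("OLLAMA_BASE_URL", "http://localhost:11434").rstrip("/")

print(resolve_base_url({}))  # -> http://localhost:11434
print(resolve_base_url({"OLLAMA_BASE_URL": "http://192.168.1.5:11434/"}))
```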
| Task | Recommended Model | Why |
| --- | --- | --- |
| Transaction categorization | llama3.2:3b | Fast, accurate enough for category matching |
| Spending review | llama3.1:8b | Needs more nuance for actionable insights |
| Tax preparation | llama3.1:8b or larger | Tax rules require stronger reasoning |
| Subscription audit | llama3.2:3b | Pattern matching; doesn't need a large model |
| Complex financial reasoning | deepseek-r1:8b | Built for step-by-step reasoning |
| Budget planning | mistral or qwen2.5:7b | Good balance of speed and quality |
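Categorization works well on small models because the task can be constrained to a fixed label set. A sketch of that pattern (the prompt wording and parser are illustrative assumptions, not Open Accountant's actual prompts):

```python
# Sketch: constrain a small model to a closed category list, then map
# its free-text reply back onto a known category.
CATEGORIES = ["Groceries", "Entertainment", "Transport", "Utilities", "Other"]

def categorization_prompt(description: str, amount: float) -> str:
    """Ask for exactly one label from the closed set."""
    return (
        "Categorize this bank transaction. Reply with exactly one of: "
        + ", ".join(CATEGORIES) + ".\n"
        + f"Transaction: {description} ${amount:.2f}\nCategory:"
    )

def parse_category(reply: str) -> str:
    """Match the model's reply to a known category; fall back to 'Other'."""
    lowered = reply.strip().lower()
    for cat in CATEGORIES:
        if cat.lower() in lowered:
            return cat
    return "Other"

print(parse_category("Entertainment"))      # -> Entertainment
print(parse_category("hard to say..."))     # -> Other
```

Constraining the output this way is what lets a 3B model stay reliable: it only has to pick a label, not compose free-form analysis.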

Start with llama3.2:3b. It handles most daily tasks well. Move to an 8B model if you need deeper analysis or find quality lacking.

For tasks requiring maximum accuracy (year-end tax prep, investment analysis), consider using a cloud provider like claude-sonnet-4-20250514 or gpt-4o via the LLM Providers configuration.