LLM & transcription providers

Gistlist has two provider choices: what summarizes (the LLM) and what transcribes (the ASR). They’re independent — you can mix any LLM with any ASR.

Change either in Settings → Models at any time.

LLM providers

Provider	Mode	Cost	Notes
Anthropic Claude	Cloud	API credits	Fastest summaries; best at structured outputs.
OpenAI	Cloud	API credits	Good general fallback; lets one OpenAI key cover both LLM and ASR.
Ollama	Local	Free	Fully offline; slower per section, and speed depends on your machine.

Individual prompts can override the default model — see per-prompt model overrides in Prompts. Useful when most outputs should be local but one specific prompt needs a stronger model.
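As a rough illustration of how an override falls back to the default, here is a minimal sketch. The prompt names, the `PROMPT_OVERRIDES` table, and the `resolve_model` helper are hypothetical and are not Gistlist's actual config format; the cloud model name is a placeholder.

```python
from typing import Dict, Optional

# Hypothetical default; in the app this would come from Settings → Models.
DEFAULT_MODEL = "qwen3.5:4b"

# Hypothetical prompt table: None means "use the default model".
PROMPT_OVERRIDES: Dict[str, Optional[str]] = {
    "summary": None,
    "action-items": None,
    "risk-review": "some-cloud-model",  # placeholder, not a real model ID
}


def resolve_model(
    prompt_name: str,
    overrides: Dict[str, Optional[str]] = PROMPT_OVERRIDES,
    default: str = DEFAULT_MODEL,
) -> str:
    """Return the prompt's own model if one is set, otherwise the default."""
    return overrides.get(prompt_name) or default


for name in PROMPT_OVERRIDES:
    print(name, "->", resolve_model(name))
```

With this shape, most prompts stay on the local default and only `risk-review` is routed to the stronger model.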

Ollama (local LLM)

If you pick local mode and don’t already have Ollama installed, the Setup Wizard downloads a pinned, hash-verified Ollama binary into the app’s data directory. No Homebrew or separate install needed.
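The text above only says the binary is "pinned, hash-verified". As a sketch of what such a check usually looks like, the example below compares a file's SHA-256 digest against a pinned value. The choice of SHA-256 and the helper names are assumptions, not confirmed details of the wizard.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large binaries are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, pinned_sha256: str) -> bool:
    """True only if the downloaded file matches the pinned digest (assumed SHA-256)."""
    return sha256_of(path) == pinned_sha256.strip().lower()
```

A caller would typically delete the file and abort the install when `verify_download` returns `False`, rather than run an unverified binary.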

If you already have Ollama on PATH, Gistlist uses that instead — no second daemon, no duplicate models. Models live in the standard ~/.ollama/models directory regardless.
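The lookup order described above (an Ollama on PATH first, the app-managed copy second) could be sketched as follows. `BUNDLED_OLLAMA` is a hypothetical location inside the app's data directory, and `resolve_ollama` is an illustrative helper, not Gistlist's code.

```python
import shutil
from pathlib import Path
from typing import Optional

# Hypothetical path; the exact location inside the app's data directory is an assumption.
BUNDLED_OLLAMA = (
    Path.home() / "Library" / "Application Support" / "Gistlist" / "bin" / "ollama"
)


def resolve_ollama(
    bundled: Path = BUNDLED_OLLAMA, path_env: Optional[str] = None
) -> Optional[Path]:
    """Prefer an Ollama already on PATH; fall back to the app-managed binary."""
    found = shutil.which("ollama", path=path_env)  # path=None means use $PATH
    if found:
        return Path(found)
    if bundled.is_file():
        return bundled
    return None  # neither is present, so the wizard would need to download one
```

Checking PATH first is what avoids a second daemon and duplicate model downloads when a system install already exists.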

The Settings → Models picker lists cloud models and a curated set of Ollama models chosen for transcript-style work. Gistlist defaults to a Qwen 3.5 model sized to your machine’s RAM, biased toward smaller models so the app stays responsive. Heavier picks (Gemma 4) are only suggested on 24 GB+ machines, which have real headroom after macOS and the app.

Current curated local models:

Model	Approx. size	Recommended RAM	Notes
`qwen3.5:0.8b`	1.0 GB	4 GB	Ultra-tiny test model; recommended below 8 GB RAM.
`qwen3.5:2b`	2.7 GB	4 GB	Fast low-resource option; recommended on 8-15 GB machines.
`qwen3.5:4b`	3.4 GB	8 GB	Compact everyday local model; recommended on 16-23 GB machines.
`qwen3.5:9b`	6.6 GB	16 GB	Larger Qwen option; recommended on 24 GB+ machines.
`llama3.1:8b`	4.9 GB	8 GB	Strong performance/size balance.
`mistral:7b`	4.4 GB	8 GB	Efficient general local model.
`phi3:latest`	2.2 GB	4 GB	Small Microsoft model.
`gemma4:e2b`	7.2 GB	16 GB	Smaller Gemma 4 option.
`gemma4:e4b`	9.6 GB	24 GB	Larger Gemma 4 — recommended only on 24 GB+ machines.

Sizes verified against ollama.com/library on 2026-04-28. The “Recommended RAM” column is an app-side heuristic (Ollama itself doesn’t publish a canonical RAM number) and accounts for app overhead alongside the model.

The wizard highlights recommended local models by RAM tier:

Host RAM	Recommended models
Below 8 GB	`qwen3.5:0.8b`, `phi3:latest`
8-15 GB	`qwen3.5:2b`, `phi3:latest`
16-23 GB	`qwen3.5:4b`, `llama3.1:8b`
24 GB or more	`qwen3.5:9b`, `gemma4:e4b`
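The tier table above maps directly onto a small lookup. The sketch below encodes the same boundaries; the function name is illustrative, and how the app actually measures host RAM is not covered here.

```python
from typing import List


def recommended_models(ram_gb: float) -> List[str]:
    """Return the wizard's highlighted models for a given amount of host RAM."""
    if ram_gb < 8:
        return ["qwen3.5:0.8b", "phi3:latest"]
    if ram_gb < 16:
        return ["qwen3.5:2b", "phi3:latest"]
    if ram_gb < 24:
        return ["qwen3.5:4b", "llama3.1:8b"]
    return ["qwen3.5:9b", "gemma4:e4b"]


for ram in (4, 8, 16, 32):
    print(f"{ram} GB:", ", ".join(recommended_models(ram)))
```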

nomic-embed-text is not a chat model. Gistlist installs and manages it for meeting-index semantic search, but it should not appear in the default text-analysis model picker.
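One way to keep embedding models out of a picker is to filter the installed-model list by base name. The sketch below works on a dictionary shaped like the response of Ollama's `GET /api/tags` endpoint (`{"models": [{"name": ...}]}`). Whether Gistlist uses that endpoint, and the `NON_CHAT_MODELS` set itself, are assumptions.

```python
from typing import Any, Dict, List

# Assumed exclusion list: models managed by the app that are not chat models.
NON_CHAT_MODELS = {"nomic-embed-text"}


def chat_model_tags(tags_response: Dict[str, Any]) -> List[str]:
    """Drop non-chat models, matching on the name before the ':' tag suffix."""
    names = [m["name"] for m in tags_response.get("models", [])]
    return [n for n in names if n.split(":", 1)[0] not in NON_CHAT_MODELS]


sample = {
    "models": [
        {"name": "qwen3.5:4b"},
        {"name": "nomic-embed-text:latest"},
        {"name": "llama3.1:8b"},
    ]
}
print(chat_model_tags(sample))
```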

The Custom… option lets you type any tag from ollama.com/library, so you’re not locked into the curated list.

Speed. On a 16 GB Apple Silicon machine, a 30-minute meeting summary takes 30 seconds to 2 minutes per prompt with a 7B–9B model. The Meeting Detail page shows a live spinner and elapsed-seconds counter for each running section.

Cloud keys. In local-only mode, leave the Claude and OpenAI fields blank. Add either later if you want a single prompt to use a cloud model.

Transcription providers

Transcription runs separately from the LLM.

Provider	Mode	Notes
Parakeet	Local (Apple Silicon, MLX)	Fast, accurate, fully offline. Default on Apple Silicon.
OpenAI Whisper	Cloud	Quick to set up, costs API credits. Same API key as the OpenAI LLM.

whisper.cpp / whisper-local may appear only if an older config already selected it. It is not offered as a normal option in the current signed macOS build.

Parakeet

Parakeet installs into a Python venv at ~/.gistlist/parakeet-venv during the Setup Wizard: one click, no Terminal, no Homebrew, no system Python. The wizard installs, in order, an app-managed CPython 3.12 runtime at ~/Library/Application Support/Gistlist/bin/python-runtime/, ffmpeg if it’s missing, and mlx-audio into the venv. It then runs a transcription smoke test that downloads ~600 MB of model weights to the Hugging Face cache. Expect about 1 GB of total disk usage on first install.
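To make the install order concrete, the sketch below builds the list of steps without running anything. The two directory paths come from the text above; the `bin/python3` and `bin/python` layouts, the use of `python -m venv`, and the `Step` type are assumptions. Steps whose real commands are not documented carry no `argv`.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional

RUNTIME_DIR = (
    Path.home() / "Library" / "Application Support" / "Gistlist" / "bin" / "python-runtime"
)
VENV_DIR = Path.home() / ".gistlist" / "parakeet-venv"


@dataclass
class Step:
    name: str
    argv: Optional[List[str]] = None  # None: the wizard's actual command is not documented


def parakeet_install_plan(ffmpeg_present: bool) -> List[Step]:
    """Return the install steps in order; nothing is executed here."""
    runtime_python = str(RUNTIME_DIR / "bin" / "python3")  # assumed runtime layout
    venv_python = str(VENV_DIR / "bin" / "python")  # standard venv layout on macOS
    steps = [Step("install app-managed CPython 3.12 runtime")]
    if not ffmpeg_present:
        steps.append(Step("install ffmpeg"))
    steps.append(Step("create venv", [runtime_python, "-m", "venv", str(VENV_DIR)]))
    steps.append(
        Step("install mlx-audio", [venv_python, "-m", "pip", "install", "mlx-audio"])
    )
    steps.append(Step("run transcription smoke test (downloads ~600 MB of weights)"))
    return steps


for step in parakeet_install_plan(ffmpeg_present=False):
    print(step.name)
```

A caller could derive `ffmpeg_present` from `shutil.which("ffmpeg") is not None`, though how the wizard itself detects ffmpeg is not stated.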

Apple Silicon only. MLX (and therefore mlx-audio) does not support Intel Macs. The wizard hides the Parakeet option entirely on x86_64 hosts; Settings does the same. Intel users should use OpenAI Whisper instead.
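A host check along these lines could decide which transcription options to show. This is a sketch only: the provider identifiers `"parakeet"` and `"openai-whisper"` are hypothetical, and the app's real detection logic is not documented here. Note that `platform.machine()` reports `x86_64` when Python itself runs under Rosetta, so a production check would need to account for that.

```python
import platform
import sys
from typing import List, Optional


def parakeet_supported(
    system: Optional[str] = None, machine: Optional[str] = None
) -> bool:
    """True on macOS running natively on Apple Silicon (arm64)."""
    system = sys.platform if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "darwin" and machine == "arm64"


def asr_options(system: Optional[str] = None, machine: Optional[str] = None) -> List[str]:
    """List providers in display order; the first entry is the default."""
    if parakeet_supported(system, machine):
        return ["parakeet", "openai-whisper"]  # hypothetical provider IDs
    return ["openai-whisper"]


print(asr_options())
```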

Switch to OpenAI cloud transcription in Settings → Models if you’d rather not have a local venv.