LLM & transcription providers
Gistlist has two provider choices: what summarizes (the LLM) and what transcribes (the ASR). They’re independent — you can mix any LLM with any ASR.
Change either in Settings → Models at any time.
LLM providers
| Provider | Mode | Cost | Notes |
|---|---|---|---|
| Anthropic Claude | Cloud | API credits | Fastest summaries; best at structured outputs. |
| OpenAI | Cloud | API credits | Good general fallback; useful if you want one OpenAI key shared across LLM and ASR. |
| Ollama | Local | Free | Fully offline; slower per section, depends on your machine. |
Individual prompts can override the default model — see per-prompt model overrides in Prompts. Useful when most outputs should be local but one specific prompt needs a stronger model.
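As a rough illustration of how a per-prompt override could resolve against the default, here is a minimal sketch. The `Prompt` class, its `model_override` field, and the `cloud-model-id` value are hypothetical names chosen for this example, not Gistlist's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Prompt:
    """Hypothetical prompt record; Gistlist's real schema may differ."""
    name: str
    model_override: Optional[str] = None  # None means "use the default model"


def resolve_model(prompt: Prompt, default_model: str) -> str:
    """Return the prompt's own model if it sets one, else the default."""
    return prompt.model_override or default_model


# Most prompts run on the local default; one prompt opts into a stronger model.
default_model = "qwen3.5:4b"
summary = Prompt("summary")
action_items = Prompt("action-items", model_override="cloud-model-id")  # placeholder id

print(resolve_model(summary, default_model))
print(resolve_model(action_items, default_model))
```

The point of the design is that the override is optional per prompt, so switching the default model in Settings leaves explicit overrides untouched.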
Ollama (local LLM)
If you pick local mode and don’t already have Ollama installed, the Setup Wizard downloads a pinned, hash-verified Ollama binary into the app’s data directory. No Homebrew or separate install needed.
If you already have Ollama on PATH, Gistlist uses that instead — no second daemon, no duplicate models. Models live in the standard ~/.ollama/models directory regardless.
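The lookup order described above (an existing Ollama on PATH first, otherwise the app-managed binary, accepted only if its hash matches the pinned value) might look roughly like the sketch below. The function names, the `bundled` path argument, and the choice of SHA-256 are assumptions made for illustration; the source does not say which hash Gistlist uses or how its resolver is written.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Callable, Optional


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resolve_ollama(
    bundled: Path,
    expected_sha256: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[Path]:
    """Prefer an Ollama already on PATH; otherwise use the bundled binary
    only if it exists and matches the pinned hash (assumed SHA-256)."""
    on_path = which("ollama")
    if on_path:
        return Path(on_path)
    if bundled.is_file() and sha256_of(bundled) == expected_sha256:
        return bundled
    return None  # caller would trigger the wizard's download step
```

Passing `which` as a parameter is only there to make the lookup easy to exercise without a real Ollama install.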
The Settings → Models picker lists cloud models and a curated set of Ollama models chosen for transcript-style work. Gistlist defaults to a Qwen 3.5 model sized to your machine’s RAM, biased toward smaller models so the app stays responsive. Heavier picks (Gemma 4) are only suggested on 24 GB+ machines that have real headroom after macOS and the app.
Current curated local models:
| Model | Approx. size | Recommended RAM | Notes |
|---|---|---|---|
| qwen3.5:0.8b | 1.0 GB | 4 GB | Ultra-tiny test model; recommended below 8 GB RAM. |
| qwen3.5:2b | 2.7 GB | 4 GB | Fast low-resource option; recommended on 8-15 GB machines. |
| qwen3.5:4b | 3.4 GB | 8 GB | Compact everyday local model; recommended on 16-23 GB machines. |
| qwen3.5:9b | 6.6 GB | 16 GB | Larger Qwen option; recommended on 24 GB+ machines. |
| llama3.1:8b | 4.9 GB | 8 GB | Strong performance/size balance. |
| mistral:7b | 4.4 GB | 8 GB | Efficient general local model. |
| phi3:latest | 2.2 GB | 4 GB | Small Microsoft model. |
| gemma4:e2b | 7.2 GB | 16 GB | Smaller Gemma 4 option. |
| gemma4:e4b | 9.6 GB | 24 GB | Larger Gemma 4; recommended only on 24 GB+ machines. |
Sizes verified against ollama.com/library on 2026-04-28. The “Recommended RAM” column is an app-side heuristic (Ollama itself doesn’t publish a canonical RAM number) and accounts for app overhead alongside the model.
The wizard highlights recommended local models by RAM tier:
| Host RAM | Recommended models |
|---|---|
| Below 8 GB | qwen3.5:0.8b, phi3:latest |
| 8-15 GB | qwen3.5:2b, phi3:latest |
| 16-23 GB | qwen3.5:4b, llama3.1:8b |
| 24 GB or more | qwen3.5:9b, gemma4:e4b |
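The tier table above can be read as a simple threshold lookup. The following is a sketch of that mapping only; the function name `recommended_models` is hypothetical, and the real wizard may weigh other factors besides total RAM.

```python
def recommended_models(ram_gb: float) -> list[str]:
    """Map host RAM (in GB) to the recommended local models from the tier table."""
    if ram_gb < 8:
        return ["qwen3.5:0.8b", "phi3:latest"]
    if ram_gb < 16:
        return ["qwen3.5:2b", "phi3:latest"]
    if ram_gb < 24:
        return ["qwen3.5:4b", "llama3.1:8b"]
    return ["qwen3.5:9b", "gemma4:e4b"]


for ram in (4, 8, 16, 32):
    print(ram, recommended_models(ram))
```

The boundaries are treated as inclusive on the lower end, so a machine with exactly 16 GB falls in the 16-23 GB tier.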
nomic-embed-text is not a chat model. Gistlist installs and manages it for meeting-index semantic search, but it should not appear in the default text-analysis model picker.
The Custom… option lets you type any tag from ollama.com/library, so you’re not locked into the curated list.
Speed. On a 16 GB Apple Silicon machine, a 30-minute meeting summary takes 30 seconds to 2 minutes per prompt with a 7B–9B model. The Meeting Detail page shows a live spinner and elapsed-seconds counter for each running section.
Cloud keys. In local-only mode, leave the Claude and OpenAI fields blank. Add either later if you want a single prompt to use a cloud model.
Transcription providers
Transcription runs separately from the LLM.
| Provider | Mode | Notes |
|---|---|---|
| Parakeet | Local (Apple Silicon, MLX) | Fast, accurate, fully offline. Default on Apple Silicon. |
| OpenAI Whisper | Cloud | Quick to set up, costs API credits. Same API key as the OpenAI LLM. |
whisper.cpp / whisper-local may appear only if an older config already selected it. It is not offered as a normal option in the current signed macOS build.
Parakeet
Parakeet installs into a Python venv at ~/.gistlist/parakeet-venv during the Setup Wizard: one click, no Terminal, no Homebrew, no system Python. The wizard installs an app-managed CPython 3.12 runtime at ~/Library/Application Support/Gistlist/bin/python-runtime/, ffmpeg if it’s missing, then mlx-audio into the venv, then runs a transcription smoke test that downloads ~600 MB of model weights to the Hugging Face cache. About 1 GB total disk usage on first install.
Apple Silicon only. MLX (and therefore mlx-audio) does not support Intel Macs. The wizard hides the Parakeet option entirely on x86_64 hosts; Settings does the same. Intel users go to OpenAI Whisper.
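The visibility rules for transcription providers (Parakeet only on Apple Silicon, the legacy whisper-local entry only when an older config already selected it) could be expressed along these lines. This is a sketch under assumptions: the function name and the provider identifiers `parakeet` and `openai-whisper` are invented for the example, and the app may detect the host architecture differently.

```python
import platform
from typing import Optional


def available_asr_providers(
    system: str,
    machine: str,
    configured: Optional[str] = None,
) -> list[str]:
    """List the ASR providers to show, given the host and the saved config."""
    providers: list[str] = []
    # MLX needs Apple Silicon, so Parakeet is hidden on Intel (x86_64) hosts.
    if system == "Darwin" and machine == "arm64":
        providers.append("parakeet")
    providers.append("openai-whisper")
    # The legacy local Whisper option is kept only for configs that already use it.
    if configured == "whisper-local":
        providers.append("whisper-local")
    return providers


# platform.machine() reports "arm64" for a native Apple Silicon process.
print(available_asr_providers(platform.system(), platform.machine()))
```

Note that a Python process running under Rosetta reports `x86_64`, so a check like this would hide Parakeet there as well.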
Switch to OpenAI cloud transcription in Settings → Models if you’d rather not have a local venv.