Choose a speech-to-text backend

Speech-to-text (STT) is how ORBIS turns your voice into text. Pick the backend that fits your priorities in Settings → Voice → Speech to text. For the full list of keys, see the Voice reference.

Which one?

| If you want… | Use | Notes |
| --- | --- | --- |
| Private, offline, zero setup | Local (Whisper) | The default. Runs on-device (Apple-Silicon MPS). Transcribes after you stop talking. |
| Faster + fewer false transcripts | Parakeet | Best local quality/speed, far fewer silence-hallucinations. Needs the [parakeet] extra and a ~600 MB model on first use. |
| A hosted transcriber | OpenAI | An OpenAI-compatible Whisper endpoint; needs a key. |
| protoLabs-hosted | protoLabs | faster-whisper on the gateway, same key as the LLM. |

When in doubt, stay with the Local default: it is private and needs nothing extra. Move to Parakeet if Whisper feels slow or transcribes silence into stray words.
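
The guidance above can be read as a small decision rule. The following is only an illustrative sketch: the `Backend` enum, the `suggest_backend` function, and its parameter names are hypothetical and are not part of ORBIS.

```python
from enum import Enum


class Backend(Enum):
    # Values mirror the four options in the table; the enum itself is hypothetical.
    LOCAL = "Local (Whisper)"
    PARAKEET = "Parakeet"
    OPENAI = "OpenAI"
    PROTOLABS = "protoLabs"


def suggest_backend(
    want_hosted: bool = False,
    use_protolabs_gateway: bool = False,
    whisper_too_slow: bool = False,
    silence_becomes_words: bool = False,
) -> Backend:
    """Return the backend the table would point to (illustrative only)."""
    if want_hosted:
        # Hosted options: the protoLabs gateway or an OpenAI-compatible endpoint.
        return Backend.PROTOLABS if use_protolabs_gateway else Backend.OPENAI
    if whisper_too_slow or silence_becomes_words:
        # Parakeet is the suggested move when local Whisper is slow
        # or turns silence into stray words.
        return Backend.PARAKEET
    # Default: private, offline, nothing extra to install.
    return Backend.LOCAL
```

Calling `suggest_backend()` with no arguments returns `Backend.LOCAL`, which matches the "when in doubt" advice.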

Switch backends

  1. Open Settings → Voice → Speech to text.
  2. Pick the backend.
  3. For a hosted backend (OpenAI or protoLabs), set the URL, model, and key.
  4. Apply the change. Parakeet needs a one-time install of the [parakeet] extra and a restart to load its model. Hosted backends take effect on the next conversation.
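
As a rough checklist, the steps above could be expressed as the sketch below. It assumes a hypothetical `SttChoice` record and `pending_actions` helper. Neither is an ORBIS API, and the field names are made up for illustration. The source does not say when a switch back to Local takes effect, so the sketch returns no actions for it.

```python
from dataclasses import dataclass

# Backend labels as they appear in the table; grouping them here is an assumption.
HOSTED = {"OpenAI", "protoLabs"}


@dataclass
class SttChoice:
    """Hypothetical snapshot of the Speech to text settings."""
    backend: str
    url: str = ""
    model: str = ""
    key: str = ""  # for protoLabs this is the same key as the LLM
    parakeet_extra_installed: bool = False


def pending_actions(choice: SttChoice) -> list[str]:
    """List what is still needed before the chosen backend is usable."""
    actions: list[str] = []
    if choice.backend in HOSTED:
        # Step 3: hosted backends need a URL, a model, and a key.
        for name in ("url", "model", "key"):
            if not getattr(choice, name):
                actions.append(f"set the {name}")
        # Hosted backends take effect on the next conversation.
        actions.append("start a new conversation")
    elif choice.backend == "Parakeet":
        # Step 4: one-time install of the extra, then a restart to load the model.
        if not choice.parakeet_extra_installed:
            actions.append("install the [parakeet] extra")
        actions.append("restart ORBIS to load the Parakeet model")
    return actions
```

For example, `pending_actions(SttChoice("Parakeet"))` lists the extra install followed by the restart, in that order.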

Confirm it

Speak and watch the transcript and the mic level meter. If audio reaches ORBIS (the meter moves) but no text appears, see Voice isn't working.

See also