Voice

Trigger workflows with voice input: the block records audio, transcribes it with a speech-to-text model, and surfaces relevant knowledge base context.

The Voice Trigger block lets users start a workflow by speaking. It records audio from the browser, transcribes it using your configured speech-to-text (STT) provider, and optionally retrieves relevant context from a knowledge base before the workflow continues.

Configuration

Field	Required	Description
STT Provider	Yes	The speech-to-text integration to use for transcription. Must be an installed STT-capable integration.
Model	Yes	The specific model offered by the selected provider. Options are populated after you pick a provider.
Language	No	Force a specific transcription language. Defaults to auto-detect.
Knowledge Base	No	A knowledge base to search using the transcript. Relevant items are returned as structured context.
Max KB Results	No	Maximum number of knowledge base items to surface. Defaults to `5`.
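
As a rough illustration of how the required fields and defaults in the table fit together, the sketch below models them as a Python dataclass. The class name, attribute names, and the placeholder provider and model strings are hypothetical; the platform's actual configuration format is not described here. Only the required/optional split and the default of `5` come from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VoiceTriggerConfig:
    """Hypothetical mirror of the Configuration table, for illustration only."""

    stt_provider: str                     # required: an installed STT-capable integration
    model: str                            # required: a model offered by that provider
    language: Optional[str] = None        # None stands in for auto-detect
    knowledge_base: Optional[str] = None  # None means no knowledge base search
    max_kb_results: int = 5               # default stated in the table

    def __post_init__(self) -> None:
        # Reject empty values for the two fields marked as required.
        if not self.stt_provider or not self.model:
            raise ValueError("STT Provider and Model are required")


# Placeholder names, not real providers or models.
config = VoiceTriggerConfig(stt_provider="example-stt", model="example-model")
print(config.language, config.max_kb_results)
```

Leaving `language` and `knowledge_base` unset corresponds to auto-detected language and no knowledge base retrieval.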

Outputs

Once transcription and (optional) KB retrieval complete, the following variables are available in downstream blocks:

Variable	Type	Description
`<voice1.transcript>`	string	The transcribed text from the audio recording
`<voice1.knowledgeBaseItems>`	JSON array	Matched knowledge base items (empty array if no KB is configured)
`<voice1.knowledgeBaseContext>`	string	Matched KB items formatted as a single context string, ready to pass to an LLM
`<voice1.audioUrl>`	string	Presigned URL to download the original audio file
`<voice1.language>`	string	Detected or configured language code (e.g. `en`, `fr`)
`<voice1.duration>`	number	Audio duration in seconds
`<voice1.notes>`	string	Session notes taken in the recording studio (absent when none were taken)

Replace `voice1` with the name you assigned to the block.
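
To make the `<blockName.field>` naming convention concrete, here is a minimal sketch of how such references could be substituted into a prompt template. The function `resolve_references` and the `outputs` mapping are hypothetical, and leaving unknown references untouched is an assumption. The platform performs its own resolution, which may behave differently.

```python
import re
from typing import Any, Mapping

# Matches references of the form <blockName.fieldName>.
REFERENCE = re.compile(r"<([A-Za-z_][\w-]*)\.([A-Za-z_]\w*)>")


def resolve_references(template: str, outputs: Mapping[str, Mapping[str, Any]]) -> str:
    """Replace each <block.field> reference with the matching output value."""

    def substitute(match: "re.Match[str]") -> str:
        block, field = match.group(1), match.group(2)
        try:
            return str(outputs[block][field])
        except KeyError:
            # Assumption: unknown or absent fields are left as written.
            return match.group(0)

    return REFERENCE.sub(substitute, template)


# Example values are made up for illustration.
outputs = {"voice1": {"transcript": "Order two units", "language": "en"}}
print(resolve_references("(<voice1.language>) <voice1.transcript>", outputs))
```

If the block were renamed to `intake`, the same template would reference `<intake.transcript>` instead.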

Deployment

The Voice Trigger generates a shareable Trigger Link (format: `/v/{id}`) once the workflow is deployed. Share that link with users: opening it in a browser presents the voice recording interface, and submitting audio starts a workflow run.
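
The link format can be sketched as a small helper. The base URL and the ID below are placeholders, and `build_trigger_link` is a hypothetical name. Only the `/v/{id}` path shape comes from the description above.

```python
from urllib.parse import quote, urljoin


def build_trigger_link(base_url: str, trigger_id: str) -> str:
    """Join a workspace base URL with the /v/{id} path."""
    # Percent-encode the ID so it stays within a single path segment.
    return urljoin(base_url, "/v/" + quote(trigger_id, safe=""))


# Placeholder host and ID, for illustration only.
print(build_trigger_link("https://app.example.com/", "abc123"))
```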

The recorder is a full recording studio built for real meetings: a single workspace covers preparation, live capture, and review, with the controls on a console bar. Recording can be paused and resumed, the screen stays awake while capture is live, and closing the tab mid-recording asks for confirmation. A live transcript scrolls beside a session notes pad; notes are passed into the workflow as `<voice1.notes>` and are saved to the process instance's knowledge base. Focus mode reduces the view to the transcript alone for on-site recording. After stopping, you can correct the transcript before sending it; the edited text is what the workflow receives. Recordings of up to four hours are supported.

Process Flow recordings and semi-live questions

When a Voice Trigger is bound to a Process Flow recording action, the process instance determines the destination knowledge base; the recorder cannot choose or override it. While recording, the platform saves cumulative transcript snapshots to one draft page at a bounded cadence (no faster than every five seconds). The knowledge view and Shared action conversation remain available alongside the recorder, so authorized process participants can ask semi-live questions against the same persisted draft and the instance's governed knowledge.
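
The bounded cadence described above can be pictured as a simple throttle: each new cumulative transcript replaces the pending one, and a write happens only if at least five seconds have passed since the last write. The sketch below is an illustration under that reading, not the platform's implementation. The class `SnapshotThrottle`, its `save` callback, and the injectable `clock` are all hypothetical.

```python
import time
from typing import Callable, Optional


class SnapshotThrottle:
    """Save cumulative snapshots no more often than once per min_interval seconds."""

    def __init__(
        self,
        save: Callable[[str], None],
        min_interval: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._save = save
        self._min_interval = min_interval
        self._clock = clock
        self._last_saved_at: Optional[float] = None
        self._pending: Optional[str] = None

    def update(self, cumulative_transcript: str) -> bool:
        """Record the latest transcript; return True if it was written now."""
        self._pending = cumulative_transcript
        now = self._clock()
        too_soon = (
            self._last_saved_at is not None
            and now - self._last_saved_at < self._min_interval
        )
        if too_soon:
            return False  # keep it pending until the interval has elapsed
        self._write(now)
        return True

    def flush(self) -> None:
        """Write any pending transcript immediately, e.g. when recording stops."""
        if self._pending is not None:
            self._write(self._clock())

    def _write(self, now: float) -> None:
        self._save(self._pending)
        self._pending = None
        self._last_saved_at = now
```

Because every snapshot is cumulative, skipping intermediate updates loses nothing: the next write that goes through already contains the earlier text.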

Only server-persisted transcript content is visible to Cortex. An unpersisted browser buffer is never injected directly into the shared conversation. If a draft write fails, recording continues locally, the UI displays a warning, and the platform retries.
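
One way to picture the separation between the local buffer and persisted content is a writer that tracks both, exposes only the persisted copy, and raises a warning flag when a write fails. This is a hypothetical sketch: `DraftWriter`, its `write` callback, and the use of `OSError` to signal a failed write are assumptions made for the example.

```python
from typing import Callable


class DraftWriter:
    """Keep recording locally while draft writes fail; expose only persisted text."""

    def __init__(self, write: Callable[[str], None]) -> None:
        self._write = write
        self.local_transcript = ""      # local buffer, never shared directly
        self.persisted_transcript = ""  # last text the server accepted
        self.warning = False            # drives the warning shown in the UI

    def record(self, cumulative_transcript: str) -> None:
        """Accept new transcript text and try to persist it."""
        self.local_transcript = cumulative_transcript
        self.sync()

    def sync(self) -> bool:
        """Attempt (or retry) the draft write; return True when up to date."""
        if self.persisted_transcript == self.local_transcript:
            return True
        try:
            self._write(self.local_transcript)
        except OSError:
            self.warning = True  # recording continues; a later sync() retries
            return False
        self.persisted_transcript = self.local_transcript
        self.warning = False
        return True

    def visible_to_assistant(self) -> str:
        """Only server-persisted content is ever exposed to the conversation."""
        return self.persisted_transcript
```

In this sketch a retry is just another call to `sync()`. A real client would presumably schedule retries itself.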

Stopping the recording flushes the remaining transcript, finalizes the existing draft page, persists the audio, and then runs configured enrichment. The action completes only after required persistence succeeds. Retrying finalization updates the same transcript and audio artifacts instead of creating duplicates.
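
The no-duplicates guarantee on retry is what an upsert keyed by instance and artifact type provides. The sketch below illustrates that idea with an in-memory store. `ArtifactStore`, `finalize`, and the `enrich` callback are hypothetical names, and the real persistence layer is not described in this document.

```python
from typing import Callable, Dict, Tuple


class ArtifactStore:
    """In-memory stand-in for persistence, keyed by (instance_id, kind)."""

    def __init__(self) -> None:
        self._artifacts: Dict[Tuple[str, str], bytes] = {}

    def upsert(self, instance_id: str, kind: str, content: bytes) -> None:
        # Writing to an existing key overwrites it instead of adding a new entry.
        self._artifacts[(instance_id, kind)] = content

    def get(self, instance_id: str, kind: str) -> bytes:
        return self._artifacts[(instance_id, kind)]

    def count(self) -> int:
        return len(self._artifacts)


def finalize(
    store: ArtifactStore,
    instance_id: str,
    transcript: str,
    audio: bytes,
    enrich: Callable[[str], None],
) -> None:
    """Persist transcript and audio first, then run enrichment."""
    store.upsert(instance_id, "transcript", transcript.encode("utf-8"))
    store.upsert(instance_id, "audio", audio)
    # Reached only if both writes above succeeded without raising.
    enrich(instance_id)
```

Calling `finalize` twice for the same instance leaves exactly two artifacts in the store, with the second call's content replacing the first.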

A Speech-to-Text integration must be installed and configured in your workspace before the Voice Trigger can transcribe audio. See Integrations to add an STT provider.
