Voice recording (Alpha)

Drop a sales call, voicemail, or training session into the dashboard. Coffield.io transcribes it, scrubs the personal information before anything reaches the database, and surfaces draft business knowledge for you to approve.

Alpha feature. Voice recording is available on Growth plans and above. We're still learning what works best — expect the workflow to improve over the next few months. If you hit anything rough, email support@coffield.io and we'll fix it.


Why we built this

Much of your business's most useful knowledge lives in your head, and in the recordings of conversations you've already had with customers, partners, and your own team. The voice recording feature pulls that knowledge out:

  • A sales call where you explain pricing → becomes a draft FAQ.
  • A training session walking a new hire through your process → becomes a draft how-to.
  • A voicemail from a customer asking the same question for the third time this week → becomes a draft FAQ paired with a knowledge gap.

You upload the audio. We do the rest. Nothing is published to your live agent until you click Activate.

What this feature does NOT do

It's important to be clear up front:

  • It does not record calls for you. You upload audio you already have.
  • It does not persist the raw transcript. We never write the un-redacted text to disk — only the cleaned version after PII removal.
  • It does not dial out, listen to live audio, or sit in on meetings.
  • It does not replace your judgment. Every extracted item starts as a draft — your agent never uses it until you approve it.

Before you upload — the legal bit

Recording laws vary by state and country. Some states (California, Florida, Illinois, Pennsylvania, Washington, and others) require all parties on the call to consent before you can record. Other states only require one party (typically you) to consent.

Coffield.io does not verify that your recording was made legally. By uploading, you're confirming three things — and the consent checkbox in the upload modal makes this explicit:

"I confirm that all parties on this recording have provided informed consent for the recording and processing of their voice, in compliance with applicable laws (including state two-party-consent statutes such as those in California, Florida, Illinois, Pennsylvania, Washington, and others). I am authorized to upload this content for analysis. PII detected in the transcript will be redacted before any text is stored."

If you can't confirm this for a particular recording, don't upload it. The platform will refuse to process anything without an acknowledged consent box, and we log the acknowledgement timestamp and the IP address it came from for our own audit trail.
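As a rough illustration of that gate, here is a minimal Python sketch. It is not the platform's actual code: the class and function names are hypothetical, and only the two field names (consent_acknowledged_at, consent_ip) come from the storage table later on this page.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ConsentRecord:
    # Hypothetical container; the field names mirror the documented columns.
    consent_acknowledged_at: datetime
    consent_ip: str


def acknowledge_consent(box_ticked: bool, client_ip: str,
                        now: Optional[datetime] = None) -> ConsentRecord:
    """Refuse to continue unless the box was ticked; otherwise record when and from where."""
    if not box_ticked:
        raise PermissionError("Consent must be acknowledged before processing.")
    return ConsentRecord(
        consent_acknowledged_at=now or datetime.now(timezone.utc),
        consent_ip=client_ip,
    )
```

Calling acknowledge_consent(False, ...) raises before anything else can run, which is the behaviour described above: no acknowledgement, no processing.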

Finding the page

In the dashboard's left menu under AI Agents, look for Voice Recordings (microphone icon). If you don't see it:

  • Confirm you're on Growth plan or above (Starter does not include this feature).
  • Confirm you have the "Manage agents" permission inside this tenant. If not, ask your tenant admin to grant it.

The URL is:

/dashboard/{your-tenant-id}/voice-recordings

Uploading a recording

  1. Click Upload recording in the top right of the Voice Recordings list.

  2. Pick the agent the knowledge belongs to. Extracted items will be attached to this agent only — they won't leak across agents in the same tenant.

  3. Pick the audio file from your computer. Supported formats and limits:

    Format | Extension | Max size
    MP3 | .mp3 | 500 MB
    MP4 (audio) | .mp4 | 500 MB
    M4A | .m4a | 500 MB
    WAV | .wav | 500 MB
    OGG | .ogg | 500 MB

    Files outside this list are rejected before upload starts. Files that only look like audio by name (for example, an executable renamed to report.mp3) are sniffed and rejected too: the platform checks the actual file contents, not just the name.

  4. Read the compliance acknowledgment and tick the box. Without it, the Upload and process button stays disabled.

  5. Click Upload and process. A toast confirms "Recording uploaded — processing started." and you're redirected to the recording's detail page.

The audio file is stored privately in the same encrypted bucket as your other files (tenants/{your-tenant-id}/voice/...), under a randomized storage name. Your original filename is preserved for display.
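If you script your uploads, you can pre-check files the same way step 3 describes: extension, size, then file contents. The Python sketch below is an illustration only, not the platform's own check. The function names are hypothetical, the 500 MB limit is assumed to mean 500 × 1024 × 1024 bytes, and the byte signatures are the well-known container markers for each format.

```python
from pathlib import Path
from typing import Optional

MAX_BYTES = 500 * 1024 * 1024  # assumption: "500 MB" counted in binary megabytes

# Extension -> container family we expect to find inside the file.
EXPECTED_FAMILY = {".mp3": "mp3", ".mp4": "mp4", ".m4a": "mp4",
                   ".wav": "wav", ".ogg": "ogg"}


def sniff_audio_family(header: bytes) -> Optional[str]:
    """Guess the container family from the first bytes of a file."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"OggS":
        return "ogg"
    if header[4:8] == b"ftyp":
        return "mp4"  # ISO base media container, used by both .mp4 and .m4a
    if header[:3] == b"ID3":
        return "mp3"  # MP3 with an ID3v2 tag
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"  # bare MPEG audio frame sync
    return None


def validate_upload(filename: str, size_bytes: int, header: bytes) -> None:
    """Raise ValueError if the file would be rejected; return None otherwise."""
    ext = Path(filename).suffix.lower()
    if ext not in EXPECTED_FAMILY:
        raise ValueError(f"Unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        raise ValueError("File exceeds the 500 MB limit")
    if sniff_audio_family(header) != EXPECTED_FAMILY[ext]:
        raise ValueError("MIME type mismatch")
```

Pass the first dozen or so bytes of the file as header. An executable renamed to fake.mp3 passes the extension check but fails the contents check, which is the point of sniffing.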

What happens after you upload

Processing runs in the background. The recording moves through these statuses (visible as a coloured badge on the list view and detail page):

Status | What's happening | Typical duration
Pending | Queued up, hasn't started yet. | Seconds
Transcribing | Audio is being converted to text via Whisper. The raw transcript lives in memory only; it never touches disk. | ~1 min per 10 min of audio
Redacting | The in-memory transcript is run through a Named-Entity-Recognition (NER) pass to find names, emails, phone numbers, addresses, payment info, SSNs, account numbers, and dates of birth. Each PII span is replaced with a [REDACTED_TYPE] placeholder. | Seconds
Extracting | The redacted transcript is sent to the LLM with one job: pull out business knowledge as category + title + content. | ~30 sec per 10 min of redacted text
Review ready | Done. Items extracted, ready for your approval. Coloured green. |
Review needed | Done, but the redaction confidence dropped below threshold (85% by default). Coloured amber. Please double-check that the transcript looks clean before activating anything. |
Complete | You've reviewed and either activated or discarded the drafts. |
Failed | Something went wrong. Coloured red. See the Error field on the detail page. |

A 10-minute audio file usually finishes the whole pipeline in under 2 minutes.
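To make the status logic and the timing arithmetic concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the real pipeline: the names are hypothetical, the estimate ignores queue time and the redaction step (both "seconds"), and it assumes the redacted text is about as long as the audio it came from.

```python
from enum import Enum


class Status(Enum):
    PENDING = "Pending"
    TRANSCRIBING = "Transcribing"
    REDACTING = "Redacting"
    EXTRACTING = "Extracting"
    REVIEW_READY = "Review ready"
    REVIEW_NEEDED = "Review needed"
    COMPLETE = "Complete"
    FAILED = "Failed"


def status_after_extraction(redaction_confidence: float,
                            threshold: float = 0.85) -> Status:
    """Amber (Review needed) when confidence is below the threshold, green otherwise."""
    if redaction_confidence < threshold:
        return Status.REVIEW_NEEDED
    return Status.REVIEW_READY


def estimated_processing_seconds(audio_minutes: float) -> float:
    """Rough estimate from the typical durations listed above."""
    transcribe = 60.0 * audio_minutes / 10.0  # ~1 min per 10 min of audio
    extract = 30.0 * audio_minutes / 10.0     # ~30 sec per 10 min of text
    return transcribe + extract
```

For a 10-minute file the estimate comes to 90 seconds, which lines up with the "under 2 minutes" figure once a few seconds of queueing and redaction are added.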

Reviewing the recording

Open the recording from the list. The detail page has three sections:

Recording

The basics — filename, format, size, duration, agent, who uploaded it, when consent was acknowledged, the IP it was acknowledged from, the current status, and any processing error.

PII Redaction

What the redactor actually did:

  • Engine — which provider scrubbed the text (e.g. presidio, aws-comprehend).
  • Spans redacted — how many distinct pieces of PII were caught.
  • Confidence — the redactor's own confidence (0.0–1.0) that it caught everything.
  • Counts by type — a per-category tally (e.g. PERSON: 4, EMAIL: 2, PHONE: 1). Only counts. No values. The redactor is not allowed to write back what it caught — only how much.

Redacted Transcript

The actual text the platform stored. Every name, email, phone number, address, payment detail, SSN, account number, and date of birth in the original audio is replaced with a [REDACTED_PERSON] / [REDACTED_EMAIL] / etc. placeholder. Read this before activating any drafts — if you see something that should have been redacted but wasn't, discard the drafts and let us know.
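If it helps to picture what the redactor does, the Python sketch below replaces detected spans with placeholders and keeps only a per-type tally. It is a simplified illustration, not the actual engine: the (start, end, type) span format is an assumption, spans are assumed not to overlap, and detecting the spans in the first place is the NER engine's job.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical span format: (start offset, end offset, PII type).
Span = Tuple[int, int, str]


def redact(text: str, spans: List[Span]) -> Tuple[str, Counter]:
    """Swap each span for a [REDACTED_TYPE] placeholder and count spans by type."""
    counts: Counter = Counter()
    out = text
    # Work from the end of the text so earlier offsets stay valid.
    for start, end, pii_type in sorted(spans, reverse=True):
        out = out[:start] + f"[REDACTED_{pii_type}]" + out[end:]
        counts[pii_type] += 1  # only the count is kept, never the value
    return out, counts
```

For example, redacting "Call Dana at 555-0100 tomorrow." with a PERSON span at 5 to 9 and a PHONE span at 13 to 21 gives "Call [REDACTED_PERSON] at [REDACTED_PHONE] tomorrow." and the counts PERSON: 1, PHONE: 1.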

Activating extracted knowledge

Extracted business knowledge is created as draft items in your agent's knowledge base. They are not used at runtime until you activate them.

You have two ways to review:

Bulk: header actions (visible to users with the Manage agents permission)

  • Activate all draft items — flips every draft from this recording to active status. Each item is queued for embedding (so the agent can semantically search it) and becomes available at runtime within a few minutes. Status moves to Complete.
  • Discard all draft items — marks every draft as inactive. The recording stays in the list for audit. Status moves to Complete.

Per-item: the Extracted knowledge items table

Scroll past the three info sections and you'll find a table of every item the recording produced. Each row shows the Title, Topic, Status (Draft / Active / Discarded) and Embedding state. Row-level actions are visible to users with the Manage agents permission:

Action | When it shows | What it does
Activate | Status = Draft | Flips this single item to Active and queues it for embedding. Bumps the recording's Activated counter by one.
Discard | Status = Draft | Flips this single item to Discarded (inactive). Other drafts are unaffected.
Deactivate | Status = Active | Pulls this item back out of the agent's retrieval pool. The text stays in your tenant for audit.
Edit | Always (with the Manage agents permission) | Opens a modal to edit the title, one-line summary, and content before activating. Editing an already-Active item also re-queues embedding so the agent picks up the new wording.

Use the Status filter at the top of the table to flip between Draft (awaiting review), Active, and Discarded.
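The row-level rules can be summarised in a few lines of Python. This is a sketch of the behaviour described in the table, with hypothetical function names; it is not how the dashboard is implemented.

```python
from typing import List


def available_actions(status: str, can_manage_agents: bool) -> List[str]:
    """Row actions shown for an item, given its status and the viewer's permission."""
    if not can_manage_agents:
        return []
    actions: List[str] = []
    if status == "Draft":
        actions += ["Activate", "Discard"]
    if status == "Active":
        actions.append("Deactivate")
    actions.append("Edit")  # Edit is always offered
    return actions


def queues_embedding(action: str, status: str) -> bool:
    """Activating queues embedding; editing an already-Active item re-queues it."""
    return action == "Activate" or (action == "Edit" and status == "Active")
```

So a Draft row offers Activate, Discard, and Edit, while an Active row offers Deactivate and Edit, and editing a Draft does not trigger embedding until you activate it.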

Tip. Mixed-quality recordings are the norm: a 10-minute call might produce 8 great items and 2 vague ones. Use the per-item actions to ship the good ones immediately and discard the rest. You no longer have to re-upload to fix one bad draft.

What gets extracted, what doesn't

The extractor is told to look for business information only. It pulls:

  • Pricing rules ("$150 base plus $50 per additional zone")
  • Hours, holidays, and service-area boundaries
  • Product specs, model numbers, supported configurations
  • Process knowledge ("we always ask for the property address first")
  • Policies (cancellation, refund, scheduling)
  • FAQs that came up in the conversation

It is explicitly told to drop:

  • Anything containing a redaction placeholder (extra safety net — if the LLM accidentally surfaces [REDACTED_PERSON] in a draft, that draft is thrown away).
  • Personal opinions, gossip, off-topic small talk.
  • Anything that looks like PII even if the redactor missed it.

If you wanted to capture personal context (a customer's birthday, a vendor's preferred name), this is not the tool. Treat it as a business-knowledge extractor only.
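The placeholder safety net is easy to picture in code. The Python sketch below throws away any draft whose fields contain a redaction placeholder. It is an illustration only: the function name and the dictionary shape (category, title, content) are assumptions based on the fields described on this page, and the pattern assumes placeholder types are written in upper case.

```python
import re
from typing import Dict, List

# Matches placeholders such as [REDACTED_PERSON] or [REDACTED_EMAIL].
PLACEHOLDER = re.compile(r"\[REDACTED_[A-Z_]+\]")


def drop_tainted_drafts(drafts: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only drafts in which no field contains a redaction placeholder."""
    return [
        draft for draft in drafts
        if not any(PLACEHOLDER.search(value) for value in draft.values())
    ]
```

A draft with the content "$150 base plus $50 per additional zone" survives; one that says "Ask for [REDACTED_PERSON] at the front desk" is dropped whole rather than patched.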

Filters, search, and finding old recordings

The Voice Recordings list supports:

  • Search by filename (the original name you uploaded with — not the storage name).
  • Filter by status — useful when you have a lot of recordings and want to find everything stuck in Review needed or Failed.
  • Sort by uploaded date (default: newest first), filename, or duration.

Plan limits

Plan | Recordings per month
Starter | (not included)
Growth | 10
Pro | 30
Custom | Unlimited

The cap is enforced at upload time. When you've used your monthly allowance:

  • The Upload recording button still appears (so you can see your usage), but clicking it shows a friendly "Monthly recording limit reached" toast with the upgrade prompt instead of opening the modal.
  • The Upload modal also shows your current usage as a description line: "This month: 7 of 10 recordings used."
  • API/MCP callers that try to upload past the cap get an explicit Monthly voice-recording limit reached error back from the backend service — the limit is enforced server-side, not just in the UI.

Usage resets at the start of each calendar month (UTC). Custom plans show "unlimited" instead of a count. Starter tenants and tenants without an active subscription get a cap of 0 by default. This is a safe default, not a penalty; contact support if you're on a special plan tier we haven't mapped yet.
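For API or MCP callers who want to check before uploading, the cap rules above can be sketched in Python as follows. This is a client-side illustration under assumptions, not the backend's code: the plan keys and function names are hypothetical, and the server remains the source of truth.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple

# None means unlimited. Plan keys are hypothetical lower-case labels.
PLAN_CAPS = {"starter": 0, "growth": 10, "pro": 30, "custom": None}


def monthly_cap(plan: Optional[str]) -> Optional[int]:
    """Cap for a plan; no subscription or an unmapped tier falls back to 0."""
    if plan is None:
        return 0
    return PLAN_CAPS.get(plan.lower(), 0)


def check_upload_allowed(plan: Optional[str], used_this_month: int) -> None:
    """Raise if one more upload would exceed the monthly allowance."""
    cap = monthly_cap(plan)
    if cap is not None and used_this_month >= cap:
        raise PermissionError("Monthly voice-recording limit reached")


def month_window_utc(now: datetime) -> Tuple[datetime, datetime]:
    """Start of the current UTC calendar month and start of the next one.

    Expects a timezone-aware datetime.
    """
    now = now.astimezone(timezone.utc)
    start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if start.month == 12:
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return start, end
```

On Growth, the tenth upload of the month goes through (9 already used) and the eleventh is refused. Usage is counted inside the window returned by month_window_utc, so it resets at 00:00 UTC on the first of the month.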

The maximum file size (500 MB per upload) is also enforced.

Privacy and storage — what really happens to your audio

Thing | Where it lives | How long
Original audio file | Tenant-scoped S3 bucket (tenants/{your-tenant-id}/voice/...) | Indefinite; kept for audit. Deleting the recording from the dashboard also deletes the S3 object.
Raw (un-redacted) transcript | In RAM only, on the worker that processed your recording. Never written to disk. Discarded after the redaction step. | Only while the transcription and redaction steps run
Redacted transcript | voice_transcripts.redacted_text | Until you delete the recording
PII counts (e.g. PERSON: 4) | voice_pii_audits (append-only) | Indefinite, for audit
Extracted knowledge items | agent_knowledge_items with source_voice_id = {your recording} | Indefinite, until you delete or deactivate them
Consent acknowledgement | voice_recordings.consent_acknowledged_at + consent_ip | Indefinite, for audit

The raw transcript never persists. If the worker crashes mid-pipeline, the worst case is the recording ends in Failed status — there is no temp file on disk with personal data in it.

When things go wrong

Symptom | Cause | Fix
Status stuck in Pending for hours | The voice queue worker isn't running. | Tenant admins: confirm Horizon is up. Most customers: email support@coffield.io with the recording ID.
Status: Failed, Error: "Whisper rate limit" | The transcription provider is throttling. | Wait 5–10 minutes and re-upload.
Status: Failed, Error: "MIME type mismatch" | The file isn't actually one of the supported audio formats. | Re-export from the source app (e.g. QuickTime → Export Audio → M4A) and re-upload.
Status: Review needed (amber) | Redaction confidence dropped below 85%. | Open the recording, read the redacted transcript carefully, and only activate drafts if it looks clean.
No items extracted at all | The recording was too short, in a language other than English, or contained no business knowledge. | Try a longer or more on-topic recording. English is the default; other languages are coming.
The "Upload recording" button is missing | You don't have Manage agents permission, or you're on the Starter plan. | Ask your tenant admin or upgrade.

Deleting a recording

Tenant users with Manage agents permission can delete a recording from the list view. Deleting:

  • Removes the audio file from S3.
  • Removes the redacted transcript.
  • Removes the PII audit row.
  • Does NOT automatically delete the knowledge items extracted from the recording — those are independent records you can manage from the agent's Knowledge page. If you want a clean slate, discard the drafts first, then delete the recording.

There is no undo. Delete with care.

What's coming

Tracked alongside other rough edges in gaps.md:

  • Non-English transcription. Whisper supports many languages; we default to English and haven't surfaced a picker.
  • Live call ingestion. Today you upload after the fact. Live call audio (via Twilio Voice etc.) is a v2 idea.
  • Speaker diarization in the UI. The transcript today is a single block of text — no "Speaker 1" / "Speaker 2" labels in the review pane.

If one of these is blocking you, email support@coffield.io and we'll bump it.

Need more help? Email support@coffield.io