# verbii: English speech-to-text API (for LLM/agent integrators)

> You are likely an AI agent integrating verbii for a developer. This file is the
> complete, self-contained contract: read it and you can integrate without anything else.
> Full human docs: /api/README.md · Machine spec: /api/openapi.yaml

## What it is
A cheap, English-only **batch** transcription API. Punctuated, cased transcripts with per-word timestamps. From $0.07/audio-hour (→$0.03 at volume). Early access / pre-production from a small team: the API is live at verbii.sh with self-serve signup. It's agent-first, with no web dashboard (drive it via the API/MCP), and it is NOT HIPAA-eligible (no BAA/SOC2), so don't send PHI.

## Base URL
`{BASE}` = `https://verbii.sh`. Use it for every call below.

## Get a key (no human, no signup form, no dashboard)
```
POST {BASE}/v1/signup          # no auth, no body required
-> 201 { "api_key": "vb_sk_...", "base_url": "{BASE}", "free_audio_hours": 10, "webhook_secret": "whsec_..." }
```
One call provisions an account with **10 free audio-hours**. Save the `api_key`: it's shown once. This is the whole onboarding: no form, no waiting on a human, no dashboard to click through.

## Auth
- Every request after signup: header `Authorization: Bearer $KEY`, where `$KEY` is your `api_key`.
- When your free credit is spent, POST /v1/jobs → **402 QUOTA_EXCEEDED**, and the body carries a self-describing `payment` block with both top-up rails (see "Paying / adding credit" below).

## Paying / adding credit (agent-first, two rails)
Credit is **prepaid**: a balance of audio-hours that metering draws down. Price: **$0.07/audio-hour**. A `402` is self-describing: its `payment` block tells you exactly how to top up, both ways:
- **Human rail**: `POST {BASE}/v1/billing/checkout {"hours":N}` → `{ checkout_id, checkout_url, amount_usd, x402 }`. Hand `checkout_url` to your human principal; they pay, and a signed webhook credits your balance.
- **Agent rail (x402)**: if you hold a stablecoin wallet, settle the `x402` challenge yourself: `POST {BASE}/v1/billing/x402` with an `X-PAYMENT` header (base64 JSON) or `{"payment":{...}}` → `{ settled:true, credited_hours }`. Amount is signed; replayed nonces are rejected.
- `GET {BASE}/v1/billing` → `{ plan, remaining_hours, price_per_hour_usd, recent, add_credit }`.

Note: billing is a **mock** today (no real money moves yet). The contracts are stable; the rails are being wired to a real processor + x402 facilitator. Need credit now? `POST /v1/contact` (kind=credit).

## Core flow (pick one ingestion path, then poll or use a webhook)
1. Create a job (we fetch a URL, or you upload bytes).
2. Get the result by polling GET /v1/jobs/{id}, or set callback_url for a webhook.

### Create by URL (simplest, no byte handling)
```
POST {BASE}/v1/jobs
Authorization: Bearer $KEY
Content-Type: application/json
{ "audio_url": "https://.../call.mp3", "callback_url": "https://you/hook" }
-> 201 { "job_id": "job_...", "status": "queued", "request_id": "..." }
```

### Create by upload (you have the bytes)
```
POST {BASE}/v1/jobs   { "filename": "meeting.wav" }
-> 201 { "job_id": "...", "status": "awaiting_upload", "upload_url": "https://...s3..." }
# then PUT the bytes. IMPORTANT: send NO Authorization header and NO Content-Type on this PUT
# (the signature is in the query string; adding auth → 400 "Only one auth mechanism allowed").
PUT <upload_url> --data-binary @meeting.wav
```

### Batch (up to 100, partial success)
```
POST {BASE}/v1/jobs/batch   { "jobs": [ {"audio_url":"..."}, {"filename":"a.wav"}, ... ] }
-> 200 { "created": N, "failed": M, "jobs": [ {"index":0,"job_id":"...","status":"queued"}, ... ] }
```
Each item is independent: a bad item returns {index, error_code, error_message}; the rest succeed.
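
The batch contract above (at most 100 items per call, each item succeeding or failing on its own) can be sketched client-side as follows. This is a minimal sketch, not part of the verbii API: `chunk_jobs` and `split_batch_result` are hypothetical helper names, the sample values are made up, and it assumes a failed item is one that carries an `error_code` field.

```python
from typing import Any

MAX_BATCH = 100  # documented per-request limit for POST /v1/jobs/batch


def chunk_jobs(jobs: list[dict[str, Any]], size: int = MAX_BATCH) -> list[list[dict[str, Any]]]:
    """Split a job list into request-sized chunks so no call exceeds the limit."""
    return [jobs[i:i + size] for i in range(0, len(jobs), size)]


def split_batch_result(
    response: dict[str, Any],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Partition a batch response into (created, failed) items.

    Assumption: a failed item is one that carries an `error_code`.
    """
    created: list[dict[str, Any]] = []
    failed: list[dict[str, Any]] = []
    for item in response.get("jobs", []):
        (failed if "error_code" in item else created).append(item)
    return created, failed


# A response shaped like the documented one (values are made up).
sample = {
    "created": 1,
    "failed": 1,
    "jobs": [
        {"index": 0, "job_id": "job_abc", "status": "queued"},
        {"index": 1, "error_code": "URL_NOT_FOUND", "error_message": "not found"},
    ],
}
created, failed = split_batch_result(sample)
```

Each failed item's `index` points back into the submitted `jobs` array, so only those entries need fixing or resubmitting.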

### Get the result
```
GET {BASE}/v1/jobs/{job_id}
-> { "job_id","status": "queued|processing|done|failed", "text": "...",
     "words": [{"word","start","end","confidence"}], "audio_duration_sec", "language":"en", "request_id": "..." }
```
On failure: { "status":"failed", "error_code", "error_message", "retryable": true|false }.
`retryable:true` ⇒ safe to resubmit; `false` ⇒ a permanent problem with the input.

### Subtitles
GET {BASE}/v1/jobs/{job_id}?format=srt (or ?format=vtt) → a ready-to-use SRT/VTT caption file (plain text), segmented from the word timestamps. Available only once status=done; otherwise 409 NOT_READY (retryable).

## Optional fields on any job
- `glossary`: list of your domain terms as plain strings (no weights needed) to bias decoding toward names/jargon, e.g. `["Ozempic","prior authorization"]`. Helps recover near-miss names; won't fix terms the model gets acoustically far wrong. Up to 200 terms. Echoed on GET.
- `glossary_id`: id of a **saved** glossary (see below) to apply instead of resending the terms. Can be combined with inline `glossary` (merged; inline wins on a dup). 400 GLOSSARY_NOT_FOUND if unknown.
- `metadata`: any JSON object; stored and echoed back on detail/listing/webhook (reconcile to your ids).
- `idempotency_key`: a string that makes submit safe to retry: the same key returns the SAME job_id (no duplicate, no double charge). Use it for cron re-runs.

## Saved glossaries (create once, reference by id forever)
Save a reusable term list once, then pass its `glossary_id` on jobs instead of resending terms.
- `POST {BASE}/v1/glossaries` `{ "name":"clinic-terms", "terms":["Ozempic","Humira"] }` → `{ glossary_id, name, term_count }`
- `GET {BASE}/v1/glossaries` → `{ count, glossaries:[{ glossary_id, name, term_count, created_at }] }`
- `GET {BASE}/v1/glossaries/{id}` → the glossary with its full `terms`
- `PUT {BASE}/v1/glossaries/{id}` `{ name?, terms? }` → update (a given `terms` replaces the list)
- `DELETE {BASE}/v1/glossaries/{id}` → `{ deleted:true }`

Glossaries are per-account and private to your API key.

## Self-serve introspection (no human needed)
- `GET {BASE}/v1/usage` → { plan, credit_hours, used_hours, remaining_hours }
- `GET {BASE}/v1/billing` → balance + pricing + recent top-ups + how to add credit (both rails)
- `GET {BASE}/v1/jobs?limit=50&status=done&cursor=...` → recent jobs, newest first, paginated
- `GET {BASE}/health` → { status: "operational" } (public, no auth)

## Reach us (no human handoff, no email needed)
Found a bug, want a feature, or need more free credit? Just tell us:
- `POST {BASE}/v1/contact` `{ "message":"...", "kind":"bug|feature|credit|feedback|contact", "contact":"you@x (optional)" }` → `201 { received:true, ticket_id }`. Auth is optional; include your key and we attach it for follow-up.

## Error contract (branch on these; do not parse prose)
Every response carries a `request_id`. Every error is `{ error_code, error_message, retryable, hint? }`.
- API errors (synchronous, on POST): BAD_REQUEST, UNAUTHORIZED (401), QUOTA_EXCEEDED (402), NOT_FOUND (404), NOT_READY (409), RATE_LIMITED (429), GLOSSARY_NOT_FOUND, URL_NOT_FOUND, URL_FORBIDDEN, FETCH_FAILED, FILE_TOO_LARGE, BATCH_TOO_LARGE, PAYMENT_INVALID (402), PAYMENT_REPLAYED (409), INTERNAL.
- Job errors (on a failed job): SOURCE_NOT_FOUND, UNSUPPORTED_OR_CORRUPT, EMPTY_AUDIO, FILE_TOO_LONG, DECODE_TIMEOUT (permanent), PROCESSING_FAILED (transient, retry).
- `retryable:true` (INTERNAL, FETCH_FAILED, PROCESSING_FAILED) ⇒ back off and retry; otherwise fix the input.

## Behavior an automated client should expect
- Webhooks are HMAC-signed: header `X-Verbii-Signature: sha256=...`. Verify it.
- The backend scales from zero: the first job after idle cold-starts (1–3 min); then jobs return fast. Poll patiently.
- Audio is deleted on completion and never used for training. Transcripts expire after a retention window.
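
The polling and error-branching rules above can be sketched as a loop. This is a minimal sketch under stated assumptions: `wait_for_job`, `JobFailed`, and the `fetch` callable are hypothetical (in practice `fetch` would issue `GET {BASE}/v1/jobs/{job_id}` with your bearer key and return the parsed JSON), and the timeout and interval defaults are illustrative values chosen to outlast the documented 1–3 min cold start.

```python
import time
from typing import Any, Callable


class JobFailed(Exception):
    """Raised when a job ends in status=failed; carries the error contract fields."""

    def __init__(self, job: dict[str, Any]):
        super().__init__(f"{job.get('error_code')}: {job.get('error_message')}")
        self.error_code = job.get("error_code")
        self.retryable = bool(job.get("retryable", False))


def wait_for_job(
    fetch: Callable[[], dict[str, Any]],
    timeout_sec: float = 600.0,   # illustrative default, not a documented limit
    interval_sec: float = 5.0,    # illustrative default
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until the job is done, failed, or the timeout elapses."""
    deadline = time.monotonic() + timeout_sec
    while True:
        job = fetch()
        status = job.get("status")
        if status == "done":
            return job
        if status == "failed":
            # Branch on the machine-readable fields, not on error_message prose.
            raise JobFailed(job)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_sec}s")
        sleep(interval_sec)


# Simulated sequence of responses standing in for real HTTP calls.
responses = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "done", "text": "hello world"},
])
result = wait_for_job(lambda: next(responses), sleep=lambda _: None)
```

A caller would catch `JobFailed` and resubmit only when `retryable` is true; reusing the same `idempotency_key` on a resubmit would presumably return the original failed job, so a fresh key is the safer assumption there.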

## Scope (be honest with the developer)
English only. No speaker diarization yet. Not for PHI/regulated data. (Native SRT/VTT *is* supported, see ?format=srt|vtt above.)
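
For the webhook bullet under "Behavior an automated client should expect", signature checking might look like the sketch below. The exact signing scheme is not spelled out in this file, so the sketch assumes the header carries a hex-encoded HMAC-SHA256 of the raw request body, keyed with the `webhook_secret` returned by signup. `verify_signature` is a hypothetical helper and the sample values are made up; confirm the scheme against /api/openapi.yaml before relying on it.

```python
import hashlib
import hmac


def verify_signature(secret: str, raw_body: bytes, header_value: str) -> bool:
    """Check an X-Verbii-Signature header of the form 'sha256=<digest>'.

    Assumptions (not confirmed by this document): the digest is hex-encoded
    HMAC-SHA256 over the raw body, keyed with the signup webhook_secret.
    """
    scheme, _, received = header_value.partition("=")
    if scheme != "sha256" or not received:
        return False
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())


# Self-check with made-up values: sign a body, then verify it.
secret = "whsec_example"
body = b'{"job_id":"job_123","status":"done"}'
header = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
ok = verify_signature(secret, body, header)
tampered = verify_signature(secret, body + b" ", header)
```

Verify against the raw bytes as received, before any JSON parsing or re-serialization, since re-encoding can change the byte sequence and break the comparison.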