STT Engine · Early Access Q3 2026

The First Uzbek-Native
Speech-to-Text Engine

The same engine that powers Bayonic VOCO — soon available as a standalone API. 98% accuracy on Uzbek, native UZ↔RU code-switching, custom vocabulary. Built for enterprise, deployable on-premise.

Request Early Access · See It Live in VOCO

98%

Uzbek accuracy

3

Languages

< 1.2s

Latency

On-Prem

Deployment

Why It Matters

Built for the language no one else trained on

Global STT vendors trained their models on English, Mandarin, and Spanish. Uzbek, spoken by 35M+ people across Central Asia, was an afterthought, if it was supported at all. Bayonic built its engine from scratch on real call-center audio.

35M+

Native Uzbek speakers

Spread across Uzbekistan and the diaspora, yet no global STT vendor delivers serious accuracy for them.

2

Scripts supported

Latin and Cyrillic — both are in active use today; both are recognized natively.

UZ↔RU

Code-switching

Real Central Asian speech mixes UZ and RU mid-sentence. Our model handles it natively.

10k+

Hours of training data

Real-world contact-center audio from Uzbekistan, which we believe is the largest corpus of its kind.

Engine Capabilities

What the API Will Deliver

When standalone access opens, you'll get the full feature set already running in Bayonic VOCO — exposed as a clean API.

Trilingual Native

Uzbek (Latin & Cyrillic), Russian, English — single model, single endpoint, automatic language detection.

Code-Switching

UZ↔RU switches within the same utterance are handled natively. Many other STT APIs require a manual language flag per call.

Real-Time Streaming

WebSocket and gRPC streaming with latency under 1.2s. Suitable for live agent assist and IVR routing.
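For a sense of what a streaming client typically does, here is a minimal sketch that slices raw PCM audio into fixed-duration frames, the usual unit sent over a WebSocket or gRPC stream. The 16 kHz, 16-bit mono format and the 100 ms frame size are assumptions for illustration, not published requirements of the Bayonic API, and `pcm_frames` is a hypothetical helper name.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000   # assumed sample rate, not a published API requirement
BYTES_PER_SAMPLE = 2      # assumed 16-bit mono PCM
FRAME_MS = 100            # assumed frame duration


def pcm_frames(audio: bytes, frame_ms: int = FRAME_MS) -> Iterator[bytes]:
    """Yield consecutive frames of raw PCM audio; the last frame may be shorter."""
    frame_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * frame_ms // 1000
    for start in range(0, len(audio), frame_bytes):
        yield audio[start:start + frame_bytes]


# One second of silence plus a 50 ms tail, standing in for microphone input
audio = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE + 1600)
frames = list(pcm_frames(audio))
```

Each frame would then be written to the open stream as it is produced, so transcription can begin before the caller has finished speaking.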

Speaker Diarization

Agent and customer voices separated even on a single mono channel — no special audio capture needed.
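As an illustration of how diarized output might be consumed, the sketch below groups word-level speaker labels into conversation turns. The response shape (a list of words with `speaker` and `text` keys) and the sample words are hypothetical; the actual API schema has not been published.

```python
from itertools import groupby

# Hypothetical word-level output from a single mono recording
words = [
    {"speaker": "agent", "text": "Assalomu"},
    {"speaker": "agent", "text": "alaykum"},
    {"speaker": "customer", "text": "Здравствуйте"},
    {"speaker": "customer", "text": "kartam"},
    {"speaker": "customer", "text": "ishlamayapti"},
]


def to_turns(words: list[dict]) -> list[tuple[str, str]]:
    """Merge consecutive words from the same speaker into one turn."""
    return [
        (speaker, " ".join(w["text"] for w in group))
        for speaker, group in groupby(words, key=lambda w: w["speaker"])
    ]


turns = to_turns(words)
```

The customer turn in this sample also mixes Russian and Uzbek, the kind of utterance the code-switching support is meant for.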

Custom Vocabulary

Upload product names, brand terms, internal jargon. Recognition accuracy on those terms approaches 100%.
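A custom vocabulary upload could look something like the sketch below, which cleans a term list and serializes it as JSON. The field names `language` and `phrases`, the function name, and the sample terms are all assumptions for illustration; the real upload format will be defined by the API when it opens.

```python
import json


def build_vocabulary_payload(terms: list[str], language: str = "uz") -> str:
    """Deduplicate terms case-insensitively (keeping the first spelling) and serialize them."""
    seen: set[str] = set()
    phrases: list[str] = []
    for term in terms:
        cleaned = " ".join(term.split())  # collapse stray whitespace
        key = cleaned.casefold()
        if cleaned and key not in seen:
            seen.add(key)
            phrases.append(cleaned)
    # "language" and "phrases" are hypothetical field names
    return json.dumps({"language": language, "phrases": phrases}, ensure_ascii=False)


payload = build_vocabulary_payload(["Bayonic VOCO", "bayonic  voco", " Humo ", "Uzcard", ""])
```

Cleaning the list before upload keeps near-duplicate spellings from competing with each other during recognition.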

On-Prem & Cloud

Choose your deployment. Air-gap install for banks and government; managed cloud for fast prototyping.

Roadmap

Where we are. Where we're going.

The engine is already in production — embedded in Bayonic VOCO. Standalone API access opens in Q3 2026.

Shipped · 2024

v1 Engine — embedded in VOCO

First trilingual model trained and deployed. Now running in 50+ enterprise contact centers across Uzbekistan and Kazakhstan.

Shipped · 2025

v2 — 98% UZ accuracy, code-switching

Major retraining cycle on 10k+ hours of real call-center audio. Native UZ↔RU code-switching added; latency dropped under 1.2s.
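Accuracy figures for speech recognition are commonly derived from word error rate (WER): substitutions, deletions, and insertions divided by the number of reference words, with accuracy reported as 1 - WER. This page does not state how the 98% figure was measured, so the sketch below only shows the standard calculation on a made-up sentence.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One misrecognized word out of four
wer = word_error_rate("men bugun bankka bordim", "men bugun banka bordim")
accuracy = 1 - wer
```

On this sample the single substitution gives a WER of 0.25, so a 98% accuracy figure under this convention would correspond to roughly one error per fifty reference words.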

In progress · Q2 2026

Standalone API — closed beta

A small group of partners in banking, telecom, and healthcare gets API access for pilot integrations. Rate limits, SLAs, and pricing are tested in this phase.

Planned · Q3 2026

General Availability — public API

Self-serve API access opens. REST and streaming endpoints. Per-second pricing, with on-prem licensing for regulated sectors.
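To show how per-second billing works in principle, here is a small cost estimator. The rate used is a placeholder rather than a quoted price, and rounding each call up to the next whole second is an assumption; actual pricing rules have not been announced.

```python
import math
from decimal import Decimal


def estimate_cost(duration_seconds: float, rate_per_second: Decimal) -> Decimal:
    """Bill every started second at the given rate (rounding up is an assumption)."""
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    billable_seconds = math.ceil(duration_seconds)
    return billable_seconds * rate_per_second


# A 183.4-second call at a placeholder rate of 0.0004 per second
cost = estimate_cost(183.4, Decimal("0.0004"))
```

Using `Decimal` for the rate avoids floating-point drift when many short calls are summed into an invoice.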

Planned · 2027

v3 — domain models + Karakalpak, Tajik

Domain-tuned models (medical, legal, financial). Expansion to Karakalpak and Tajik for full Central Asian coverage.

Early Access Waitlist

Be First on the API

We're onboarding a small set of design partners for the closed beta in Q2 2026. Banks, telecoms, healthcare and government deployments get priority. Tell us about your use case.

Closed beta priority

Founder-led onboarding

Custom vocabulary tuning

Locked launch pricing
