STT Engine · Early Access Q3 2026

The First Uzbek-Native
Speech-to-Text Engine

The same engine that powers Bayonic VOCO — soon available as a standalone API. 98% accuracy on Uzbek, native UZ↔RU code-switching, custom vocabulary. Built for enterprise, deployable on-premise.

98%
Uzbek accuracy
3
Languages
< 1.2s
Latency
On-Prem
Deployment
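A note on reading the 98% figure: speech-to-text accuracy is commonly reported as 1 minus word error rate (WER), where WER counts word-level substitutions, deletions, and insertions against a reference transcript. This page does not state how the 98% was measured, so the sketch below is only an illustration of that common convention, assuming whitespace tokenization and no text normalization. The function name and the sample sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative only: one substituted word out of four.
wer = word_error_rate("men bugun bankka bordim", "men bugun banka bordim")
accuracy = 1.0 - wer
```

Under this convention, 98% accuracy would correspond to a WER of 2%, or roughly one word error per 50 reference words.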
Why It Matters

Built for the language no one else trained on

Global STT vendors trained on English, Mandarin, and Spanish. Uzbek, spoken by 35M+ people across Central Asia, was an afterthought, if it was supported at all. Bayonic built its engine from scratch on real call-center audio.

35M+
Native Uzbek speakers
Across Uzbekistan and the diaspora, yet no global STT vendor offers serious accuracy for them.
2
Scripts supported
Latin and Cyrillic — both are in active use today; both are recognized natively.
UZ↔RU
Code-switching
Real Central Asian speech mixes UZ and RU mid-sentence. Our model handles it natively.
10k+
Hours of training data
Real-world contact-center audio from Uzbekistan — the largest such corpus that exists.
Engine Capabilities

What the API Will Deliver

When standalone access opens, you'll get the full feature set already running in Bayonic VOCO — exposed as a clean API.

Trilingual Native
Uzbek (Latin & Cyrillic), Russian, English — single model, single endpoint, automatic language detection.
Code-Switching
UZ↔RU within the same utterance handled natively. Other vendors require manual language flags per call.
Real-Time Streaming
WebSocket and gRPC streaming with latency under 1.2s. Suitable for live agent assist and IVR routing.
Speaker Diarization
Agent and customer voices separated even on a single mono channel — no special audio capture needed.
Custom Vocabulary
Upload product names, brand terms, internal jargon. Recognition accuracy on those terms approaches 100%.
On-Prem & Cloud
Choose your deployment. Air-gap install for banks and government; managed cloud for fast prototyping.
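For the streaming capability, a client would typically send audio in short frames as it is captured instead of uploading a whole recording. The sketch below covers only that client-side chunking step. The sample rate, encoding, frame length, and the `pcm_frames` name are all assumptions for illustration; the wire format of the standalone API is not described on this page.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000  # assumed; not specified on this page
BYTES_PER_SAMPLE = 2     # assumed 16-bit mono PCM


def pcm_frames(audio: bytes, frame_ms: int = 100) -> Iterator[bytes]:
    """Split raw PCM audio into fixed-duration frames for streaming.

    The final frame may be shorter than the others.
    """
    if frame_ms <= 0:
        raise ValueError("frame_ms must be positive")
    frame_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * frame_ms // 1000
    for start in range(0, len(audio), frame_bytes):
        yield audio[start:start + frame_bytes]


# One second of silence under the assumed format: 16,000 samples * 2 bytes.
one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
frames = list(pcm_frames(one_second, frame_ms=100))
```

Shorter frames reduce the time audio waits on the client before it is sent, at the cost of more messages per second. How much of an end-to-end latency budget that takes up would depend on the actual service.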
Roadmap

Where we are. Where we're going.

The engine is already in production — embedded in Bayonic VOCO. Standalone API access opens in Q3 2026.

Shipped · 2024
v1 Engine — embedded in VOCO
First trilingual model trained and deployed. Now running in 50+ enterprise contact centers in Uzbekistan and Kazakhstan.
Shipped · 2025
v2 — 98% UZ accuracy, code-switching
Major retraining cycle on 10k+ hours of real call-center audio. Native UZ↔RU code-switching added; latency dropped under 1.2s.
In progress · Q2 2026
Standalone API — closed beta
A small group of partners (banking, telecom, healthcare) gets API access for pilot integrations. Rate limits, SLAs, and pricing are tested during this phase.
Planned · Q3 2026
General Availability — public API
Self-serve API access opens. REST and streaming endpoints. Per-second pricing, with on-prem licensing for regulated sectors.
Planned · 2027
v3 — domain models + Karakalpak, Tajik
Domain-tuned models (medical, legal, financial). Expansion to Karakalpak and Tajik for full Central Asian coverage.
Early Access Waitlist
Be First on the API

We're onboarding a small set of design partners for the closed beta in Q2 2026. Banks, telecoms, healthcare and government deployments get priority. Tell us about your use case.

Closed beta priority
Founder-led onboarding
Custom vocabulary tuning
Locked launch pricing