Voice and Language
Yaplet voice agents speak 17 languages with 30 different Gemini voices. This page explains how voices are grouped, how to preview them, which language to pick, and what happens when a caller switches language mid-conversation.
The 30 Voices
Voices are grouped into families based on tone and character. Within each family you'll find masculine, feminine and neutral options. Every voice can speak every supported language — picking a voice does not lock you into a specific language.
Bright & energetic
Warm, upbeat voices that work well for sales agents, retail support and consumer brands.
Calm & professional
Neutral, even-toned voices for B2B receptionists, financial services and healthcare.
Soft & friendly
Gentler voices that work well for hospitality, after-hours support and wellness brands.
Crisp & assertive
Direct, slightly faster voices for technical support and high-volume call centres.
Mature & authoritative
Lower-pitched voices for legal, public-sector or formal corporate contexts.
Casual & conversational
Relaxed voices for consumer support, food & beverage, or anything that should feel like a friend on the phone.
Playful & expressive
Lighter, more emotive voices for kid-facing brands, entertainment and gaming.
Picking a Voice
Three guidelines that work for most teams:
- Match the brand tone, not the country. A premium watch brand should sound the same in English and in German; the voice depends on whether the brand is premium or friendly, not on the language being spoken.
- Listen to the actual greeting, not just a generic sample. The voice picker plays the agent's begin message in the selected voice and language, so you can hear exactly what your callers will hear.
- Run a friend test. Send the greeting recording to two people outside your company. If both say "that's clearly a robot", try a different voice family. If both say "that's a normal call", you're done.
The 17 Languages
| Language | Code | Notes |
|---|---|---|
| English (US) | en-US | Default for the platform. |
| English (UK) | en-GB | British accent; vocabulary handling is the same as en-US. |
| Spanish | es | Both European and Latin American callers handled cleanly. |
| French | fr | European French; Canadian French inflections supported. |
| German | de | Standard German. |
| Italian | it | Standard Italian. |
| Portuguese (BR) | pt-BR | Brazilian Portuguese. |
| Portuguese (PT) | pt-PT | European Portuguese. |
| Dutch | nl | Standard Dutch. |
| Polish | pl | Standard Polish. |
| Hungarian | hu | Standard Hungarian. |
| Romanian | ro | Standard Romanian. |
| Czech | cs | Standard Czech. |
| Slovak | sk | Standard Slovak. |
| Greek | el | Modern Greek. |
| Turkish | tr | Standard Turkish. |
| Japanese | ja | Standard Japanese. |
| Korean | ko | Standard Korean. |
| Mandarin | zh | Standard Mandarin (Simplified Chinese). |
| Hindi | hi | Standard Hindi. |
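If you keep agent settings in your own tooling, the codes in the table can be checked before they are saved. The following is a minimal Python sketch and not part of any Yaplet API: `SUPPORTED_LANGUAGES`, `DEFAULT_LANGUAGE` and `resolve_language` are hypothetical names, and only the codes and the en-US default are taken from the table above.

```python
from typing import Optional

# Codes copied from the language table; the mapping itself is a hypothetical helper.
SUPPORTED_LANGUAGES = {
    "en-US": "English (US)",
    "en-GB": "English (UK)",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt-BR": "Portuguese (BR)",
    "pt-PT": "Portuguese (PT)",
    "nl": "Dutch",
    "pl": "Polish",
    "hu": "Hungarian",
    "ro": "Romanian",
    "cs": "Czech",
    "sk": "Slovak",
    "el": "Greek",
    "tr": "Turkish",
    "ja": "Japanese",
    "ko": "Korean",
    "zh": "Mandarin",
    "hi": "Hindi",
}

# The table lists en-US as the platform default.
DEFAULT_LANGUAGE = "en-US"


def resolve_language(code: Optional[str]) -> str:
    """Return the canonical code from the table, or raise ValueError.

    Matching ignores case and surrounding whitespace; an empty or missing
    value falls back to the default.
    """
    if code is None or not code.strip():
        return DEFAULT_LANGUAGE
    wanted = code.strip().lower()
    for supported in SUPPORTED_LANGUAGES:
        if supported.lower() == wanted:
            return supported
    raise ValueError(f"Unsupported language code: {code!r}")
```

Calling `resolve_language("pt-br")` returns the canonical `pt-BR`, while a code that is not in the table, such as `sv`, raises `ValueError` instead of silently falling back to English.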
What Happens Mid-Call
A voice agent stays in its configured language for the whole call. If a caller switches to another language:
- The agent still replies in its configured language, but may slow down or simplify its wording.
- To serve several languages from one workspace, the simplest pattern is one agent per language, each with its own number, so the caller picks the language by the number they dial.
- Occasional code-switching (for example, a Spanish caller using an English brand name) is handled transparently, with no configuration needed.
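The one-number-per-language pattern can be pictured as a simple lookup from the dialed number to an agent. This is an illustrative Python sketch only: the `Agent` class, the `AGENTS_BY_NUMBER` table, the phone numbers and the agent names are all hypothetical, and it does not describe how Yaplet routes calls internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Agent:
    name: str
    language: str  # a code from the language table, e.g. "es"


# Hypothetical numbers and agents: one agent per language, one number per agent.
AGENTS_BY_NUMBER = {
    "+15550100": Agent("Support (English)", "en-US"),
    "+15550101": Agent("Soporte (Español)", "es"),
    "+15550102": Agent("Support (Deutsch)", "de"),
}


def agent_for_call(dialed_number: str) -> Optional[Agent]:
    """Pick the agent by the number the caller dialed; None if the number is unknown."""
    return AGENTS_BY_NUMBER.get(dialed_number)
```

Because the language is fixed by the number dialed, each agent stays in a single language for the whole call and no mid-call language detection is needed.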
ASR Fallback
Transcription uses Gemini Live's native ASR for the configured language. For non-English calls where Gemini's transcription is weaker, Yaplet falls back to Deepgram for the transcript only; the spoken side of the conversation is unaffected. The fallback is automatic and needs no configuration.