Voice and Language

Yaplet voice agents speak 17 languages with 30 different Gemini voices. This page explains how voices are grouped, how to preview them, which language to pick, and what happens when a caller switches language mid-conversation.

The 30 Voices

Voices are grouped into families based on tone and character. Within each family you'll find masculine, feminine and neutral options. Every voice can speak every supported language — picking a voice does not lock you into a specific language.

Bright & energetic

Warm, upbeat voices that work well for sales agents, retail support and consumer brands.

Calm & professional

Neutral, even-toned voices for B2B receptionists, financial services and healthcare.

Soft & friendly

Gentler voices that work well for hospitality, after-hours support and wellness brands.

Crisp & assertive

Direct, slightly faster voices for technical support and high-volume call centres.

Mature & authoritative

Lower-pitched voices for legal, public-sector or formal corporate contexts.

Casual & conversational

Relaxed voices for consumer support, food & beverage, or anything that should feel like a friend on the phone.

Playful & expressive

Lighter, more emotive voices for kid-facing brands, entertainment and gaming.

The voice picker shows the family group and gender for every voice so you can filter quickly. You can play a sample before saving — sampling does not affect your billing.

Picking a Voice

Three guidelines that work for most teams:

  1. Match the brand tone, not the country. A premium watch brand keeps the same voice in English and German; it would change voice only if its positioning moved from premium to friendly.
  2. Listen to the actual greeting, not just a generic sample. The voice picker plays the agent's begin message in the selected voice and language, so you can hear exactly what your callers will hear.
  3. Run a friend test. Send the greeting recording to two people outside your company. If both say "that's clearly a robot", try a different voice family. If both say "that's a normal call", you're done.

The 17 Languages

Language          Code    Notes
English (US)      en-US   Default for the platform.
English (UK)      en-GB   British accent, same vocabulary handling.
Spanish           es      Both European and Latin American callers handled cleanly.
French            fr      European French; Canadian French inflections supported.
German            de      Standard German.
Italian           it      Standard Italian.
Portuguese (BR)   pt-BR   Brazilian Portuguese.
Portuguese (PT)   pt-PT   European Portuguese.
Dutch             nl      Standard Dutch.
Polish            pl      Standard Polish.
Hungarian         hu      Standard Hungarian.
Romanian          ro      Standard Romanian.
Czech             cs      Standard Czech.
Slovak            sk      Standard Slovak.
Greek             el      Modern Greek.
Turkish           tr      Standard Turkish.
Japanese          ja      Standard Japanese.
Korean            ko      Standard Korean.
Mandarin          zh      Simplified Mandarin.
Hindi             hi      Standard Hindi.

If you serve multiple language markets, create one voice agent per language rather than asking a single agent to switch. Each agent has its own greeting pre-generated for instant pickup — switching language mid-call costs a noticeable beat of silence.
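
One way to plan the one-agent-per-language setup is to write it down as data before creating anything in the dashboard. The sketch below is only a planning aid: `AgentConfig`, `build_agents`, the brand name and the voice name `example-voice` are hypothetical and are not part of any Yaplet API. The language codes are copied from the table above.

```python
from __future__ import annotations

from dataclasses import dataclass

# Language codes copied from the table above.
SUPPORTED_LANGUAGES = {
    "en-US", "en-GB", "es", "fr", "de", "it", "pt-BR", "pt-PT", "nl", "pl",
    "hu", "ro", "cs", "sk", "el", "tr", "ja", "ko", "zh", "hi",
}


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical planning record, not a Yaplet API object."""
    name: str
    language: str
    voice: str
    greeting: str


def build_agents(brand: str, voice: str, greetings: dict[str, str]) -> list[AgentConfig]:
    """Return one agent config per language, all sharing the same voice."""
    unknown = sorted(set(greetings) - SUPPORTED_LANGUAGES)
    if unknown:
        raise ValueError(f"unsupported language codes: {unknown}")
    return [
        AgentConfig(name=f"{brand}-{code}", language=code, voice=voice, greeting=text)
        for code, text in sorted(greetings.items())
    ]


agents = build_agents(
    "acme",
    "example-voice",  # placeholder; pick a real voice in the voice picker
    {"en-US": "Thanks for calling Acme.", "de": "Danke für Ihren Anruf bei Acme."},
)
for agent in agents:
    print(agent.name, agent.language, agent.voice)
```

Keeping the voice constant across languages follows the "match the brand tone, not the country" guideline, and each entry carries its own greeting because greetings are generated per agent.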

What Happens Mid-Call

A voice agent stays in its configured language for the whole call. If a caller switches to another language:

  • The agent will still reply in its configured language, but may slow down or simplify wording.
  • If a workspace has multiple agents in different languages, the simplest pattern is to give each agent a separate phone number and let callers dial the number for their language.
  • For occasional code-switching ("Spanish caller throws in an English brand name") the agent handles it transparently — no configuration needed.
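
The separate-number pattern amounts to a lookup from the dialed number to a fixed agent. This is a minimal sketch of that idea for teams documenting their own setup. The phone numbers, agent names and the `route_call` function are hypothetical; Yaplet handles the actual number assignment for you.

```python
from __future__ import annotations

# Hypothetical mapping from dialed number to (agent name, language code).
NUMBER_TO_AGENT: dict[str, tuple[str, str]] = {
    "+15550100": ("acme-en-US", "en-US"),
    "+34555010": ("acme-es", "es"),
}


def route_call(dialed_number: str) -> tuple[str, str]:
    """Pick the agent for a call from the number the caller dialed.

    The language is fixed by the number, so nothing changes mid-call.
    """
    try:
        return NUMBER_TO_AGENT[dialed_number]
    except KeyError:
        raise LookupError(f"no agent configured for {dialed_number}") from None


agent_name, language = route_call("+34555010")
print(agent_name, language)
```

Because the language is decided before the call is answered, the agent never has to detect or switch language during the conversation.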

ASR Fallback

Transcription uses Gemini Live's native automatic speech recognition (ASR) for the configured language. For non-English calls where Gemini's transcription is weaker, Yaplet automatically falls back to Deepgram for the transcript only; the spoken side of the conversation is unaffected. There is nothing to configure.