AI Models

Configure which AI models your organization uses, understand the two-model system (base and low), compare provider pricing, and learn how AI token usage is tracked and billed across all Yaplet features.

The Two-Model System

Base Model

Handles complex, resource-intensive tasks that require deep reasoning and longer outputs. Used for chatbot answer generation, AI suggestions, and content creation.

Low Model

Handles fast, lightweight tasks that need quick turnaround. Used for grammar correction, message rephrasing, conversation summaries, and supporting operations that make the base model more efficient.

Which features use which model?

| Feature | Model | Examples |
| --- | --- | --- |
| Vex chatbot streaming answers | Base | Generating real-time responses to visitor questions |
| AI content generation | Base | Writing knowledge base articles, generating suggestions |
| Inbox AI Tools | Low | Grammar check, rephrase, translate |
| Conversation summaries | Low | Summarizing chat history for context |
| Chatbot context retrieval | Low | Searching knowledge base for relevant answers |

The Generate Modal — available across several AI features — lets you temporarily override the model for a single request. This is useful when you want to test a more powerful model for a specific task without changing your organization's defaults.

Configuring Your Models

Navigate to Settings > AI Models to view and change your organization's model configuration.

Current Models

The settings page displays your active base and low models, including their provider name and per-million-token pricing for all cost types:

  • Input — cost per million input (prompt) tokens
  • Output — cost per million output (response) tokens
  • Cached Input — discounted rate for cached prompt tokens
  • Reasoning — cost for advanced reasoning tokens (used by models like OpenAI o-series or Claude with extended thinking)
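
To see how these four cost types combine, here is a minimal sketch of the arithmetic for a single request. The prices and token counts are hypothetical; your actual per-million rates are shown on the settings page.

```python
# Hypothetical per-million-token prices in USD -- real values appear on the
# Settings > AI Models page and differ by model and provider.
PRICING = {
    "input": 2.50,
    "output": 10.00,
    "cached_input": 1.25,
    "reasoning": 10.00,
}

def request_cost(usage: dict, pricing: dict) -> float:
    """Dollar cost of one request: (tokens / 1M) * per-million price, per type."""
    return sum(tokens / 1_000_000 * pricing[kind] for kind, tokens in usage.items())

# Example: 1,200 prompt tokens (400 served from cache) and a 300-token response.
usage = {"input": 800, "cached_input": 400, "output": 300}
print(round(request_cost(usage, PRICING), 6))  # -> 0.0055 (USD)
```

Note how the cached portion of the prompt is billed at the discounted cached-input rate rather than the full input rate.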

Each model shows a badge indicating whether it's the Yaplet Default or a Custom selection.

Changing Models

Select new models

Use the dropdown menus to pick a new base model, low model, or both. Models are grouped by provider (OpenAI, Anthropic, Google, xAI) with pricing shown for each option.

Review the cost comparison

When you select a different model, a comparison table appears showing current vs. selected pricing for each cost type. Increases are highlighted in red, decreases in green — so you can see the cost impact at a glance.
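
The comparison the table performs is a simple per-type percentage change. The sketch below illustrates the idea with hypothetical pricing for a current and a selected model; it is not Yaplet's actual implementation.

```python
# Hypothetical per-million-token prices (USD) for two models.
current = {"input": 2.50, "output": 10.00, "cached_input": 1.25, "reasoning": 10.00}
selected = {"input": 1.10, "output": 4.40, "cached_input": 0.55, "reasoning": 4.40}

for cost_type in current:
    old, new = current[cost_type], selected[cost_type]
    change = (new - old) / old * 100  # positive = increase (red), negative = decrease (green)
    print(f"{cost_type:13s} ${old:>6.2f} -> ${new:>6.2f}  ({change:+.0f}%)")
```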

Confirm and apply

Click Update Models and confirm the change. The new models take effect immediately for all AI tasks across your organization.

If you leave a model set to Use Yaplet's Default, your organization will automatically benefit from model upgrades whenever Yaplet updates its recommended models — no action needed on your part.

Supported Providers

Yaplet integrates with leading AI providers through a unified interface. You don't need API keys or separate accounts — all provider access is managed through your Yaplet subscription.

| Provider | Strengths |
| --- | --- |
| OpenAI | Broad model range, strong general-purpose performance |
| Anthropic | Advanced reasoning, large context windows |
| Google | Competitive pricing, multimodal capabilities |
| xAI | Fast inference, latest research models |

The available models and pricing are managed by Yaplet and updated regularly as new models are released. Check the settings page for the current selection.

Token Usage & Billing

Every AI operation consumes tokens — small units of text that models process. Yaplet tracks usage across all features and bills using a unified currency called Yaplet Tokens.

How token billing works

Tokens are consumed

Each AI request uses input tokens (your prompt) and output tokens (the model's response). Some models also use cached input tokens and reasoning tokens.

Costs are normalized

Because different token types have different prices, all usage is converted to input-token equivalents using the model's pricing ratios. This gives you a single, comparable number.

Billed as Yaplet Tokens

The normalized usage is converted to Yaplet Tokens at a rate of $0.10 per million tokens. This unified pricing makes it easy to understand costs regardless of which model or provider you use.
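
The three steps above can be sketched as follows. The model pricing is hypothetical, and the exact conversion formula is an assumption inferred from the description (normalize by pricing ratios, then price the result in Yaplet Tokens at $0.10 per million); only the $0.10 rate comes from this page.

```python
YAPLET_TOKEN_PRICE_PER_M = 0.10  # USD per million Yaplet Tokens (stated rate)

# Hypothetical model pricing (USD per million tokens).
pricing = {"input": 2.00, "output": 8.00, "cached_input": 1.00, "reasoning": 8.00}

def to_input_equivalents(usage: dict, pricing: dict) -> float:
    """Step 2: convert every token type to input-token equivalents
    using the model's pricing ratios (assumed formula)."""
    return sum(t * pricing[k] / pricing["input"] for k, t in usage.items())

def to_yaplet_tokens(usage: dict, pricing: dict) -> float:
    """Step 3: price the normalized usage in USD, then express it in
    Yaplet Tokens at $0.10 per million."""
    cost_usd = to_input_equivalents(usage, pricing) / 1_000_000 * pricing["input"]
    return cost_usd / YAPLET_TOKEN_PRICE_PER_M * 1_000_000

usage = {"input": 1_000, "output": 250}
# input-equivalents: 1,000 + 250 * (8 / 2) = 2,000 tokens
# cost: 2,000 / 1M * $2.00 = $0.004 -> 40,000 Yaplet Tokens
print(to_yaplet_tokens(usage, pricing))
```

Because everything ends up in one currency, a request on an expensive model simply consumes proportionally more Yaplet Tokens than the same request on a cheaper one.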

Token cost types explained

| Token Type | What It Is | When It Applies |
| --- | --- | --- |
| Input | Tokens in your prompt and context | Every request |
| Output | Tokens the model generates | Every request |
| Cached Input | Repeated prompt tokens served from cache (lower cost) | When conversation context is reused |
| Reasoning | Internal thinking tokens used for complex problem-solving | Models with reasoning capabilities (e.g., OpenAI o-series) |

You can track your organization's AI token usage over time in the Reports dashboard. For billing details, see Billing & Payments.

Free trial allowance

Trial accounts include 5 million free AI tokens to explore all AI features. Once the trial limit is reached, you'll need an active subscription to continue using AI-powered features. See Plans & Pricing for details.

Where AI Is Used in Yaplet

AI models power features across the entire platform. Here's a quick overview of where your model choices have an impact:

Vex AI Chatbot

Your base model generates real-time streamed answers to visitor questions, using your knowledge base, products, and API tools as context.

Inbox AI Tools

Your low model powers grammar checking, rephrasing, and other message enhancement tools available to agents in the inbox.

AI Responses

Track how your chatbot performs, monitor response quality, and identify areas where your AI context can be improved.

Reports

View daily AI token consumption broken down by model, and monitor usage trends over time.

Tips for Choosing Models