AI Models

Configure which AI models your organization uses, understand the two-model system (base and low), compare provider pricing, and learn how AI token usage is tracked and billed across all Yaplet features.

The Two-Model System

Base Model

Handles complex, resource-intensive tasks that require deep reasoning and longer outputs. Used for chatbot answer generation, AI suggestions, and content creation.

Low Model

Handles fast, lightweight tasks that need quick turnaround. Used for grammar correction, message rephrasing, conversation summaries, and supporting operations that make the base model more efficient.

Which features use which model?

| Feature | Model | Examples |
| --- | --- | --- |
| Vex chatbot streaming answers | Base | Generating real-time responses to visitor questions |
| AI content generation | Base | Writing knowledge base articles, generating suggestions |
| Inbox AI Tools | Low | Grammar check, rephrase, translate |
| Conversation summaries | Low | Summarizing chat history for context |
| Chatbot context retrieval | Low | Searching knowledge base for relevant answers |

The Generate Modal — available across several AI features — lets you temporarily override the model for a single request. This is useful when you want to test a more powerful model for a specific task without changing your organization's defaults.

Configuring Your Models

Navigate to Settings > AI Models to view and change your organization's model configuration.

Current Models

The settings page displays your active base and low models, including their provider name and per-million-token pricing for all cost types:

  • Input — cost per million input (prompt) tokens
  • Output — cost per million output (response) tokens
  • Cached Input — discounted rate for cached prompt tokens
  • Reasoning — cost for advanced reasoning tokens (used by models like OpenAI o-series or Claude with extended thinking)
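
To see how these four cost types combine, here is a minimal sketch of the arithmetic for a single request. The prices and token counts are hypothetical; your actual per-million rates are shown on the settings page.

```python
# Hypothetical per-million-token prices in USD -- real values appear on the
# Settings > AI Models page and differ by model and provider.
PRICING = {
    "input": 2.50,
    "output": 10.00,
    "cached_input": 1.25,
    "reasoning": 10.00,
}

def request_cost(usage: dict, pricing: dict) -> float:
    """Dollar cost of one request: (tokens / 1M) * per-million price, per type."""
    return sum(tokens / 1_000_000 * pricing[kind] for kind, tokens in usage.items())

# Example: 1,200 prompt tokens (400 served from cache) and a 300-token response.
usage = {"input": 800, "cached_input": 400, "output": 300}
print(round(request_cost(usage, PRICING), 6))  # -> 0.0055 (USD)
```

Note how the cached portion of the prompt is billed at the discounted cached-input rate rather than the full input rate.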

Each model shows a badge indicating whether it's the Yaplet Default or a Custom selection.

Changing Models

Select new models

Use the dropdown menus to pick a new base model, low model, or both. Models are grouped by provider (OpenAI, Anthropic, Google, xAI) with pricing shown for each option.

Review the cost comparison

When you select a different model, a comparison table appears showing current vs. selected pricing for each cost type. Increases are highlighted in red, decreases in green — so you can see the cost impact at a glance.
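
The comparison the table performs is a simple per-type percentage change. The sketch below illustrates the idea with hypothetical pricing for a current and a selected model; it is not Yaplet's actual implementation.

```python
# Hypothetical per-million-token prices (USD) for two models.
current = {"input": 2.50, "output": 10.00, "cached_input": 1.25, "reasoning": 10.00}
selected = {"input": 1.10, "output": 4.40, "cached_input": 0.55, "reasoning": 4.40}

for cost_type in current:
    old, new = current[cost_type], selected[cost_type]
    change = (new - old) / old * 100  # positive = increase (red), negative = decrease (green)
    print(f"{cost_type:13s} ${old:>6.2f} -> ${new:>6.2f}  ({change:+.0f}%)")
```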

Confirm and apply

Click Update Models and confirm the change. The new models take effect immediately for all AI tasks across your organization.

If you leave a model set to Use Yaplet's Default, your organization will automatically benefit from model upgrades whenever Yaplet updates its recommended models — no action needed on your part.

Supported Providers

Yaplet integrates with leading AI providers through a unified interface. You don't need API keys or separate accounts — all provider access is managed through your Yaplet subscription.

| Provider | Strengths |
| --- | --- |
| OpenAI | Broad model range, strong general-purpose performance |
| Anthropic | Advanced reasoning, large context windows |
| Google | Competitive pricing, multimodal capabilities |
| xAI | Fast inference, latest research models |

The available models and pricing are managed by Yaplet and updated regularly as new models are released. Check the settings page for the current selection.

Token Usage & Billing

Every AI operation consumes tokens — small units of text that models process. Yaplet tracks usage across all features and bills using a unified currency called Yaplet Tokens.

How token billing works

Tokens are consumed

Each AI request uses input tokens (your prompt) and output tokens (the model's response). Some models also use cached input tokens and reasoning tokens.

Costs are normalized

Because different token types have different prices, all usage is converted to input-token equivalents using the model's pricing ratios. This gives you a single, comparable number.

Billed as Yaplet Tokens

The normalized usage is converted to Yaplet Tokens at a rate of $0.10 per million tokens. This unified pricing makes it easy to understand costs regardless of which model or provider you use.
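
The three steps above can be sketched as follows. The model pricing is hypothetical, and the exact conversion formula is an assumption inferred from the description (normalize by pricing ratios, then price the result in Yaplet Tokens at $0.10 per million); only the $0.10 rate comes from this page.

```python
YAPLET_TOKEN_PRICE_PER_M = 0.10  # USD per million Yaplet Tokens (stated rate)

# Hypothetical model pricing (USD per million tokens).
pricing = {"input": 2.00, "output": 8.00, "cached_input": 1.00, "reasoning": 8.00}

def to_input_equivalents(usage: dict, pricing: dict) -> float:
    """Step 2: convert every token type to input-token equivalents
    using the model's pricing ratios (assumed formula)."""
    return sum(t * pricing[k] / pricing["input"] for k, t in usage.items())

def to_yaplet_tokens(usage: dict, pricing: dict) -> float:
    """Step 3: price the normalized usage in USD, then express it in
    Yaplet Tokens at $0.10 per million."""
    cost_usd = to_input_equivalents(usage, pricing) / 1_000_000 * pricing["input"]
    return cost_usd / YAPLET_TOKEN_PRICE_PER_M * 1_000_000

usage = {"input": 1_000, "output": 250}
# input-equivalents: 1,000 + 250 * (8 / 2) = 2,000 tokens
# cost: 2,000 / 1M * $2.00 = $0.004 -> 40,000 Yaplet Tokens
print(to_yaplet_tokens(usage, pricing))
```

Because everything ends up in one currency, a request on an expensive model simply consumes proportionally more Yaplet Tokens than the same request on a cheaper one.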

Token cost types explained

| Token Type | What It Is | When It Applies |
| --- | --- | --- |
| Input | Tokens in your prompt and context | Every request |
| Output | Tokens the model generates | Every request |
| Cached Input | Repeated prompt tokens served from cache (lower cost) | When conversation context is reused |
| Reasoning | Internal thinking tokens used for complex problem-solving | Models with reasoning capabilities (e.g., OpenAI o-series) |

You can track your organization's AI token usage over time in the Reports dashboard. For billing details, see Billing & Payments.

Free trial allowance

Trial accounts include 5 million free AI tokens to explore all AI features. Once the trial limit is reached, you'll need an active subscription to continue using AI-powered features. See Plans & Pricing for details.

Where AI Is Used in Yaplet

AI models power features across the entire platform. Here's a quick overview of where your model choices have an impact:

Vex AI Chatbot

Your base model generates real-time streamed answers to visitor questions, using your knowledge base, products, and API tools as context.

Inbox AI Tools

Your low model powers grammar checking, rephrasing, and other message enhancement tools available to agents in the inbox.

AI Responses

Track how your chatbot performs, monitor response quality, and identify areas where your AI context can be improved.

Reports

View daily AI token consumption broken down by model, and monitor usage trends over time.

Tips for Choosing Models