Token Usage Policy

What is a Token in the Context of ModelXpert AI Platform?

A token is a unit of text that AI models use to process information for both input (when you send a prompt) and output (when the model generates a response). A token can be a word, part of a word, or a few characters.

Each AI model (ChatGPT, Gemini, Claude, Perplexity, DeepSeek, Grok) uses its own tokenization method, so exact token counts can vary slightly between models.

Token Estimation

Although it depends on the specific model, for simplicity:

  • Each token is approximately ¾ of a word
  • 750 words equal approximately 1,000 tokens
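The ¾-word rule above can be turned into a quick estimator. This is only the rough heuristic described here, not the actual tokenizer any provider uses; the function name is illustrative.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: 1 token ~ 0.75 words, so tokens ~ words * 4/3."""
    words = len(text.split())
    return round(words * 4 / 3)

# 750 words -> approximately 1,000 tokens
print(estimate_tokens(" ".join(["word"] * 750)))  # 1000
```

Real token counts depend on the model's tokenizer; treat this as a budgeting aid only.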

Token Usage for Different Features

Thinking Models:

When a thinking model is used, tokens are consumed during the AI's reasoning process. The number of tokens depends on:

  • The complexity of the task
  • The reasoning level selected (low, medium, or high)

Image Generation:

Tokens are also counted when generating images.


How Are Tokens Counted?

Input Tokens

When you send a prompt, tokens are counted based on the number of words and characters it contains:

  • Approximately 1 token = ¾ of a word
  • For example, 75 words = approximately 100 tokens

Output Tokens

For Non-Reasoning Models:

  • Tokens are counted the same way as input tokens
  • Approximately 1 token = ¾ of a word
  • For example, a 300-word response = approximately 400 tokens

For Thinking/Reasoning Models:

  • Output tokens include thinking tokens + response tokens
  • Example: 1,000 thinking tokens + 400 response tokens = 1,400 total output tokens

Practical Example

Scenario 1: Using GPT-5 Nano (Non-Reasoning Model)

Your Prompt: "How does ChatGPT generate responses?"

Input tokens counted: 7 tokens

ChatGPT Response: "ChatGPT generates responses using a type of artificial intelligence... refined by humans to give meaningful and safe answers — not just random text." (338 words)

Output tokens counted: 441 tokens

Scenario 2: Using a GPT-5 Reasoning Model

Same Prompt: "How does ChatGPT generate responses?"

  • Input tokens: 7 tokens (same as before)
  • Response tokens: 441 tokens (same as before)
  • Thinking tokens: 1,000 tokens in this example (varies with task complexity)
  • Total output tokens: 1,441 tokens
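The two scenarios can be reproduced with a few lines. The numbers are taken from the example above; the helper function is illustrative, and the flat 1,000 thinking tokens is the example figure, not a guarantee.

```python
def output_tokens(response_tokens: int, thinking_tokens: int = 0) -> int:
    """Billed output tokens: response tokens, plus thinking tokens for reasoning models."""
    return response_tokens + thinking_tokens

# Scenario 1: GPT-5 Nano (non-reasoning), 441-token response
print(output_tokens(441))                        # 441
# Scenario 2: GPT-5 reasoning model, same prompt and response
print(output_tokens(441, thinking_tokens=1000))  # 1441
```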

Chat History and Context

When you continue chatting in the same conversation, the AI remembers previous context. Here's how tokens are counted:

Second Prompt in Same Chat: "What are the limitations of ChatGPT's knowledge and reasoning?"

The system counts:

  • Previous chat history as input: 7 (previous input) + 441 (previous output) = 448 tokens
    Note: Previous thinking tokens are NOT included in history
  • Current prompt: 12 tokens
  • Total input tokens: 448 + 12 = 460 tokens
  • Output tokens: Counted based on the new response generated

Important Notes:

  • As you continue chatting in the same conversation, input tokens increase with each prompt because the entire chat history is sent to maintain context
  • This is why we provide 6 million input tokens — which is sufficient for extensive conversations
  • When you start a fresh chat, no history is sent, so input token counting starts from zero
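The history accounting above can be sketched as a running total. This is a simplified model of the rule described here (thinking tokens excluded from history); the function name and data layout are illustrative.

```python
def input_tokens_for_turn(history: list[tuple[int, int]], prompt_tokens: int) -> int:
    """Input tokens for a new turn = all previous (input, output) tokens + current prompt.

    history holds one (input_tokens, output_tokens) pair per previous turn.
    Thinking tokens are never added to history, per the policy above.
    """
    history_tokens = sum(inp + out for inp, out in history)
    return history_tokens + prompt_tokens

# First turn in a fresh chat: no history is sent
print(input_tokens_for_turn([], 7))           # 7
# Second turn: previous turn was 7 in / 441 out, new prompt is 12 tokens
print(input_tokens_for_turn([(7, 441)], 12))  # 460
```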

How Are Tokens Counted on ModelXpert for Standard & Premium Models?

Standard Models

Token usage for Standard models is counted as 1x — the same as the actual tokens consumed by the model for both input and output.

Exception: Perplexity models are counted at 2.5x the actual token usage.

Premium Models

Token usage for Premium models is counted as 4x the actual tokens consumed for both input and output.

Exception: Perplexity Sonar Pro is counted at 6x the actual token usage.

Detailed Token Counting by Model

Note: All models share the same base allocation of 8 Million total tokens (6 Million Input + 2 Million Output). The rate multiplier determines how much of this allocation you can actually use with each model.
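The "Effective Usage" figures follow from dividing the base allocation by the rate multiplier. A minimal sketch of that arithmetic (the 6x rows round to the approximate figures shown):

```python
BASE_INPUT, BASE_OUTPUT = 6_000_000, 2_000_000  # shared base allocation

def effective_allocation(rate: float) -> tuple[int, int, int]:
    """Effective (input, output, total) tokens usable at a given rate multiplier."""
    inp = int(BASE_INPUT / rate)
    out = int(BASE_OUTPUT / rate)
    return inp, out, inp + out

print(effective_allocation(1))    # (6000000, 2000000, 8000000)
print(effective_allocation(2.5))  # (2400000, 800000, 3200000)
print(effective_allocation(4))    # (1500000, 500000, 2000000)
```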

Category: Standard (1x rate)
  Models: Gemini-2.5-Flash, Gemini-2.5-Flash-Lite, GPT-5 Nano, GPT-5 Mini, Claude Haiku 3, Grok-4.1 Fast Reasoning, Grok-4.1 Fast Non-Reasoning, Grok-4 Fast Reasoning, Grok-4 Fast Non-Reasoning, Grok-3 Mini, Deepseek Chat, Deepseek Reasoner
  Effective Usage: 8 Million Total (6 Million Input + 2 Million Output)

Category: Standard (2.5x rate)
  Models: Perplexity Sonar, Perplexity Sonar Reasoning
  Effective Usage: 3.2 Million Total (2.4 Million Input + 800,000 Output)

Category: Premium (4x rate)
  Models: Gemini-2.5-Pro, Gemini-3 Low Reasoning, Gemini-3 High Reasoning, GPT-5.1 High Reasoning, GPT-5.1 Medium Reasoning, GPT-5.1 Low Reasoning, Claude Sonnet 4.5, Claude Sonnet 4, Sonar Reasoning Pro, Grok-4
  Effective Usage: 2 Million Total (1.5 Million Input + 500,000 Output)

Category: Premium (6x rate)
  Models: Perplexity Sonar Pro
  Effective Usage: 1,333,000 Total (1 Million Input + 333,000 Output)

Category: Image Generation (6x rate)
  Models: Gemini 2.5 Flash Image (Nano Banana)
  Effective Usage: 338,000 Total (counted as Output tokens only)

Note: Each image generation consumes approximately 1,300 actual tokens.
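A rough back-of-envelope estimate of how many images the output allocation covers, assuming the approximate 1,300-token-per-image figure above; the exact count per image will vary.

```python
ACTUAL_TOKENS_PER_IMAGE = 1_300  # approximate actual tokens per generated image
RATE = 6                         # image-generation rate multiplier
OUTPUT_ALLOCATION = 2_000_000    # base output token allocation

# Counted tokens per image = actual tokens x rate multiplier
counted_per_image = ACTUAL_TOKENS_PER_IMAGE * RATE   # 7,800
# Approximate number of images the output allocation covers
print(OUTPUT_ALLOCATION // counted_per_image)        # 256
```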


Reasoning and Non-Reasoning Models

It helps to know which models are reasoning models and which are not, because reasoning models consume more tokens: the AI uses additional tokens for thinking while processing complex tasks.

ChatGPT
  Reasoning: GPT-5.1 High Reasoning, GPT-5.1 Medium Reasoning, GPT-5.1 Low Reasoning
  Non-Reasoning: GPT-5 Nano, GPT-5 Mini

Gemini
  Reasoning: Gemini-2.5-Pro, Gemini-3 Low Reasoning, Gemini-3 High Reasoning
  Non-Reasoning: Gemini-2.5-Flash, Gemini-2.5-Flash-Lite

Claude
  Reasoning: Claude Sonnet 4.5, Claude Sonnet 4
  Non-Reasoning: Claude Haiku 3

Grok
  Reasoning: Grok-4, Grok-4.1 Fast Reasoning, Grok-4 Fast Reasoning, Grok-3 Mini
  Non-Reasoning: Grok-4.1 Fast Non-Reasoning, Grok-4 Fast Non-Reasoning

Perplexity
  Reasoning: Perplexity Sonar Reasoning, Perplexity Sonar Reasoning Pro
  Non-Reasoning: Perplexity Sonar, Perplexity Sonar Pro

Deepseek
  Reasoning: Deepseek Reasoner
  Non-Reasoning: Deepseek Chat