What Is a Token?
In AI, a token is the basic unit of text that a language model processes. Tokens can be whole words, parts of words, or punctuation marks. Understanding tokens is essential for managing AI costs, context window limits, and performance, as most AI services charge and measure capacity in tokens.
What Is a Token in AI?
A token is the smallest unit of text that an AI language model works with. When you type a message to an AI assistant, your text is not processed as whole sentences or even whole words. Instead, it is broken into tokens -- chunks that might be complete words, parts of words, or individual characters -- before the AI can understand and respond.
For example, the word "understanding" might be split into two tokens: "understand" and "ing." Common short words like "the" or "is" are typically single tokens. Unusual or technical words might be broken into several tokens. Numbers, punctuation marks, and spaces are also tokens.
This might seem like a technical detail, but tokens directly affect your AI costs, what fits in a conversation, and how fast the AI responds. Most AI services price their APIs per token, and context windows (the maximum amount of text the AI can handle at once) are measured in tokens.
How Tokenization Works
The process of converting text into tokens is called tokenization. Different AI models use different tokenization methods, but most modern models use a technique called Byte Pair Encoding (BPE) or a variant of it.
Here is what happens behind the scenes:
- Your text is received: "How can we improve customer retention?"
- The tokenizer splits it: ["How", " can", " we", " improve", " customer", " retention", "?"]
- Each token gets a number: [2437, 649, 584, 7417, 6130, 34689, 30]
- The model processes these numbers: The AI works entirely with these numerical representations
- The response is generated as tokens: The model outputs a sequence of token numbers that are converted back to readable text
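To make these steps concrete, here is a minimal sketch using OpenAI's open-source tiktoken library. The specific token IDs depend on which encoding a model uses, so the numbers you see will differ from the example above; the cl100k_base encoding below is just one choice.

```python
# pip install tiktoken
import tiktoken

# Load a BPE encoding; cl100k_base is used by several recent OpenAI
# models. Other models and providers use different encodings.
enc = tiktoken.get_encoding("cl100k_base")

text = "How can we improve customer retention?"

# Split the text into tokens and map each token to an integer ID
token_ids = enc.encode(text)
print(token_ids)                          # the numbers the model actually sees

# Show which piece of text each token ID represents
print([enc.decode([t]) for t in token_ids])

# Convert a sequence of token IDs back into readable text
print(enc.decode(token_ids))              # "How can we improve customer retention?"
```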
Practical token counts for common content:
- A short email (100 words): approximately 130-150 tokens
- A one-page business document (500 words): approximately 650-750 tokens
- A 10-page report (5,000 words): approximately 6,500-7,500 tokens
- A full-length book (80,000 words): approximately 100,000-110,000 tokens
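When an exact tokenizer is not at hand, the rule of thumb behind these figures (roughly 1.3 tokens per English word) is easy to apply; a minimal sketch:

```python
def estimate_tokens(word_count: int, tokens_per_word: float = 1.3) -> int:
    """Rough estimate; 1.3 tokens per word is a rule of thumb for English."""
    return round(word_count * tokens_per_word)

for label, words in [("short email", 100), ("one-page document", 500),
                     ("10-page report", 5_000), ("full-length book", 80_000)]:
    print(f"{label}: ~{estimate_tokens(words):,} tokens")
```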
Why Tokens Matter for Business
Cost Management
AI API pricing is almost always based on token usage. OpenAI, Anthropic, Google, and other providers charge per 1,000 or per million tokens processed. Understanding token counts helps you predict and control AI costs. A query that includes a 20-page document costs significantly more than a simple question because both the input tokens (your document) and output tokens (the AI's response) are counted.
Typical pricing examples:
- GPT-4o: approximately USD 2.50 per million input tokens
- Claude 3.5 Sonnet: approximately USD 3 per million input tokens
- Gemini 1.5 Pro: approximately USD 1.25 per million input tokens
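Because prices are quoted per million tokens (and change frequently, so treat the figures above as indicative), estimating a bill is simple arithmetic. A minimal sketch, with illustrative workload and price figures:

```python
def monthly_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate spend from per-million-token prices; check current provider pricing."""
    return (input_tokens / 1_000_000) * input_price_per_m + \
           (output_tokens / 1_000_000) * output_price_per_m

# Hypothetical workload: 2,000 queries a month, each sending a ~7,000-token
# document and receiving a ~500-token answer, priced at USD 2.50 per million
# input tokens and USD 10 per million output tokens (illustrative figures only).
total = monthly_cost_usd(2_000 * 7_000, 2_000 * 500, 2.50, 10.00)
print(f"Estimated monthly cost: ${total:,.2f}")   # about $45
```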
Context Window Planning
Every AI model has a maximum number of tokens it can process at once. If your business workflow involves analyzing long documents, the token count determines whether the entire document fits in a single AI interaction or needs to be broken into parts.
Response Quality
The AI's response also consumes tokens from the context window. If you fill most of the context window with input, there is less room for a detailed response. Balancing input length with the desired output length is a practical consideration for workflow design.
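A common pattern for context window planning is to split a long document into chunks by token count while reserving room for the prompt and the response. The sketch below uses tiktoken again; the window size, reserve, and overhead figures are illustrative assumptions, and a production version would split on sentence or section boundaries rather than raw token positions.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def chunk_by_tokens(text: str, context_window: int, reserved_output: int,
                    prompt_overhead: int = 200) -> list[str]:
    """Split text into pieces that leave room for instructions and the reply."""
    budget = context_window - reserved_output - prompt_overhead
    ids = enc.encode(text)
    # Naive split at fixed token positions; may cut mid-sentence
    return [enc.decode(ids[i:i + budget]) for i in range(0, len(ids), budget)]

# e.g. a 128,000-token window, keeping 4,000 tokens free for a detailed answer
long_document = " ".join(["Customer retention improved in Q3."] * 50_000)
chunks = chunk_by_tokens(long_document, context_window=128_000, reserved_output=4_000)
print(f"{len(chunks)} chunk(s)")
```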
Token Considerations for Southeast Asian Languages
An important consideration for ASEAN businesses is that tokenization efficiency varies significantly across languages. Most AI models were primarily trained on English text, which means their tokenizers are optimized for English. Southeast Asian languages often require more tokens to represent the same amount of content:
- English: Approximately 1-1.3 tokens per word
- Thai: Can require 2-4 tokens per word equivalent due to the Thai script and the lack of spaces between words
- Vietnamese: Approximately 1.5-2 tokens per word due to diacritical marks
- Bahasa Indonesia/Malay: Approximately 1.2-1.5 tokens per word (closer to English due to Latin script)
- Chinese characters: Typically 1-2 tokens per character
This means that processing Thai or Vietnamese text can cost 2-3 times more than the equivalent English text, and documents in these languages consume more of the context window. Factor this into your cost projections and workflow design.
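You can measure this effect directly by tokenizing the same sentence in several languages. The sketch below uses tiktoken's cl100k_base encoding; the translations are illustrative, and the exact ratios vary by tokenizer and text.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# Roughly the same question in four languages (illustrative translations)
samples = {
    "English":          "How can we improve customer retention?",
    "Thai":             "เราจะปรับปรุงการรักษาลูกค้าได้อย่างไร",
    "Vietnamese":       "Làm thế nào để cải thiện việc giữ chân khách hàng?",
    "Bahasa Indonesia": "Bagaimana kita dapat meningkatkan retensi pelanggan?",
}

for language, sentence in samples.items():
    print(f"{language}: {len(enc.encode(sentence))} tokens")
```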
Managing Token Usage Effectively
Practical strategies for business teams:
- Be concise in prompts: Clear, direct instructions use fewer tokens than verbose ones and often produce better results
- Summarize before analyzing: For long documents, consider using a first pass to create a summary, then analyze the summary for detailed questions
- Use retrieval systems: Rather than sending entire databases to the AI, use RAG to select only the relevant sections, dramatically reducing token usage
- Monitor usage: Most AI platforms provide dashboards showing token consumption, which should be reviewed regularly to avoid cost surprises
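Two of these strategies, capping output length and monitoring usage, can be wired directly into API calls. A minimal sketch with the OpenAI Python SDK follows; the model name and token cap are illustrative, and other providers expose equivalent parameters under different names.

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o",                      # illustrative model choice
    messages=[{"role": "user",
               "content": "Summarize our Q3 retention report in five bullet points."}],
    max_tokens=300,                      # cap the response length to control cost
)

# Most APIs return token usage with each response; log it for cost monitoring
usage = response.usage
print(f"input tokens: {usage.prompt_tokens}, output tokens: {usage.completion_tokens}")
```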
Tokens are the currency of AI -- they determine what your AI tools cost, how much information they can process at once, and how fast they respond. Understanding tokens enables business leaders to budget accurately for AI services, design efficient workflows, and avoid unexpected costs as AI usage scales across the organization.
Key Takeaways
- Monitor token usage from the start of any AI deployment, as costs can scale quickly when multiple team members or automated processes are making API calls simultaneously
- Account for the higher token costs of Southeast Asian languages like Thai and Vietnamese, which can require two to three times more tokens than English for the same content
- Design AI workflows that minimize unnecessary token consumption by using concise prompts, summarization techniques, and retrieval systems rather than sending entire documents for every query
Frequently Asked Questions
How can I check how many tokens my text uses?
Most AI platforms provide token counting tools. OpenAI offers a free tokenizer tool on their website where you can paste text and see the exact token count. Anthropic and Google provide similar utilities. As a quick estimate, English text averages about 1.3 tokens per word. For Southeast Asian languages, multiply by 1.5 to 3 depending on the language. Many AI API libraries also include token counting functions you can use programmatically.
Why do AI companies charge by tokens instead of by message?
Token-based pricing reflects the actual computational cost of processing AI requests. A simple one-sentence question requires far fewer computations than analyzing a 50-page document, even though both are single messages. Token pricing ensures that users pay proportionally to the resources consumed. For businesses, this model can actually be advantageous because you only pay for what you use rather than a flat rate that might not match your actual usage patterns.
Can we reduce token usage without hurting response quality?
Yes, several practical strategies can significantly reduce token usage. Write concise, clear prompts instead of lengthy instructions. Use retrieval-augmented generation to select only relevant document sections rather than sending entire files. Implement caching so that repeated identical queries do not consume additional tokens. Set maximum output length limits to prevent unnecessarily long responses. These optimizations can reduce token usage by 30 to 60 percent without meaningfully reducing quality.
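As a concrete illustration of the caching point, here is a minimal in-memory sketch; a production system would more likely use a shared store such as Redis, and the call_model function here stands in for whatever actually calls your AI provider.

```python
import hashlib

_cache: dict[str, str] = {}

def cached_completion(prompt: str, call_model) -> str:
    """Serve repeated identical prompts from a cache instead of re-spending tokens."""
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = call_model(prompt)   # tokens are only consumed on a cache miss
    return _cache[key]
```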
Need help managing token costs?
Pertama Partners helps businesses across Southeast Asia adopt AI strategically. Let's discuss how token management fits into your AI roadmap.