What Is a Token? (And Why It Matters for Understanding AI)
When you send a message to an AI tool, something happens before it does anything with your words. It breaks them apart. Not into letters, and not exactly into words either, but into chunks called tokens. That process happens invisibly and instantly, but it shapes everything about how the AI reads your input and generates a response.
Tokens are the basic unit of text that AI language models work with. Understanding what they are takes about two minutes. Understanding what they explain about AI behavior takes a little longer, and it's worth it.
A token is roughly equivalent to a word, but not exactly. Common short words like "the" or "and" are usually one token each. Longer or less common words often get split into two or more tokens. The word "tokenization" might be broken into "token" and "ization." A word in a language the model has less training data for might be broken into even smaller pieces. Punctuation, spaces, and line breaks count too, either as tokens of their own or as part of a neighboring token. As a rough rule of thumb, one token is approximately four characters of English text, or about 75 words per 100 tokens. The exact split depends on the tokenizer each model uses, so the same sentence can produce different token counts in different tools.
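To make the rule of thumb concrete, here is a minimal Python sketch that estimates token counts from character counts. It is only an approximation built on the four-characters-per-token and 75-words-per-100-tokens figures above. The function names are made up for illustration, and a real tokenizer will give different numbers.

```python
import math

# Rule-of-thumb figures from the text; real tokenizers vary by model.
CHARS_PER_TOKEN = 4
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    """Rough token estimate: about one token per four characters."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_words(token_count: int) -> int:
    """Rough word estimate: about 75 words per 100 tokens."""
    return round(token_count * WORDS_PER_TOKEN)


if __name__ == "__main__":
    prompt = "Summarize the attached report in three bullet points."
    print(estimate_tokens(prompt), "tokens (estimated)")
    print(estimate_words(100), "words in 100 tokens (estimated)")
```

An estimate like this is good enough for judging whether a document is short or long relative to a limit. For anything that depends on exact counts, use the tokenizer that belongs to the model in question.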
Why does this matter? Because AI language models don't process unlimited amounts of text at once. Every model has what's called a context window, which is the maximum number of tokens it can hold in its working memory at one time. That includes your input, any background instructions, and the response it's generating. When a conversation grows past that limit, something has to give. Typically the oldest text is dropped or cut short, and the model can no longer see it. In a long conversation, earlier parts of the exchange can effectively disappear from the model's awareness. This is why AI tools sometimes seem to forget what you told them at the start of a long session. They haven't malfunctioned. They've simply run out of room.
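The "running out of room" effect can be sketched in a few lines of Python. This is one simple strategy, dropping the oldest messages first, and it is an assumption for illustration. Real tools handle overflow in different ways, such as summarizing older messages. The function names and the four-characters-per-token estimate are likewise illustrative.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough estimate: about one token per four characters."""
    return math.ceil(len(text) / 4)


def fit_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages that fit within max_tokens.

    Walks backward from the newest message and stops at the first
    one that would push the total over the limit, so everything
    older than that point is dropped.
    """
    kept = []
    used = 0
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore oldest-to-newest order
    return kept


if __name__ == "__main__":
    conversation = [
        "My name is Dana and I work in finance.",
        "Please draft an email to my team about the budget.",
        "Make it shorter and more formal.",
    ]
    # With a tiny window, the first message (and the name in it) is lost.
    print(fit_to_window(conversation, max_tokens=25))
```

In this toy run the earliest message falls outside the window, which is the same thing that happens when a tool appears to forget details from the start of a session.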
Tokens also explain why AI tools aren't free to operate at scale. The cost of using a language model is typically calculated per token, counting both the ones coming in and the ones going out. When organizations think about deploying AI across their operations, token consumption is one of the main variables they're managing. A system that processes thousands of customer queries a day is processing millions of tokens, and that has a direct cost attached to it. Understanding tokens is part of understanding the economics of AI at scale.
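The arithmetic behind that claim is simple enough to sketch in Python. Every number below is a placeholder chosen for illustration, not a real price or a real workload. Actual rates vary by provider and model, and the function name is hypothetical.

```python
def daily_token_cost(
    queries_per_day: int,
    input_tokens_per_query: int,
    output_tokens_per_query: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate daily cost when input and output tokens are priced separately."""
    input_tokens = queries_per_day * input_tokens_per_query
    output_tokens = queries_per_day * output_tokens_per_query
    total = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Placeholder figures: 5,000 queries a day, 500 tokens in and
    # 300 tokens out per query, at made-up prices per million tokens.
    cost = daily_token_cost(5_000, 500, 300, 1.00, 2.00)
    print(f"Estimated daily cost: ${cost:.2f}")
```

With these placeholder figures, 5,000 queries add up to 4 million tokens a day, which is how a few thousand queries turn into millions of tokens. Trimming a few hundred tokens from each prompt scales the same way.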
There's a practical implication here for anyone using AI tools at work. The way you write your prompts affects how many tokens you use, and more importantly, it affects how much of the context window is available for the model to work with. Verbose instructions, long pasted documents, and extended back-and-forth conversations all consume tokens. Being concise isn't just good communication practice when working with AI. It's also more efficient in a technical sense.
None of this requires any technical background to act on. You don't need to count tokens or think about them consciously in most situations. But knowing they exist, knowing that AI models have a working memory limit measured in them, and knowing that the cost and behavior of AI systems are bound up in them gives you a more accurate mental model of what's actually happening when you use these tools. And a more accurate mental model tends to make you better at using them.