What is a token — and why Russian costs more than English
Here's a surprising thing. Take the same sentence in two languages, English and Russian. The Russian version costs more to run through a model. Same meaning, same idea, yet it eats more money and more room.
Sounds odd, right? But in a couple of minutes you'll see why. And along the way, why your long prompt sometimes cuts off mid-sentence.
The culprit is this little chunk: the token.
A model doesn't read in words
You read in words. A model doesn't.
Before it understands anything, it chops your text into small pieces. Those pieces are tokens.
A token isn't a letter, and it isn't always a whole word. More often it's part of a word. A model might split "programming" into "program" + "ming", for example; the exact split depends on the model. A short, common word like "the" or "cat" usually goes in as a single piece.
So a token is just the unit a model uses to measure text. Like a centimeter for length.
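To make the chopping concrete, here is a toy sketch. It assumes a tiny made-up vocabulary (TOY_VOCAB) and a simple "grab the longest known piece" rule. Real tokenizers learn tens of thousands of pieces from data and use more elaborate rules, so treat this as an illustration, not how any particular model works.

```python
# Toy vocabulary, invented for this example only.
TOY_VOCAB = {"program", "ming", "the", "cat"}


def toy_tokenize(word: str, vocab: set[str]) -> list[str]:
    """Split a word greedily into the longest pieces found in vocab."""
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest candidate first, then shrink it.
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # Nothing matched: fall back to a single-character piece.
            pieces.append(word[start])
            start += 1
    return pieces


print(toy_tokenize("programming", TOY_VOCAB))  # ['program', 'ming']
print(toy_tokenize("cat", TOY_VOCAB))          # ['cat']
print(toy_tokenize("кот", TOY_VOCAB))          # ['к', 'о', 'т']
```

Notice the last line: the toy vocabulary has no Cyrillic pieces, so the Russian word for "cat" falls apart into single letters. That is the effect the next section is about.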
Why Russian costs more
Now, that thing about price.
The set of pieces was learned mostly from English text. So English words break up cleanly and in big chunks: a whole word is often one or two pieces.
Russian fares worse. The tokenizer has seen far less Cyrillic, so it has fewer large ready-made pieces for it and crumbles Russian words into small ones.
Roughly (exact figures vary from model to model):
- in English, one token is about 4 characters, around ¾ of a word;
- in Russian, often half that — around 2 characters. A single word easily breaks into 3–4 pieces.
The takeaway is simple: the same idea in Russian is more tokens. Which means more expensive and "heavier" for the model.
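Here is a back-of-the-envelope sketch of that gap. It uses only the rough ratios from the list above (4 characters per token for English, 2 for Russian), which are assumptions, not measurements. The two sentences are made-up examples that mean the same thing. The byte count is shown because many tokenizers start from UTF-8 bytes, and a Cyrillic letter takes 2 bytes where a Latin letter takes 1.

```python
# Rule-of-thumb ratios from the text; real numbers depend on the tokenizer.
CHARS_PER_TOKEN = {"en": 4, "ru": 2}


def estimate_tokens(text: str, lang: str) -> int:
    """Rough estimate: character count divided by the ratio, rounded up."""
    ratio = CHARS_PER_TOKEN[lang]
    return -(-len(text) // ratio)  # ceiling division


english = "The cat sleeps on the sofa."
russian = "Кошка спит на диване."

for lang, text in (("en", english), ("ru", russian)):
    chars = len(text)
    utf8_bytes = len(text.encode("utf-8"))
    tokens = estimate_tokens(text, lang)
    print(f"{lang}: {chars} chars, {utf8_bytes} UTF-8 bytes, ~{tokens} tokens")
```

The Russian sentence is shorter in characters, yet the estimate gives it more tokens. For real numbers, run the text through the tokenizer of the model you actually use.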
Where it hits you
Tokens aren't some abstraction from a pricing page. They decide two very down-to-earth things.
- Price. Paid models charge by the token: separately for what you send and what comes back. More pieces, bigger bill.
- Limit. A model holds only so many tokens at once. Hit the ceiling and the oldest part "falls out" — the model forgets it.
That's why a long Russian prompt sometimes cuts off mid-sentence, or the model "forgets" the start of the chat. It's not being difficult — the pieces just ran out.
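Both effects fit in a few lines. This is a sketch under loud assumptions: the prices are invented, and dropping the oldest messages first is just one simple way an app might stay under the limit. Your provider and your chat app may do it differently.

```python
# Hypothetical prices in dollars per 1,000 tokens; check your provider's pricing page.
PRICE_IN_PER_1K = 0.50
PRICE_OUT_PER_1K = 1.50


def request_cost(tokens_in: int, tokens_out: int) -> float:
    """Input and output tokens are billed separately."""
    return tokens_in / 1000 * PRICE_IN_PER_1K + tokens_out / 1000 * PRICE_OUT_PER_1K


def fit_to_limit(messages: list[tuple[str, int]], limit: int) -> list[tuple[str, int]]:
    """Keep the newest messages that fit under the token limit.

    Each message is (text, token_count). The oldest ones are dropped first.
    """
    kept = []
    used = 0
    for text, count in reversed(messages):  # walk from newest to oldest
        if used + count > limit:
            break
        kept.append((text, count))
        used += count
    kept.reverse()  # back to chronological order
    return kept


chat = [("intro", 300), ("question", 500), ("follow-up", 400)]
print(request_cost(2000, 1000))        # 2.5
print(fit_to_limit(chat, limit=1000))  # [('question', 500), ('follow-up', 400)]
```

With a limit of 1,000 tokens the "intro" message no longer fits, so it is the one that gets "forgotten".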
What to do about it
You don't need to count tokens by hand. A couple of habits are enough.
- Write shorter. Extra politeness and filler are pieces too. Cut them and the meaning survives.
- Don't paste the whole file. Drop in just the part you need, not all hundred pages.
- For bulk work, English can be cheaper. If you're running a model over a big text and paying per token, the same thing comes out smaller in English.
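The "don't paste the whole file" habit can be as simple as a keyword filter. A minimal sketch, assuming paragraphs are separated by blank lines; the function name and the sample document are made up for illustration.

```python
def relevant_paragraphs(document: str, keyword: str) -> str:
    """Return only the paragraphs that mention the keyword (case-insensitive)."""
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]
    hits = [p for p in paragraphs if keyword.lower() in p.lower()]
    return "\n\n".join(hits)


doc = "Billing rules.\n\nRefunds take 5 days.\n\nContact us about refunds."
print(relevant_paragraphs(doc, "refund"))
```

Only the two paragraphs about refunds go into the prompt; the rest stays out and costs nothing.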
Once it clicks that a model measures everything in pieces, neither "limits" nor "price per 1,000 tokens" feels scary anymore. You just see what you're paying for.