What is a context window — and why AI forgets the start of a conversation
Every model has a limit: how much text it can hold "in its head" at once. That limit is the context window. Get this one idea, and half of AI's odd behavior stops surprising you — like forgetting what you agreed on at the start.
The window is the model's desk
Picture a desk that fits a limited number of pages. Everything on the desk, the model sees at once: your question, the chat history, attached files, its own upcoming answer. Whatever doesn't fit, the model doesn't see at all.
Window size is measured in tokens (we cover those separately). Some models hold thousands, others millions. But there's always a limit.
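To make the desk concrete, here is a minimal sketch of the bookkeeping, not how any particular model does it. The names `rough_tokens` and `fits_on_desk` are made up for illustration, and counting words is only a crude stand-in for a real tokenizer, which would give different numbers. The point is that the space reserved for the answer counts against the same limit as everything you send.

```python
def rough_tokens(text: str) -> int:
    # Assumption: one whitespace-separated word ~ one token.
    # Real tokenizers split text differently; this is only for illustration.
    return len(text.split())


def fits_on_desk(parts: list[str], reserved_for_answer: int, window: int) -> bool:
    # Everything on the desk shares one budget: the question, the history,
    # the attached files, and the room left for the model's answer.
    used = sum(rough_tokens(part) for part in parts)
    return used + reserved_for_answer <= window


parts = [
    "You are a helpful assistant.",    # instructions
    "Summarize the attached report.",  # the question
    "report text " * 50,               # an attached file
]

print(fits_on_desk(parts, reserved_for_answer=16, window=128))
print(fits_on_desk(parts, reserved_for_answer=32, window=128))
```

With these made-up numbers the first check passes and the second fails: the same pages fit or don't fit depending on how much room is left for the answer.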
Why the model "forgets"
Once a conversation grows longer than the window, something has to go. In many chat apps, the oldest pages "fall off the desk" to make room for new ones. The model doesn't remember them; for the model, they no longer exist. That's why the AI loses the thread in a long chat: the beginning got pushed out.
The model isn't being "lazy" or "difficult." It simply can't see what didn't fit in the window.
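The "falling off the desk" effect can be sketched in a few lines. This assumes the simplest possible strategy, dropping the oldest messages until the rest fit; real chat apps may do something else, such as summarizing old turns. The function names and the word-count token estimate are illustrative assumptions, not any product's actual behavior.

```python
from collections import deque


def rough_tokens(text: str) -> int:
    # Assumption: one whitespace-separated word ~ one token (illustration only).
    return len(text.split())


def trim_to_window(messages: list[str], window: int) -> list[str]:
    # Drop messages from the start of the chat until the rest fit the window.
    kept = deque(messages)
    total = sum(rough_tokens(m) for m in kept)
    while kept and total > window:
        total -= rough_tokens(kept.popleft())  # the oldest page falls off
    return list(kept)


chat = [
    "Call me Sam and answer in French.",  # the early agreement
    "What is a token?",
    "Explain context windows briefly.",
    "And why do long chats drift?",
]

visible = trim_to_window(chat, window=15)
print(visible)
```

With a window of 15 "tokens", the first message no longer fits, so the model never sees the request to use the name Sam and answer in French. Nothing was ignored on purpose; the page simply is not on the desk.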
How to use this
- Key things first or last. Don't bury important instructions in the middle of a wall of text.
- Don't dump in extras. The less clutter in the window, the sharper the answer, and the cheaper it is (see tokens).
- New task, new chat. Old context won't get in the way or confuse the model.
Once you hold the picture of a desk with pages, a lot of AI "quirks" become predictable — and manageable.