Drop a whole book into a model — and ask without chunk-searching

Here's a weekend idea: take a big file — a contract, your lecture notes, a year of your journal — drop the whole thing into a model and ask in plain words. "Where does it talk about the late-payment penalty?" And it answers, with the quote.
And here's the catch. Not long ago this wasn't this easy.
Why this just became possible
A big text didn't fit "in the model's head" at once. So you sliced it into chunks, stuffed them in a database, and searched for the right scrap for each question. That's called RAG — and for a beginner it was a half-day build of its own.
On June 13, GLM-5.2 shipped: a million tokens of context, open weights under MIT, and a dirt-cheap API. A million tokens is how much text the model holds "in its head" in a single request. A whole book fits in there — several, even.
So for an average file, slicing and searching is no longer needed. Drop it whole — and ask.
What you'll learn
- What a context window is. "A million tokens" isn't about speed — it's about how much text the model sees at once. Build this and you'll finally feel what that means in practice.
- How this differs from RAG. Chunk-search isn't magic, and it isn't the only path. For a file that fits whole, you simply don't need it. Worth knowing.
- One API call with a big input. The whole file goes into the request, your one question on top. The answer comes back. Same "send → receive" loop, just a big input.
One honest caveat: non-English text eats more tokens than English, so count the capacity on the low side. And if a file is truly huge — years of chats — it won't fit whole, and you'll want a notes assistant with search.
A ready starter prompt
Don't tell the agent "let me ask questions about my file" — it'll guess the format and drag in RAG you don't need here. Give it context, a model choice, and boundaries:
Make it so I can ask questions about my PDF.The strong prompt leaves no room for guessing: the flow is visible, the big-window model is chosen, the quote requirement and an honest "not found" are spelled out. The first result lands closer to what you wanted.
What you end up with
A little console window. You point at a file and ask: "which clause covers auto-renewal?" — and get an answer with the exact paragraph quoted. You ask your year-long journal: "what did I write about that running idea?" — and it finds and stitches it together.
The magic is on the outside. Inside it's one request: the whole file plus your question.
The weekend plan
- Saturday: a script that reads one file and asks it one question. Test it on a contract or your notes.
- Sunday: wrap it in a simple chat — ask questions about one file back to back without restarting. Hand it to a friend with their PDF.
This is the kind of project that stays in your toolkit: boring long files stop being scary.
Short story-lessons, an agent simulator and daily practice — in our mobile app. Free.





