A pocket translator that keeps pace with your speech

Here's the idea in one line: a tiny translator app. You speak Russian into your phone — and English (or any other language) flows out of it while you're still talking. Not "said it → sent it → waited → heard it", but almost simultaneous, like a live interpreter in your earpiece.
And here's the catch: this didn't work before. A normal translator waited until you finished, then sent the whole recording and left you sitting through a pause. In May, OpenAI shipped a voice model, gpt-realtime-translate: it translates speech as a stream, keeping pace with the speaker, and accepts 70+ input languages. That's what the whole "simultaneous" feel rides on.
Why this one
There are a thousand translators in the store. But almost all of them are "type text, get text". A live conversation doesn't go like that: while you type, the other person has moved on. A voice that doesn't lag is a different experience. It's the kind of thing you'll want to show to friends and take on a trip.
And there's not much "magic" here. The app is a pipe: it grabs sound from the mic, streams it to the model, and plays back the translation. One voice model does all the work.
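To make the "pipe" picture concrete, here is a minimal sketch in Python. Everything in it is a stand-in: mic_chunks fakes the microphone, translate_stream is a hypothetical placeholder for the voice model (it is not the real OpenAI API), and play is whatever plays audio on your device. The only point is the shape: three stages chained together, with each chunk moving through before the next one arrives.

```python
import asyncio
from typing import AsyncIterator, Callable


async def mic_chunks(fake_audio: list[bytes]) -> AsyncIterator[bytes]:
    """Stand-in for the microphone: yields audio chunks one at a time."""
    for chunk in fake_audio:
        await asyncio.sleep(0)  # give other tasks a turn, as real capture would
        yield chunk


async def translate_stream(chunks: AsyncIterator[bytes]) -> AsyncIterator[bytes]:
    """Hypothetical stand-in for the voice model (not a real API call).

    It emits output for each input chunk instead of waiting for the end.
    """
    async for chunk in chunks:
        yield b"EN:" + chunk


async def run_pipe(source: AsyncIterator[bytes], play: Callable[[bytes], None]) -> None:
    """The whole app in one loop: mic -> model -> speaker."""
    async for translated in translate_stream(source):
        play(translated)


played: list[bytes] = []  # stands in for the phone speaker
asyncio.run(run_pipe(mic_chunks([b"a", b"b", b"c"]), played.append))
print(played)
```

In a real app the two stand-ins would be replaced by actual microphone capture and a streaming connection to the model; the loop in run_pipe would stay roughly the same.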
What you'll learn
- Voice as input. Not text, not an image, but live sound from the mic. A completely different kind of data — and your first time working with it.
- Streaming instead of "request-response". The familiar "send it all → get it all" loop doesn't apply. Sound flows in chunks, and the translation flows back. That's what realtime means.
- "The prompt is a setting." You tell the model the target language and the tone ("translate calmly, informal") — and the behavior changes without a single line of logic.
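The last two points can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the model's actual interface: the audio format (16 kHz, 16-bit mono), the 20 ms chunk length, and the names pcm_chunks and TranslatorSettings are all made up for the example. The first half cuts sound into small pieces that can be sent as they appear; the second shows that language and tone are plain settings turned into instruction text, with no branching logic behind them.

```python
from dataclasses import dataclass
from typing import Iterator

SAMPLE_RATE = 16_000     # assumption: 16 kHz mono audio
BYTES_PER_SAMPLE = 2     # assumption: 16-bit PCM
CHUNK_MS = 20            # assumption: 20 ms per chunk


def pcm_chunks(audio: bytes, chunk_ms: int = CHUNK_MS) -> Iterator[bytes]:
    """Slice raw audio into small chunks that can be sent one by one."""
    size = SAMPLE_RATE * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(audio), size):
        yield audio[start:start + size]


@dataclass(frozen=True)
class TranslatorSettings:
    """Hypothetical settings object: the 'prompt' is just data."""
    target_language: str = "English"
    tone: str = "calm, informal"

    def instructions(self) -> str:
        # Changing a field changes the behavior text; no app logic is touched.
        return f"Translate the speaker into {self.target_language}. Tone: {self.tone}."


one_second = bytes(SAMPLE_RATE * BYTES_PER_SAMPLE)  # one second of silence
chunks = list(pcm_chunks(one_second))
print(len(chunks), len(chunks[0]))

spanish = TranslatorSettings(target_language="Spanish")
print(spanish.instructions())
```

Switching the target language is one argument, and the chunking code never needs to know about it. That separation is what keeps the app small.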
A ready starter prompt
Don't ask the agent for "a voice translator" — it'll drown in library choices. Give it the flow, the model and the limits:
Make an app that translates my speech to English by voice.
The strong prompt leaves no room to guess: the flow is visible, the exact model is visible, and so is what the button does. The first attempt lands much closer to what you wanted.
What you end up with
You're in a café abroad. You press the button, say "where's the nearest pharmacy" — and the phone says it in English before you've even finished the sentence. The waiter answers — you flip the toggle, and now the translation flows back the other way. The conversation runs with almost no pauses. And you built it yourself over a weekend.
Start with one button, see it through — and you'll have a translator you're not embarrassed to take on a trip.