An app reads your kid a bedtime story — in your voice, while you're away

KODiQ Bot

Jun 12, 2026 · 5 min read

Illustration: a short voice sample turns into a narrated story

Here's the idea in one line: you record half a minute of your speech, and after that the app reads any text — a bedtime story, a shopping list, an article — in your voice. Not robotic, not someone else's. Yours.

And here's what's fresh: a year ago this wasn't easy at all. Cloning a voice used to need a studio, an hour of clean audio, and a sound engineer. Then on June 2 Microsoft showed MAI-Voice-2 — a model that picks up a voice from a short sample and speaks in 15 languages. One small clip is enough. That's what the whole idea rides on.

Why this one

Picture the real scene. You're away for a couple of days, and your kid won't fall asleep without a story in your voice. Or grandma lives far away, and you want the grandkid to hear her, exactly as she sounds. Generic synthesis is "the smart speaker reads a book." Your own voice is about you.

And there's less "magic" here than it looks. The app is a pipe: take your sample, take the text, hand it to the model, get audio back, hit play. All the complexity lives in one careful request.
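
The pipe described above can be sketched in a few lines of Python. This is a minimal sketch, not MAI-Voice-2's actual API: `read_aloud`, `synthesize`, and `play` are hypothetical names, and the model call is passed in as a plain function so a stand-in can replace it while you test.

```python
from typing import Callable

# Hypothetical shape of the model call: (voice sample, text) -> audio bytes.
# The real service will have its own client and parameters.
SynthesizeFn = Callable[[bytes, str], bytes]
PlayFn = Callable[[bytes], None]


def read_aloud(sample: bytes, text: str,
               synthesize: SynthesizeFn, play: PlayFn) -> bytes:
    """Take the sample, take the text, hand both to the model, play the result."""
    if not sample:
        raise ValueError("voice sample is empty")
    if not text.strip():
        raise ValueError("nothing to read")
    audio = synthesize(sample, text)  # the one careful request
    play(audio)                       # hit play
    return audio


def fake_synthesize(sample: bytes, text: str) -> bytes:
    """Stand-in for the model, for local testing only."""
    return b"AUDIO:" + text.encode("utf-8")
```

Swapping `fake_synthesize` for a real client later should not change `read_aloud` at all, which is the point of keeping the app a thin pipe.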

What you'll learn

Voice as both input and output. For the first time, you hand a model audio and get audio back. Not text, but sound.
Sample + text are two different inputs. One clip is the voice example, the other is what to say. The model won't mix them up if you don't mix them up in the request.
"The prompt is the feature." Reading in your voice isn't a separate technology you have to invent. It's an instruction to the model: "here's the example, here's the text, read it the same way." A good request is your main feature.

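One way to keep the sample and the text from getting mixed up is to give each its own named field in the request. The sketch below assumes a JSON-style payload. The field names (`instruction`, `voice_sample_b64`, `text`) and the `VoiceRequest` class are hypothetical, not taken from any real API.

```python
import base64
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceRequest:
    voice_sample: bytes  # the example of HOW to sound
    text: str            # WHAT to say

    def to_payload(self) -> dict:
        """Build a request body with the two inputs in separate fields."""
        if not self.voice_sample:
            raise ValueError("voice sample is empty")
        if not self.text.strip():
            raise ValueError("text is empty")
        return {
            # Hypothetical field names, chosen for illustration only.
            "instruction": "Read the text in the same voice as the sample.",
            "voice_sample_b64": base64.b64encode(self.voice_sample).decode("ascii"),
            "text": self.text,
        }
```

Because the two inputs have different types (`bytes` and `str`) and different field names, swapping them by accident becomes hard to do without noticing.
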
A ready starter prompt

Don't ask the agent to "make an app that talks in my voice" — it'll guess where the sample comes from and in what format. Give it a scenario, a sample, and limits:

Weak prompt

Make an app that reads text in my voice.

Strong prompt

A strong prompt leaves no room for guessing: you can see where the sample is, where the text is, the behavior and the buttons — and the line you shouldn't cross. The first result lands closer to what you wanted.

What you end up with

You're at the station, ten minutes to the train. You open the app, paste in "The Gingerbread Man," hit play, and send the audio home. That evening your kid falls asleep to the story — in your voice, even though you're not there. You didn't sit in a studio. You recorded half a minute, once.

One important thing to settle up front: only clone your own voice, or the voice of someone who has clearly agreed. That's the line not to cross, and it's worth keeping in mind from the first line of code.
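
Keeping that line in mind "from the first line of code" could look like a small gate that every sample passes through before it reaches the model. This is only a sketch: `VoiceSample`, `ConsentError`, and `require_consent` are hypothetical names, and how consent is actually collected and recorded is up to your app.

```python
from dataclasses import dataclass


class ConsentError(Exception):
    """Raised when a voice sample has no recorded permission to be cloned."""


@dataclass(frozen=True)
class VoiceSample:
    audio: bytes
    speaker: str
    is_own_voice: bool = False   # the user recorded themselves
    consent_given: bool = False  # someone else explicitly agreed


def require_consent(sample: VoiceSample) -> VoiceSample:
    """Return the sample unchanged if cloning is allowed, otherwise refuse."""
    if sample.is_own_voice or sample.consent_given:
        return sample
    raise ConsentError(f"no consent recorded for speaker {sample.speaker!r}")
```

Calling `require_consent` right before the model request means a sample without permission never leaves the device.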

Source: Microsoft: launching seven new MAI models (MAI-Voice-2)