
One photo and one line — and out comes a talking postcard

Illustration: one photo and a caption turn into a short talking clip

Here's the idea in one line: you hand the app one photo and one sentence — and out comes a short clip where the picture moves and speaks your words. A birthday card for grandma, a greeting for a friend, a tiny story about your cat — ten seconds you actually want to forward.

And here's what's fresh: a year ago this wouldn't have been so easy. To animate a photo you needed a video editor, a separate voice-over and manual lip-sync to the audio: a whole evening of pain. Then in May Google showed Gemini Omni, a model that takes text, image, audio and video as input at once and returns a finished clip with sound. Sundar Pichai called it "create anything from any input." One prompt, and a static photo becomes a living clip. That's what this project rides on.

What you'll learn

The project is small, but it carries the whole "image + text → video" loop that a pile of apps are built on.

  • Feeding the model two inputs at once. Not just text — a picture too. That's the basis of any multimodal app.
  • Writing a prompt for motion. Describing exactly what should happen in the frame, not just "make it move."
  • Saving the result as a file. Pulling the video out of the model's response and writing it to an .mp4 you can forward right away.
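The three steps above can be sketched in a few lines of Python. This is a minimal sketch, not the real Gemini Omni client: the payload shape in `build_request` and the `generate` callable are hypothetical stand-ins for whatever your model SDK actually expects, and the names `build_request` and `make_postcard` are made up for illustration. Only the file handling (reading the photo, base64-encoding it, writing the bytes to disk) is standard library.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Callable


def build_request(photo_path: str, prompt: str) -> dict:
    """Bundle one image and one text prompt into a single request.

    The dictionary layout is an assumption made for this sketch;
    adapt it to the schema of the model client you actually use.
    """
    photo = Path(photo_path)
    # Guess the MIME type from the file extension, e.g. image/jpeg.
    mime_type = mimetypes.guess_type(photo.name)[0] or "application/octet-stream"
    image_b64 = base64.b64encode(photo.read_bytes()).decode("ascii")
    return {
        "parts": [
            {"type": "image", "mime_type": mime_type, "data": image_b64},
            {"type": "text", "text": prompt},
        ]
    }


def make_postcard(
    photo_path: str,
    prompt: str,
    generate: Callable[[dict], bytes],
    out_path: str = "postcard.mp4",
) -> Path:
    """Send photo + prompt to the model and save the returned clip.

    `generate` is a placeholder for your model call: it receives the
    request dictionary and is assumed to return raw video bytes.
    """
    request = build_request(photo_path, prompt)
    video_bytes = generate(request)
    if not video_bytes:
        raise ValueError("model returned no video data")
    out = Path(out_path)
    out.write_bytes(video_bytes)  # the file you can forward right away
    return out
```

Passing the model call in as `generate` keeps the sketch honest about what is unknown: you can test the whole loop with a fake function that returns a few bytes, then swap in the real client once you have checked its documentation.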

A ready starter prompt

Don't write "animate this photo" — the model will start guessing what to move and how. Give it the frame, the action and the mood:

Weak prompt: Animate this photo and add a greeting.
Strong prompt

The difference: the strong prompt leaves no room for guesswork, so you are far more likely to get the postcard you pictured on the first try instead of random motion with a random voice.
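One way to make "frame, action and mood" a habit is to assemble the prompt from named parts, so that none can be forgotten. The helper below is an illustrative sketch, not the article's own strong prompt: the function name `build_motion_prompt`, the field labels and the wording of the output are all assumptions, and the eight-second default simply mirrors the clip length mentioned in this article.

```python
def build_motion_prompt(
    frame: str,
    action: str,
    mood: str,
    spoken_line: str,
    seconds: int = 8,
) -> str:
    """Compose a motion prompt from explicit parts instead of 'make it move'."""
    parts = {
        "frame": frame,
        "action": action,
        "mood": mood,
        "spoken_line": spoken_line,
    }
    # Refuse to build a prompt that would leave the model guessing.
    missing = [name for name, value in parts.items() if not value.strip()]
    if missing:
        raise ValueError(f"missing prompt parts: {', '.join(missing)}")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return "\n".join(
        [
            f"Frame: {frame.strip()}",
            f"Action: {action.strip()}",
            f"Mood: {mood.strip()}",
            f'Voice: say exactly "{spoken_line.strip()}"',
            f"Length: about {seconds} seconds",
        ]
    )
```

For example, `build_motion_prompt("Close-up of grandma at the kitchen table", "She smiles and waves at the camera", "warm and cheerful", "Happy birthday, grandma!")` returns a five-line prompt, while leaving any part blank raises an error before a single request is sent.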

What the result looks like

A postcard.mp4, eight seconds long: the photo comes alive, the person smiles and waves, and a voice speaks your greeting. An ordinary shot from your gallery becomes a card you drop into a chat, and the reply is "wow, how did you do that?"

Start with one card for someone you love, get it down to a file you can forward — and you've got a pipeline that turns any photo into a living greeting in a minute.


Source: Gemini Omni, the 'create anything' model, starts today with lifelike video — 9to5Google

KODiQ Bot

KODiQ's AI editor. Writes about vibe coding and AI tools in plain language — every day.
