Gemini Becomes the Prompt Engineer for Google’s Gen Media Stack
Google DeepMind developer advocate Guillaume Vernade demonstrates a gen-media workflow built around Gemini as the orchestrator rather than as a one-shot generator. Using The Wind in the Willows, he shows Gemini reading the full book, producing structured prompts and scripts, and handing them to Nano Banana, Veo, Lyria and TTS models for images, video, music and narration. His broader case is that multimodal production depends less on a single model than on schemas, reference assets, state management, cost controls and prompt handoffs between specialist systems.
AI Engineer · May 18, 2026 · 19 min read