Patrick Loeber

Patrick Loeber is a Member of Technical Staff at Google DeepMind, where he works on developer relations and developer experience for Google’s AI models, including the Gemini API and Google AI Studio. He is also known for teaching Python, machine learning, and AI through his blog and YouTube content.

Any-to-Any Agents Rely on Orchestrated Multimodal Models, Not One Network

Google DeepMind’s Patrick Loeber presents “any-to-any” agents as an orchestration problem rather than a claim that one model already handles every modality. In his architecture, Gemini reads and reasons across PDFs, images, audio, video and other sources, then uses function calling to invoke specialized native models for images, speech, live audio, video or embeddings. Loeber argues that the useful shift is not generating every possible format, but letting an agent decide when a diagram, spoken explanation or other output is warranted.
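The orchestration pattern described above can be illustrated with a minimal sketch. It is not the implementation from the talk and does not call the Gemini API. Every name in it (`ToolCall`, `plan`, `run_agent`, `generate_image`, `synthesize_speech`) is hypothetical, and the keyword rule inside `plan` is only a stand-in for the decision that a reasoning model would make through function calling.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ToolCall:
    """A function call the reasoning model asks the orchestrator to run."""
    name: str
    args: Dict[str, str]


# Hypothetical stand-ins for specialized generation models.
# Each returns a placeholder string instead of real media.
def generate_image(prompt: str) -> str:
    return f"<image: {prompt}>"


def synthesize_speech(text: str) -> str:
    return f"<audio: {text}>"


# Registry mapping tool names to callables, as a function-calling loop would use.
TOOLS: Dict[str, Callable[..., str]] = {
    "generate_image": generate_image,
    "synthesize_speech": synthesize_speech,
}


def plan(request: str) -> List[ToolCall]:
    """Assumption: a keyword rule replaces the model's own judgment here.

    In the architecture described above, the reasoning model would decide
    which outputs are warranted and emit the corresponding function calls.
    """
    text = request.lower()
    calls: List[ToolCall] = []
    if "diagram" in text:
        calls.append(ToolCall("generate_image", {"prompt": request}))
    if "aloud" in text or "spoken" in text:
        calls.append(ToolCall("synthesize_speech", {"text": request}))
    return calls


def run_agent(request: str) -> List[str]:
    """Always produce a text answer, then add other modalities only if planned."""
    outputs = [f"<text: answer to {request!r}>"]
    for call in plan(request):
        outputs.append(TOOLS[call.name](**call.args))
    return outputs


if __name__ == "__main__":
    for part in run_agent("Explain attention with a diagram"):
        print(part)
```

The sketch keeps the decision about which outputs to produce (`plan`) separate from the models that produce them (`TOOLS`), so a new modality would be added by registering another tool rather than by changing the agent loop.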

AI Engineer | May 20, 2026 | 10 min read