
Chintan Parikh
Product manager for LiteRT, part of Google AI Edge at Google DeepMind, focused on on-device AI runtime and edge AI acceleration.
Fine-Tuning Pushed FunctionGemma From 46% to 90% Function-Calling Accuracy
Cormac Brick, a Google AI Edge engineer, argues that on-device agents are becoming practical when developers either use system models such as Gemini Nano through Android AI Core or ship narrow, fine-tuned tiny models with LiteRT-LM. His main example is FunctionGemma, a 270 million parameter function-calling model that rose from about 46% accuracy out of the box to more than 90% on most tested app-intent functions after synthetic-data fine-tuning. Brick presents the tradeoff plainly: system GenAI is easier when it fits, while app-shipped tiny models require more work but can run locally, offline, and with more control.
Gemma 4 Moves On-Device AI From Chatbots to Local Agents
Chintan Parikh of Google DeepMind argues that on-device AI is moving from local chatbots toward local agents, as smaller Gemma 4 edge models become capable of tool calling, structured output and reasoning on phones, laptops and embedded hardware. With Weiyi Wang joining the Q&A, Parikh presents LiteRT as the deployment layer for that shift across Android, iOS, desktop, web and IoT. His case is pragmatic rather than absolute: edge inference can improve latency, privacy, offline use and cost, but teams still have to manage memory, quantization, accelerator support and when to call the cloud.