Steven Willmott

CEO of Safe Intelligence, an Imperial College London spinout building AI safety and validation technology for reliable machine-learning systems. He previously co-founded and led API management company 3scale, which was acquired by Red Hat.

Agent Safety Requires Specs, Not Just Larger Eval Sets

Steven Willmott of Safe Intelligence argues that larger models are not automatically safer agents: the same capability that lets them handle more tasks can also help them understand adversarial instructions and misuse broader infrastructure access. His proposed answer is spec-driven validation, in which an agent is tested against an implementation-independent behavioral spec covering rules, domain boundaries, rights and roles, ground truth, domain knowledge, and robustness requirements. The point is to make security and reliability testing follow from what the agent is allowed to do, not just from a dataset of expected answers.
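To make the idea concrete, here is a minimal sketch of what checking an agent's actions against a behavioral spec might look like. It is an illustration only, not Safe Intelligence's tooling or Willmott's format: the names `BehaviorSpec`, `Action`, and `violations`, along with the roles, tools, and topics, are all hypothetical, and the sketch covers just two of the spec areas mentioned above (rights and roles, and domain boundaries).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Action:
    """One step an agent attempted: the tool it called and the topic involved."""
    tool: str
    topic: str


@dataclass
class BehaviorSpec:
    """Hypothetical spec: which tools each role may call, and which topics are in scope."""
    allowed_tools: dict[str, set[str]] = field(default_factory=dict)
    in_scope_topics: set[str] = field(default_factory=set)

    def violations(self, role: str, action: Action) -> list[str]:
        """Return a description of every spec rule this action breaks."""
        found = []
        # Rights and roles: an unknown role is allowed no tools at all.
        if action.tool not in self.allowed_tools.get(role, set()):
            found.append(f"role '{role}' may not call tool '{action.tool}'")
        # Domain boundary: the topic must be explicitly in scope.
        if action.topic not in self.in_scope_topics:
            found.append(f"topic '{action.topic}' is outside the domain boundary")
        return found


# Assumed example spec for a customer-support agent.
spec = BehaviorSpec(
    allowed_tools={
        "support_agent": {"lookup_order"},
        "admin": {"lookup_order", "issue_refund"},
    },
    in_scope_topics={"orders", "refunds"},
)

# A recorded trace of agent actions, checked step by step against the spec.
trace = [Action("lookup_order", "orders"), Action("issue_refund", "refunds")]
report = {step: spec.violations("support_agent", action)
          for step, action in enumerate(trace)}
```

The check never consults an expected answer: it depends only on the spec and on what the agent tried to do, so the same spec could in principle be reused across different agent implementations.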

AI Engineer · May 31, 2026 · 7 min read