Hassabis Says AI Drug Discovery Could Transform Medicine Within 20 Years

Demis Hassabis | Two Minute Papers | Monday, May 25, 2026 | 12 min read

Demis Hassabis told Two Minute Papers’ Károly Zsolnai-Fehér that AI could help produce cures for most diseases on a 10- to 20-year horizon, but he framed the claim as a platform problem rather than a countdown. The DeepMind chief argued that AlphaFold is only one component of a broader drug-discovery system, with Isomorphic Labs and DeepMind building multiple specialized models to predict biological behavior, design molecules and eventually accelerate validation. He stressed that clinical testing and regulatory trust remain separate bottlenecks, and that evidence from working AI-designed drugs would have to come before any process change.

Hassabis expects disease progress to look discontinuous, not incremental

Demis Hassabis framed the ambition to “cure all disease” less as a steady stream of visible updates and more as a platform problem that may remain quiet for years before producing a step change. Károly Zsolnai-Fehér raised a remark Hassabis had made in April 2025: that one day AI may help cure all disease, “maybe within the next decade.” Zsolnai-Fehér said he was excited enough by the line to register curealldisease.com, add some data, promise updates, and then leave it untouched for more than a year.

The screen showed the URL as a simple green text overlay: “curealldisease.com.” The joke was that there had been nothing to update. Hassabis’s response was that this may be the wrong expectation for the kind of progress he has in mind.

Hassabis said it “won’t be a gradual thing.” His analogy was AlphaFold: once the system became accurate enough, DeepMind could fold “all 200 million proteins in one year.” In his view, AI-enabled drug discovery may have a similar threshold dynamic. For several years, there may be little to post; then, if the platform works, the rate of meaningful breakthroughs could become difficult to keep up with.

The platform Hassabis described is being built across Isomorphic Labs and DeepMind’s science group. AlphaFold is one component. Protein structure prediction matters, but he emphasized that it is “only one step in the drug discovery process.” The goal, as he put it, is to build “another half a dozen to a dozen AlphaFold level models” covering different parts of that process, integrate them, and test them against disease profiles in the preclinical stage.

Half a dozen to a dozen: additional AlphaFold-level models Hassabis said are being built for other parts of drug discovery

If that work is proven out over “a few more years,” Hassabis said, it may produce “an engine” or platform that can be applied to almost any disease area. He was careful, however, about what the ten-year claim does and does not mean. He did not claim that clinical validation disappears. A platform might generate potential cures within that rough timeframe, but those still need to be tested in the clinic, and that process “could still take more time over a decade.”

When Zsolnai-Fehér pressed the shorthand — “cure all disease in nine years?” — Hassabis would not endorse the precise countdown. His answer was broader: “roughly that kind of timeframe,” or more conservatively, “in the next 10 to 20 years,” because he does not see “any laws of physics” that prevent it.

That caveat matters because Hassabis separated two bottlenecks that often get collapsed into one. First is drug discovery: generating and optimizing a candidate. Second is clinical testing: proving safety and efficacy in humans. DeepMind and Isomorphic are focused on the first part now, but Hassabis said AI could also help with the second by stratifying patients and predicting dosages better. If AI can accelerate both discovery and trials, he said, that would be “a step change in human health.”

The models need to move beyond protein shape into biological behavior

Hassabis’s account of AI-enabled drug development starts with AlphaFold but does not stop there. He described “advanced versions” of AlphaFold that can predict more than a static protein structure: interactions between proteins, and interactions between proteins and molecules. The harder questions come after that. A drug candidate has to do something useful in the body, and it has to avoid doing unacceptable harm.

The properties Hassabis named are the ones that turn a promising computational result into a plausible drug-development program: where a molecule binds, what its “ADME properties” are, and what can be predicted about “absorption and toxicity.” He grouped these with the side effects that can make or break a candidate. The larger point was not the acronym; it was that AI systems must learn to predict multiple biological and pharmacological consequences, not only molecular shape.

He also pointed to the chemistry of design and manufacture. A platform cannot merely say that a biological target is interesting. It has to help decide what compound to design, how it would be made, and how it would bind to the relevant pocket or target in the protein. His framing was not that a single model will replace the pipeline, but that many specialized models must be built and connected across the pipeline.

Zsolnai-Fehér pressed on the limits: what parts of drug discovery are not just currently hard, but may remain outside what AI can change? Hassabis named regulation as “a bit outside the scope of AI,” while still arguing that regulatory processes could eventually be sped up by evidence. In his view, once a few AI-assisted or AI-designed drugs pass through the full traditional process and actually help patients, regulators may be able to examine how accurate the models’ predictions were. If, for example, there were ten AI-designed drugs and nine worked, that backtesting could help establish which model predictions are trustworthy and which are not.

Only then, he said, could parts of the process potentially be accelerated, skipped, or replaced with more efficient alternatives. The important sequence is evidence first, process change second. Zsolnai-Fehér compared the possibility to mRNA vaccines, where an emergency need during COVID accelerated testing and trials for a new technique. Hassabis accepted that as one possible precedent but resisted a rhetoric of rushing. Human health, he said, is a matter for “the next 10 centuries”; the point is not to compress everything recklessly into the next five to ten years, but to recognize “such exciting technology coming down the road” that “anything may be possible.”
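To make the "ten drugs, nine worked" backtest concrete, here is a minimal sketch of how uncertain a hit rate from so few cases remains. It assumes each drug is an independent success-or-failure trial and uses the standard Wilson score interval. The function name and the 95% level are illustrative choices, not anything Hassabis or a regulator specified.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical backtest from the example in the text: 9 of 10 drugs worked.
low, high = wilson_interval(9, 10)
print(f"observed hit rate: {9 / 10:.0%}, interval: {low:.2f} to {high:.2f}")
```

Under these assumptions the interval runs from roughly 0.60 to 0.98, which is one way to see why a handful of successes would be a starting point for trust rather than proof.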

Co-Scientist is still an assistant, not an autonomous discoverer

Zsolnai-Fehér described Co-Scientist as a system that can “invent new things,” and asked Hassabis to explain what it is. A viewer post shown on screen asked whether Two Minute Papers would cover “Google Deepmind’s Co Scientist,” calling it potentially “revolutionary.” Hassabis’s description was more precise: Co-Scientist is “a sort of fine-tuned version of Gemini” with extra tools and harnesses aimed specifically at scientific work, including hypothesis generation, data analysis, and literature summarization. He likened the current form to “a great research assistant” helping with daily work.

That distinction — assistant rather than autonomous scientist — remained important. When Zsolnai-Fehér later asked what Co-Scientist had already invented, Hassabis said it is “more like an assistant today rather than autonomously discovering things,” though autonomous discovery may be the next step. He said it is currently assisting scientists and mathematicians, and that DeepMind hoped to announce more results soon.

The examples he did give were from earlier versions of related systems: finding more efficient matrix multiplication, using AlphaEvolve and other tools to improve computer-science algorithms, and “turning invention on itself” to make systems more efficient. Hassabis described the current release as only “scratching the surface” of what he hopes the approach can become.

Zsolnai-Fehér had tried the hypothesis generator himself, including outside the usual biomedical focus. He said he gave it ideas in ray tracing and global illumination, an area he described as having little training data and relatively few researchers. The system asked him to narrow the idea, then took about eight hours and returned results he called “absolutely amazing,” though he also said he wished he had more time to explore them.

Hassabis’s answer to that was revealing: the scarcity is no longer only ideas, but time and attention. He said he too needs more time to use these systems, and joked that researchers now need “really good AI assistants” to handle administrative work so they can spend more time with tools like Co-Scientist.

The Einstein test is meant to backtest scientific originality

Zsolnai-Fehér proposed a more demanding benchmark than imitation or conversational fluency: an “Einstein test squared.” The basic version, associated in the exchange with Hassabis, would stop a model’s knowledge at around 1901 and ask whether it could produce the breakthroughs Einstein published in 1905, including special relativity. Zsolnai-Fehér’s stronger version would use today as the cutoff and ask the system to invent something genuinely new and Nobel-worthy.

Hassabis said that is “obviously what we would want,” but stressed that the backtest matters. If a model trained only on knowledge available before 1905 could independently produce Einstein’s annus mirabilis papers — four breakthroughs across different areas — then the same kind of system, trained on modern physics, might deserve serious attention if it proposed “something better than string theory.” The point is not that a model’s speculative answer should be believed by default. The point is that reconstructing past scientific leaps under a controlled knowledge cutoff would provide evidence that the techniques are capable of new science.

That standard also clarifies why ordinary benchmarks are not enough for Hassabis’s scientific ambitions. A model that summarizes literature, critiques ideas, or generates plausible hypotheses is useful, but the stronger claim requires evidence that it can cross a boundary that human science had not already crossed. The Einstein test is a proposed way to show that a system is not merely recombining known results in obvious ways.
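The knowledge-cutoff idea behind the Einstein test can be sketched as a simple data split: everything published before the cutoff is available for training, and everything after it is held out as the target to rediscover. The `Document` type, the placeholder titles, and the exact cutoff date below are assumptions for illustration only. The hard part in practice would be making sure no post-cutoff knowledge leaks into the training side.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    title: str
    published: date


def split_by_cutoff(corpus: list[Document], cutoff: date) -> tuple[list[Document], list[Document]]:
    """Return (training set, held-out set) split strictly by publication date."""
    train = [doc for doc in corpus if doc.published < cutoff]
    held_out = [doc for doc in corpus if doc.published >= cutoff]
    return train, held_out


# Placeholder corpus; titles and dates are illustrative, not real bibliographic data.
corpus = [
    Document("pre-cutoff paper A", date(1895, 1, 1)),
    Document("pre-cutoff paper B", date(1900, 6, 1)),
    Document("post-cutoff breakthrough", date(1905, 1, 1)),
]
train, held_out = split_by_cutoff(corpus, cutoff=date(1901, 1, 1))
print(len(train), "training documents,", len(held_out), "held out")
```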

Closed-loop discovery works differently when the verifier is physical

Zsolnai-Fehér used ray tracing to ask a deeper systems question about AI science. In rendering, he said, a sampler may produce a noisy image, and a denoiser may clean it up. But stronger systems connect the two: the denoiser tells the sampler where it needs more samples, especially in high-frequency regions, so the two modules become a fused system rather than separate stages. He asked whether DeepMind has something analogous between a hypothesis generator and a verifier.
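The sampler-and-denoiser feedback described above can be illustrated with a toy adaptive sampler. This is a sketch under loud assumptions: a one-dimensional "image", a made-up noise model concentrated around an edge, and per-pixel sample variance standing in for the error signal a real denoiser would supply. None of the names come from an actual renderer.

```python
import random
import statistics


def render_pixel(x: float, rng: random.Random) -> float:
    """Toy noisy radiance estimate: noise is largest near an edge at x = 0.5."""
    signal = 1.0 if x > 0.5 else 0.0
    noise_scale = 0.05 + 0.5 * max(0.0, 1.0 - abs(x - 0.5) * 10)
    return signal + rng.gauss(0.0, noise_scale)


def adaptive_render(width: int = 16, base_samples: int = 8,
                    extra_budget: int = 256, seed: int = 0):
    rng = random.Random(seed)
    xs = [(i + 0.5) / width for i in range(width)]
    # Pass 1: the same number of samples for every pixel.
    samples = [[render_pixel(x, rng) for _ in range(base_samples)] for x in xs]
    # Feedback signal: per-pixel sample variance (stand-in for a denoiser's error estimate).
    variances = [statistics.variance(s) for s in samples]
    total = sum(variances)
    # Pass 2: spend the extra budget in proportion to the estimated error.
    extra = [int(extra_budget * v / total) for v in variances]
    for x, pixel_samples, n in zip(xs, samples, extra):
        pixel_samples.extend(render_pixel(x, rng) for _ in range(n))
    image = [statistics.fmean(s) for s in samples]
    return image, extra


image, extra = adaptive_render()
print("extra samples per pixel:", extra)
```

Most of the extra budget should land on the two pixels next to the edge, which is the sense in which the two stages behave as one fused system instead of a fixed pipeline.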

Hassabis identified that as “closed-loop automated discovery.” He said many frontier labs are thinking about recursive self-improvement, and he sees some domains where the loop could work more readily. Coding and math are easier because verification can be fast and clear: the system can check whether an answer is correct or whether it is making progress. Such settings can also generate synthetic data.
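A minimal sketch of such a loop, for a case where verification is fast and exact, might look like the following. The generator and verifier are deliberately trivial stand-ins (a bisecting proposer and an integer square-root check), and all function names are hypothetical. The point is only the shape of the loop, in which each proposal is checked and the result feeds the next proposal.

```python
from typing import Callable, Optional, Tuple


def make_bisecting_proposer(low: int, high: int) -> Callable[[Optional[int]], int]:
    """Hypothetical generator: proposes a midpoint and narrows its range from feedback."""
    state = {"low": low, "high": high, "last": None}

    def propose(feedback: Optional[int]) -> int:
        if feedback is not None and state["last"] is not None:
            if feedback < 0:    # last candidate was too small
                state["low"] = state["last"] + 1
            elif feedback > 0:  # last candidate was too large
                state["high"] = state["last"] - 1
        state["last"] = (state["low"] + state["high"]) // 2
        return state["last"]

    return propose


def make_square_root_verifier(target: int) -> Callable[[int], int]:
    """Hypothetical verifier: returns -1, 0, or 1 if candidate squared is below, equal to, or above target."""
    def verify(candidate: int) -> int:
        squared = candidate * candidate
        return (squared > target) - (squared < target)

    return verify


def closed_loop(propose: Callable[[Optional[int]], int],
                verify: Callable[[int], int],
                max_rounds: int = 64) -> Tuple[Optional[int], int]:
    """Alternate proposing and verifying until the verifier accepts or rounds run out."""
    feedback: Optional[int] = None
    for round_no in range(1, max_rounds + 1):
        candidate = propose(feedback)
        feedback = verify(candidate)
        if feedback == 0:
            return candidate, round_no
    return None, max_rounds


answer, rounds = closed_loop(make_bisecting_proposer(0, 10_000),
                             make_square_root_verifier(1_522_756))
print(f"verified answer {answer} after {rounds} rounds")
```

Here every `verify` call is effectively free, so the loop can run unattended at machine speed.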

Physical and natural sciences are harder. In physics, chemistry, and biology, the verifier may require “an automated lab” or some other interaction with “the world of atoms.” That makes the loop longer and shifts the bottleneck. The relevant question becomes whether discovery is limited by hypothesis generation or by validation.

Hassabis said Isomorphic is thinking about automated labs, but he is waiting to see what data cannot simply be obtained from a contract research organization and waiting for robotics to advance further. DeepMind’s robotics work, he said, is going well enough that he could imagine setting up some kind of lab in 18 to 24 months. During this part of the exchange, the video showed an Isomorphic Labs logo, matching the part of the answer where Hassabis located some of this thinking.

He also described a material-science case where the need is already concrete. DeepMind is building an automated lab in London for material science because it has “200,000 designs of new materials” without a way to test them quickly enough. Some of those designs may include superconductors and other interesting materials, but without fast physical verification, the promise remains bottlenecked.

200,000: new material designs Hassabis said DeepMind is sitting on without enough testing capacity

Even in math and coding, where recursive loops are more straightforward technically, Hassabis said the labs are thinking about safety. The concern is the process running with “no human in the loop.” The technical attraction of a fused generator-verifier system is precisely what raises the governance issue: if the system can improve or validate itself recursively, the absence of human oversight becomes material.

Gemini is already being used as a personal research and health tool

Zsolnai-Fehér opened the substantive exchange with a personal example. His mother had received health scan results as a large video file, and the family had to wait weeks for evaluation. Anxious during the wait, he gave the file to Gemini, citing its long-context capability. Gemini analyzed the scan and said not to worry. Later, a doctor verified that the result was fine. Zsolnai-Fehér thanked Hassabis and described Gemini, along with a newly released system shown on screen as “Gemma 4,” as “a gift to humanity.” He characterized the latter as free, local AI that he thought could likely do similar work with some compression.

Hassabis responded cautiously but positively. He said he was glad Zsolnai-Fehér’s mother was fine, and that DeepMind had heard “a lot of anecdotes” of people using Gemini for health reasons, including some cases involving life-saving advice. He called health “an incredible use case.”

For his own work, Hassabis said he does use Gemini beyond research, though not yet as a confidant in the Jensen Huang sense. His main use is brainstorming: project ideas, project names, and creative directions. He also uses it as a sparring partner to help think through steps in an idea, and to summarize unfamiliar areas of research quickly when he wants “the main key points.”

Zsolnai-Fehér asked whether Hassabis uses Gemini to criticize his ideas — for example, turning on a stronger reasoning mode and asking it to find flaws. Hassabis said he uses it in that direction, though more collaboratively than adversarially. He acknowledged that he may need to try a harsher prompt: “be harsher, come up with the flaws in this.” His current framing is still primarily a sparring partner rather than an oracle.

AlphaFold’s second-order impact may become its own prize-worthy lineage

Zsolnai-Fehér raised a comment from John Jumper that had stayed with him: Jumper looked forward to seeing someone use AlphaFold to invent something that wins a prize. Zsolnai-Fehér called this a “second-order Nobel” — recognition not merely for AlphaFold itself, but for discoveries enabled by AlphaFold.

Hassabis said the idea is plausible because of the scale of AlphaFold’s use. More than three million researchers are using it, he said, and they are doing “incredibly important” and impactful work. If one of those downstream efforts eventually produces a Nobel-level discovery, Hassabis said, that would be “an amazing moment.”

Over 3 million: researchers Hassabis said are using AlphaFold

This line of discussion also explains why Hassabis treats platform-building as central. AlphaFold is not only a discovery in itself; in his account, it is an instrument that changes what other scientists can attempt. The same logic underlies his expectations for Co-Scientist and the drug-discovery tools he described: the highest-value systems may not simply output final answers, but may expand the search space and productivity of many researchers at once.

Games remain a proving ground for agents, economies, and social systems

Zsolnai-Fehér asked about DeepMind’s new partnership with EVE Online, noting his own history as a player. The screen showed EVE Online promotional art. Hassabis described EVE as unusually valuable because its community helps build the game through alliances, factions, a functioning economy, and dynamic storylines. It is, in his words, “a whole universe” created not only by designers but by players.

That makes it a sandbox for AI ideas. Hassabis said DeepMind has long admired the game and that he knows Hilmar, the CEO of the company behind it, from his own game-development days. The interest is not just better non-player characters. He described EVE as a proving ground for AI interacting with economies, storylines, and player alliances, continuing DeepMind’s tradition of using games as a safe environment for experimentation.

When Zsolnai-Fehér asked whether AI agents might be embedded in the game to play with human players, Hassabis said “potentially,” but also suggested agents might assist players or function more like a game master that drives storylines forward. He emphasized that the work is early and that DeepMind wants to brainstorm with the community.

The exchange ended with an EVE-specific joke about watching an AI bot get scammed in Jita, and Zsolnai-Fehér offering to “double your money” as the one guaranteed non-scam. The joke carried a substantive point: EVE is not just a game board. It is a social and economic environment where trust, deception, coordination, and incentives matter. That is why it is an interesting testbed.
