
Stanford Merges AI and Data Science Institutes Around Open Scientific Discovery

Stanford’s AI+Science Conference opened with James Landay announcing that the university is merging the Human-Centered AI Institute and Stanford Data Science into a single university-wide institute for AI and data science. Landay, President Jonathan Levin, Surya Ganguli, and Risa Wechsler framed the move around a common argument: AI is becoming a scientific instrument, but one that will require open research, domain-specific rigor, uncertainty-aware methods, and human judgment about which questions matter.

Stanford is merging HAI and Data Science into a single university-wide institute

James Landay used the conference opening to announce that Stanford is merging the Human-Centered AI Institute and Stanford Data Science into one university-wide home for AI and data science. He described the combined institute as “the front door for AI and data science at Stanford,” intended to serve every school, every discipline, and any researcher with “a question worth asking.”

Landay framed the merger as a response to a shift from anticipation to arrival. HAI, he said, was created more than eight years ago by Fei-Fei Li, John Etchemendy, Chris Manning, and Landay around the conviction that AI would not remain inside computer science. The structure was deliberately not a lab, department, or school-based center, but a standalone institute spanning all seven Stanford schools. The point was to bring humanists, social scientists, clinicians, legal scholars, educators, artists, and scientists into AI’s development from the beginning.

That founding assumption, in Landay’s telling, has become a practical requirement. Industry is moving faster than expected; governments are trying to respond; capital is flowing at unfamiliar scale; and the technology is changing “on a timeline measured in months or even weeks, but not years.” Stanford Data Science brings to the combined institute large-scale data expertise, cross-disciplinary centers, and MARLOWE, which Landay called a high-performance computational instrument already supporting Stanford work from foundational brain models to tumor detection in radiology.

The new institute’s commitments, as Landay stated them, are open science, a larger form of team science, and responsibility beyond Stanford. The openness commitment was the clearest contrast with industry. Frontier labs may produce extraordinary work, he said, but much of it will be closed: model weights, data, methods, and safety evaluations “increasingly behind walls.” Universities, in his account, should choose differently.

Universities exist to share what we learn. To put knowledge in the open where any researcher, any student, any government anywhere in the world can build on it.

James Landay

Landay cast openness not as academic sentiment but as a strategy for keeping AI’s benefits broad and its development accountable. The second commitment, team science, was aimed at problems he argued industry will not work on: long-horizon scientific questions, public-interest applications without commercial markets, independent evaluation of societal impacts, and AI for fields where data is messy, users are diffuse, and returns may be many years away. Those problems, he said, need larger teams, sustained funding, engineering capacity, and shared compute infrastructure.

The third commitment extended the institute’s mandate beyond campus. If the institute is serious about serving humanity, Landay said, it has to work across institutions, borders, and sectors: with universities and research institutions globally, civil society, governments setting rules, and communities whose lives AI will reshape. Health, climate, education, and governance were presented as global challenges requiring a global orientation.

Landay connected the announcement to the institute’s human-centered premise. The conference, he said, was not only about what AI can understand, but about “what people understand.” That distinction was central to his argument: AI and data science should be developed in service of human understanding, not as a substitute for it.

AI reached the national science agenda faster than Levin expected

Jonathan Levin used his opening remarks to emphasize how quickly AI became central to scientific discovery. He recalled joining the President’s Council of Advisors on Science and Technology, which he described as the nation’s leading science advisory group, about five years earlier. At the group’s first meeting in early 2021, members were given a charge drawn from a letter President Biden had written to Eric Lander, then the nation’s science advisor.

The charge listed five major science and technology questions for the country: what public health lessons could be learned from the pandemic; how scientific and technological breakthroughs could address climate change; how the United States could lead in technology, particularly in competition with China; how to ensure the fruits of science and technology were widely shared; and how to support the long-term health of the nation’s science and technology enterprise.

Levin stressed that these questions remained important and fundamental. But he noted what was absent: there was “absolutely no mention of artificial intelligence or the things that might be unlocked by artificial intelligence.” Within a year, he said, the council had a working group on AI, was writing reports on it, and was discussing it regularly. By the final meeting, a little over a year before the conference, members were asked what they thought would be most exciting in the next several years. Levin said the majority pointed to AI unlocking discoveries, acceleration, and innovation in their fields.

That change, for Levin, validated Stanford’s earlier decision to create the Human-Centered AI Institute and Stanford Data Science before AI’s current public prominence. He said Stanford did not “fully foresee” the current moment, but was “prescient” in creating structures around AI in 2017 or 2018. Two principles mattered in those structures: keeping humanity at the center, and making the work interdisciplinary.

Levin drew a line between what universities can do and what companies are likely to do in a period of rapid technological advance. A university’s distinctive advantage, in his telling, is the ability to bring people developing a technology together with people who hold deep disciplinary knowledge across many domains. The conference’s breadth—physics, math, biology, chemistry, climate, and engineering—was presented as an example of that institutional role.

The gathering, Levin said, was a chance to take stock, share knowledge, and spark research ideas that could lead to future innovations. His institutional claim was that when a technology moves this quickly, universities have a distinct role if they can convene across domains and connect technical work to disciplinary knowledge.

AI is being treated as a scientific instrument, but not like older instruments

Surya Ganguli placed AI in a lineage of scientific instruments that changed what humans could know. A slide put Galileo’s telescopes from 1609, Hooke’s microscope from 1665, and modern AI server racks under the claim “New tools open new vistas.” The telescope opened views into the outer reaches of the cosmos and changed humanity’s understanding of its place in the universe. The microscope opened the inner world of cells and contributed to germ theory, with large consequences for health and well-being.

AI, in Ganguli’s framing, is similarly consequential, but different in kind. Telescopes and microscopes extended sight across scale. AI helps detect, understand, and exploit complex, high-dimensional patterns in immense datasets that unaided human minds cannot grasp. That is what makes it, for Ganguli, a new instrument for scientific inquiry.

A slide organized the claim in two directions: “AI for Science” and “Science for AI.” AI for science was described as “a new instrument for discovery.” Science for AI was described as the idea that “the demands of science will drive better AI.” The slide’s summary was the reciprocal claim Ganguli wanted to foreground: “AI is enabling scientific discovery and scientific challenges will strengthen AI.”

Across the conference’s domains—life, Earth, and the universe—Ganguli identified a common pattern: AI helps researchers sift through large, complex data to predict, understand, and control nature. In life sciences, he pointed to foundation models of genomes, cells, and brains that may yield biological insights and therapeutics. For Earth, AI can model complex weather and climate processes. For the universe, it can help with questions from subatomic particles to cosmology and with the discovery of new mathematics, which he called “the very first language of the universe itself.”

Ganguli also separated AI for science from consumer AI. Scientific applications, he said, are not the same as generating “fun videos and songs.” Science demands rigor. He gave physics as the benchmark: quantities such as the electron magnetic moment can be measured to 13 decimal places. To meet that standard, he said, AI must become more explainable, trustworthy, data-efficient, and capable of handling uncertainty, causal reasoning, and experimental design.

13 decimal places
precision Ganguli cited for measurements such as the electron magnetic moment

The argument was also a recruitment pitch to AI researchers. If they want to work on difficult and consequential problems for both AI and society, Ganguli said, AI for science is a place to move. The constraints of science are not peripheral requirements; they are, in his words, AI’s “next crucible.”

Ganguli’s neuroscience examples show what digital twins can make possible

Surya Ganguli used his own neuroscience work to give substance to the claim that AI can become an instrument for discovery. In collaboration with experimentalists, he said, his group records from many neurons and builds “digital twins” of the brain: models of neural activity and behavior. Those twins can then be analyzed with explainable AI, and control theory can be applied to steer both the model and the biological system.

His formulation was deliberately linguistic: the goal is to learn the language of the brain and then “speak back to it in its own language.” The examples he listed spanned retina, perception, mouse vision, epilepsy, navigation, altered states, and primate visual neurons.

According to Ganguli, his group has built a digital twin of the primate retina that can reproduce decades of experiments in hours. He said they have also worked on reading the language of the brain and writing simple percepts directly into the brain by controlling only 20 neurons. He described work decoding what a mouse is seeing from brain activity, creating a digital twin of the epileptic brain that allows seizure control, and modeling mouse navigation well enough to predict brain activity patterns when a mouse enters a new environment for the first time.

Ganguli’s most provocative examples concerned subjective experience and neural description, and he presented them as examples from his group’s line of work rather than settled general capabilities. He said they can give a mouse ketamine and see neural correlates of the resulting “out-of-body experiences,” raising questions about the sense of self that he suggested may now be answerable. He also described recent work combining digital twins of monkey brains with vision-language models so that monkey visual neurons can “speak” in natural language about what they like to fire to.

He treated this not as a finished endpoint but as a direction. One speaker later in the day, Andreas Tolias, would say more about the visual-neuron result. Ganguli speculated that there may be a future in which AI helps animals “talk to us” by telling humans what they are thinking. In context, the speculation rested on a specific technical chain: measurement of neural activity, model-based digital twins, explainable AI, and language models that convert model behavior into interpretable descriptions.
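To make that chain concrete, here is a minimal, hypothetical sketch of the digital-twin loop. It is not Ganguli’s method; real systems fit deep networks to large-scale recordings rather than the linear toy model below. The shape of the loop is the same: fit a predictive model of neural responses to stimuli, run virtual experiments on the fitted model, and use it to choose stimuli expected to drive the real system in a desired way.

```python
import numpy as np

# Toy "digital twin" loop (illustrative only; real systems fit deep networks
# to recordings from many neurons, not a linear model).
rng = np.random.default_rng(1)

n_stim, n_feat, n_neurons = 500, 10, 5
stimuli = rng.normal(size=(n_stim, n_feat))

# Unknown ground-truth tuning of the "biological" neurons.
true_w = rng.normal(size=(n_feat, n_neurons))
responses = stimuli @ true_w + rng.normal(0.0, 0.1, (n_stim, n_neurons))

# Step 1: fit the twin -- a predictive model of response given stimulus.
w_hat, *_ = np.linalg.lstsq(stimuli, responses, rcond=None)

# Step 2: probe the twin in silico -- a "virtual experiment" on candidate
# stimuli without touching the animal.
candidates = rng.normal(size=(10_000, n_feat))
predicted = candidates @ w_hat

# Step 3: "speak back" -- pick the stimulus the twin predicts will maximally
# drive neuron 0, then present that stimulus to the real system.
best = candidates[np.argmax(predicted[:, 0])]
print("predicted response of neuron 0:", (best @ w_hat)[0])
```

In the work Ganguli described, the explainable-AI and language-model layers would sit on top of step 2, translating what the fitted model responds to into human-readable descriptions.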

Astrophysics has data-rich instruments before it has the inference machinery it needs

Risa Wechsler grounded the AI-for-science argument in cosmology and astrophysics. Her motivating questions were basic and large: What is the universe made of? Why is the universe’s expansion accelerating? Is the standard cosmological model right at roughly the 1% level, or is there new physics? What is dark matter—what kind of particle, what mass, and whether it interacts with normal matter beyond gravity? How do galaxies, including the Milky Way, form and evolve?

The common feature of those questions, she said, is that they require extracting physics from very large, complex datasets, sometimes involving billions of objects and measurements from many kinds of instruments. AI’s value is not simply that it can process more data; it may allow scientists to ask those questions in new ways.

Wechsler pointed to the Vera C. Rubin Observatory as an example of the decade’s changing data regime in astrophysics and cosmology. She described the LSST camera as the world’s largest digital camera and said it had been installed on the Vera Rubin Observatory telescope in Chile. The accompanying slide described a 3,200-megapixel camera, a novel three-mirror design, and a 10-year survey that will image the entire Southern sky every few nights.

| Rubin survey feature | Value or description |
| --- | --- |
| Camera | Largest digital camera ever built; 3,200 megapixels |
| Survey duration | 10 years |
| Coverage | Entire Southern sky every few nights |
| Nightly data volume | About 20 TB per night |
| Final scale | About 20 billion galaxies and 20 billion stars |
| Nightly changes | Several million changing objects or events |
Figures from Wechsler’s remarks and the Rubin Observatory slide shown during the opening

The result, Wechsler said, will be both an extraordinarily deep map and a movie of everything in the sky that changes. She described about 20 terabytes of data per night being shipped from Chile to SLAC in Menlo Park, reprocessed, and sent out to the world within minutes. The speed matters because some of the millions of nightly events—asteroids, exploding stars, merging neutron stars, and possible new discoveries—are important enough that astronomers may want to point other telescopes on Earth and in space at them immediately.
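For a rough sense of cumulative scale, a back-of-envelope calculation is useful; the roughly 300 observing nights per year below is an assumption for illustration, not a figure from the talk.

```python
# Back-of-envelope: total raw data volume for a 10-year survey at ~20 TB/night.
TB_PER_NIGHT = 20
NIGHTS_PER_YEAR = 300  # assumed; weather and maintenance cut into 365
YEARS = 10

total_tb = TB_PER_NIGHT * NIGHTS_PER_YEAR * YEARS
print(f"{total_tb:,} TB ≈ {total_tb / 1_000:.0f} PB")  # 60,000 TB ≈ 60 PB
```

At that rate, the survey accumulates on the order of tens of petabytes of raw images, which is why nightly reprocessing and distribution have to be automated end to end.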

Rubin is only one of several transformative surveys arriving in astrophysics this decade, alongside Roman, JWST, DESI, gravitational wave observatories, and radio observatories, as listed on the slide. Wechsler described the field as “very, very lucky” to be entering an extremely data-rich regime. But the harder problem, in her account, is turning that data into physical discovery.

Prediction alone is not sufficient for her science. The slide stated the need for “calibrated posteriors, physics-based models, and uncertainty that propagates from raw data to physical parameters.” Wechsler emphasized uncertainty quantification because the field is trying to measure physical quantities with high precision. The question is how to combine theory-driven and data-driven approaches: foundation models might help synthesize data, while forward models encode physics and instruments.
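As a toy illustration of what calibrated posteriors with uncertainty propagating from raw data to physical parameters mean in practice, the sketch below infers one parameter of a made-up forward model from noisy data. Everything in it is hypothetical and far simpler than any real cosmological pipeline, but the structure matches the slide’s demand: the physics sits in an explicit forward model, and the output is a posterior rather than a point prediction.

```python
import numpy as np

# Toy "forward model + calibrated posterior" (illustrative only). One
# physical parameter theta is inferred from noisy data via an explicit
# forward model, so measurement uncertainty propagates into a posterior.
rng = np.random.default_rng(0)

def forward_model(theta, x):
    """The physics lives here: map parameter theta to predicted observables."""
    return theta * x**2  # stand-in for a real physical model

# Simulated "raw data": truth plus known Gaussian measurement noise.
x = np.linspace(0.1, 1.0, 50)
theta_true, sigma = 0.7, 0.05
data = forward_model(theta_true, x) + rng.normal(0.0, sigma, x.size)

# Grid posterior: flat prior times Gaussian likelihood.
thetas = np.linspace(0.0, 2.0, 2001)
log_like = np.array(
    [-0.5 * np.sum((data - forward_model(t, x)) ** 2) / sigma**2 for t in thetas]
)
post = np.exp(log_like - log_like.max())
dtheta = thetas[1] - thetas[0]
post /= post.sum() * dtheta  # normalize to a proper density

mean = (thetas * post).sum() * dtheta
std = np.sqrt(((thetas - mean) ** 2 * post).sum() * dtheta)
print(f"theta = {mean:.3f} +/- {std:.3f}  (truth: {theta_true})")
```

Real pipelines replace the grid with simulation-based or hierarchical inference, but the requirement Wechsler named is visible even here: the error bar on theta is derived from, and calibrated against, the noise in the raw data.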

Her conclusion was a constraint on generic AI optimism. AI offers powerful new tools, but they are not off-the-shelf tools for astrophysics. The science requires new inference layers, built for the questions and standards of the field.

Human understanding remains the standard, not an optional interface

The organizers repeatedly returned to the role of human understanding. Landay said the conference was about “what people understand, not what AI understands.” Risa Wechsler made the same point through the practice of science: AI changes what becomes tractable, but it does not decide which problems matter or what their answers mean.

Her questions were institutional as much as technical. What does AI make newly possible in science, and what remains hard? How can human decision-making and understanding remain central in scientific work? How can AI be used to help researchers think more deeply, rather than merely more quickly? What kinds of judgment and trust are needed in tools and processes? How should institutions and incentive structures change to provide them?

AI does change what problems are tractable, but it doesn’t tell us what problems matter.

Risa Wechsler

That theme connected the scientific examples to the institutional announcement without collapsing them into one claim. Ganguli argued that science will force AI to become more trustworthy, explainable, uncertainty-aware, and causally capable. Wechsler argued that fields such as cosmology need inference systems that preserve uncertainty and connect data to physics. Landay argued that a university institute should support open research, public-interest applications, independent evaluation, and global engagement.

The program’s architecture followed those tensions: AI for Life, AI for Earth, AI for the Universe, and a panel on the role of human understanding in the future of scientific discovery. Wechsler’s closing invitation asked what is newly possible in science and “how do we bring the tools, institutional structures, and habits of mind together to do it well?” The way science is done, she said, is changing quickly, and the requirements for doing it well—for science and for society—are “just being built.”
