Orply.

Stanford HAI

Stanford HAI studies, guides, and develops human-centered AI technologies and applications.

AI Research Challenge Draws 200 Teams to Study Organizational Change

Stanford HAI and Google DeepMind’s AI for Organizations Grand Challenge is presented as an effort to study AI’s effects on organizations directly, rather than treating workplaces merely as places where AI tools are deployed. Melissa Valentine and other organizers argue that the central questions are how AI changes coordination, collaboration, alignment and collective performance, with DeepMind positioned not only as sponsor but as a research setting. The scale of the response — about 200 teams from more than 150 universities, narrowed to 13 finalists — is used to show broad academic demand for that inquiry.

Martin Gonzalez · Chris Watkins · Anita McGahan · Simon Bouton · Steve Perry · Robert Sutton · Rebecca Karp · Jun 8, 2026 · 5 min read

Responsible Mental Health AI Depends on Measurement, Co-Design, and Trust

At Stanford’s 2026 AI for Mental Health Symposium, Carolyn Rodriguez, Ehsan Adeli, Brandon Staglin and Vaile Wright argued that the urgent question is no longer whether people will use AI for mental health, but whether the field can make that use safe, clinically meaningful and trustworthy. The panel’s case was that responsible deployment will require measurable standards for quality and harm, early involvement from clinicians and people with lived experience, regulatory and payment systems that support trust, and designs that strengthen rather than replace human relationships.

Brandon Staglin · Ehsan Adeli · Vaile Wright · Carolyn Rodriguez · Jun 8, 2026 · 19 min read

Mental Health AI Is Scaling Before Its Safety Framework Is Settled

At Stanford’s 2026 AI for Mental Health Symposium, Russ Altman, Jina Suh and OpenAI’s Sara Johansen treated mental-health AI as a deployment problem already underway, not a speculative research agenda. Suh argued that general-purpose AI systems are now part of a public-health surface and should be evaluated across users’ full journeys, including consent, referrals, aftermath and the labor pushed onto clinicians, crisis lines, families and reviewers. Johansen described OpenAI’s effort to manage that risk through layered model and product policies that route people toward human support, while acknowledging the difficulty of doing so at platform scale.

Russ Altman · Jina Suh · Sara Johansen · Jun 8, 2026 · 14 min read

LLMs Play Games Better When They Write Simulators First

DeepMind research scientist Wolfgang Lehrach argues that language models should not be asked to play games directly when their outputs are slow, strategically weak, or illegal. In a Stanford HAI seminar, he presents Code World Models, which use LLMs to translate natural-language rules and play traces into executable game simulators that planners such as Monte Carlo Tree Search or reinforcement learning can use. He also describes Autoharness, a narrower system that synthesizes code to check action legality, as part of the same broader case for turning LLM knowledge into executable structure rather than immediate moves.

Wolfgang Lehrach · Jun 5, 2026 · 17 min read

AI Is Moving Deeper Into Science, but Validation Remains the Bottleneck

At AI+Science: AI for the Universe, Kyle Cranmer, Carina Hong and Douglas Finkbeiner argued that AI is already embedded in scientific work, but its value depends on where validation happens. Cranmer framed physics applications around prediction and inference, where formal checks, simulator calibration or uncertainty correction determine whether model output can support scientific claims. Hong made the parallel case in mathematics, where Lean-style formal proof gives some AI results a clean score but leaves problem selection and theory-building with experts. Finkbeiner said astronomy’s newer disruption is the desk-level AI collaborator, which can improve research work while increasing the need for verification and scientific judgment.

Kyle Cranmer · Douglas Finkbeiner · Benjamin Nachman · Carina Hong · May 15, 2026 · 23 min read

AI Tools Target Labeling, Simulation, and Scaling Bottlenecks in Research

At Stanford’s second AI+Science lightning-talk session, three researchers presented AI less as a general-purpose scientific shortcut than as infrastructure for specific measurement problems. Matt DeButts argued that PRC-linked patronage can reshape Chinese-language media markets by helping already favorable outlets survive; Samuel Young showed how self-supervised learning can extract particle structure from unlabeled detector data; and Benjamin Dodge described using AI-scale computation to make Gaussian process priors practical for 3D maps of Milky Way dust. The shared claim was that AI’s value depended on a sharply defined bottleneck: too many articles to label, too few reliable detector labels, or too large an inference problem for conventional computation.

Risa Wechsler · Samuel Young · Matt DeButts · Benjamin Dodge · May 15, 2026 · 8 min read

AI Is Pushing Science Beyond the Paper as Its Core Artifact

In closing remarks from an AI and science meeting, Risa Wechsler argued that AI is reshaping scientific fields unevenly, depending on their data, theory and modes of inquiry, and that scientists should use the moment to choose structures aligned with human values. Surya Ganguli pushed the question toward scientific communication itself, suggesting that papers may be too narrow an artifact for AI-assisted science and that richer institutional records of research could better transfer knowledge. Both framed AI for science as a design problem around human purposes, not just faster automation.

Surya Ganguli · Risa Wechsler · May 15, 2026 · 5 min read

AI Is Making Scientific Throughput the New National Advantage

Darío Gil, the U.S. Department of Energy’s Under Secretary for Science, used his AI+Science keynote to argue that AI is shifting scientific advantage from access to instruments and computing toward the throughput of integrated discovery systems. He presented DOE’s Genesis initiative as the national-scale architecture for that shift, linking data, AI models, high-performance computing, experimental facilities, and industry partners into closed-loop workflows. Gil’s case was that the test is not more papers, but whether faster scientific cycles can produce measurable gains in productivity, security, and industrial capability.

Darío Gil · Risa Wechsler · May 15, 2026 · 13 min read

Stanford Merges AI and Data Science Institutes Around Open Scientific Discovery

Stanford’s AI+Science Conference opened with James Landay announcing that the university is merging the Human-Centered AI Institute and Stanford Data Science into a single institute for AI and data science across Stanford. Landay, president Jonathan Levin, Surya Ganguli and Risa Wechsler framed the move around a common argument: AI is becoming a scientific instrument, but one that will require open research, domain-specific rigor, uncertainty-aware methods and human judgment about which questions matter.

James Landay · Risa Wechsler · Surya Ganguli · Jonathan Levin · May 15, 2026 · 12 min read

AI-for-Science Advances Depend on Evaluation, Not Just Generation

In a Stanford AI+Science lightning-talk session introduced by Surya Ganguli, four young researchers made a common case: AI-for-science is useful only when paired with rigorous evaluation. Aishwarya Mandyam, Amar Venugopal, Steven Dillmann and Aldís Elfarsdóttir each treated AI systems or outputs as claims to be tested, whether through uncertainty estimates for clinical policies, causal checks on generated text, executable benchmarks for scientific agents, or empirical links between corporate climate language and later emissions.

Aishwarya Mandyam · Surya Ganguli · Aldís Elfarsdóttir · Amar Venugopal · Steven Dillmann · May 15, 2026 · 7 min read