Neuroevolution Offers AI a Path Beyond Bigger Models

Risto Miikkulainen · Eye on AI · Tuesday, June 2, 2026 · 19 min read

Risto Miikkulainen, a UT Austin professor and vice-president of AI research at Cognizant AI Labs, argues that neuroevolution offers a different path for AI than simply scaling larger models. In a conversation with Craig Smith, he says gradient descent is well suited to optimizing toward known targets, but population-based evolutionary search is better for problems where the goal is uncertain, the landscape is irregular, and useful solutions may require diversity, novelty and recombination.

Evolution is useful where the target is not already obvious

Risto Miikkulainen frames neuroevolution as a search method for problems where a human designer does not know the right answer in advance, and where following a local improvement signal is likely to miss better possibilities elsewhere. The central difference, in his account, is not that evolution is magic or that gradient descent is obsolete. It is that evolution searches differently.

The reason this matters now, rather than only as a long-running research tradition, is that Miikkulainen sees three shifts arriving together: evolution strategies are being applied to billion-parameter models, simulated worlds are making it easier to evaluate agent behavior, and the field is trying to give neuroevolution a common textbook-and-software foundation.

Gradient descent begins with an individual solution and improves it step by step, using a gradient to move toward a known objective. Even reinforcement learning, which includes exploration, is still trying to infer where the gradient is and move in that direction. Evolutionary search begins with a population: 30 agents, 100 agents, perhaps 1,000. Those candidates are spread across the solution space as widely as possible. Instead of a single system climbing locally, the method maintains many alternatives and evaluates which ones are worth preserving.

That population structure matters because the search spaces Miikkulainen is interested in are often too large, jagged, or poorly understood for humans to navigate directly. A gradient may point toward a nearby improvement while missing a larger opportunity across a rough part of the landscape. Evolution can maintain candidates in areas reinforcement learning might never reach. In genetic-algorithm-style approaches, it can also recombine two good but different solutions, creating an offspring that inherits parts of both encodings and, potentially, parts of both abilities.

What's really different about evolution is that it's a population-based method. So you don't have just a single agent. You have 30, or you have a hundred, or maybe a thousand agents. And you spread them out around the space of solutions as widely as possible.

Risto Miikkulainen

Miikkulainen’s strongest claim is that this difference produces creativity: solutions that surprise the people who set up the search. He says that in evolutionary computation, “human competitive” results are routine enough to have a competition category: designs at least as good as, and ideally better than, human-designed ones. For him, that surprise is the point. The computer is not merely executing a program; it is discovering behavior the programmer did not explicitly put in.

His own path into the field began in the 1980s, combining evolutionary computation and neural networks. He describes NEAT, the algorithm developed with Ken Stanley when Stanley was his PhD student, as an early success because it was robust, easy to apply, and required relatively little parameter tuning. NEAT evolves the connectivity and architecture of comparatively small neural networks. It proved useful for agents whose value is expressed in behavior: robots, rockets, cars, virtual game characters, and other sequential decision-making systems.

That emphasis on behavior is central to much of Miikkulainen’s account of neuroevolution. Later, when he contrasts systems that “imitate the statistics of the world” with agentic AI, he places neuroevolution on the side of discovering how agents should act: what they should remember, how they should react to a changing environment, what kind of strategy they should follow. In the cases he finds most promising, the desired solution is not a known mapping from input to output. It is a decision process to be discovered.

Neuroevolution is not one algorithm or one architecture

Craig Smith raises the question of whether evolutionary AI is an architecture issue, an algorithm issue, or something that can sit on top of existing models such as transformers. Miikkulainen’s answer is deliberately broad: evolutionary optimization can be applied to almost anything that can be mutated or crossed over. Strings, trees, programs, physical designs, neural network weights, and neural architectures can all be search objects if they can be encoded.

Neuroevolution is the subset in which the evolving objects are neural network encodings. Those encodings may be simple concatenations of weights in a fixed architecture, or graph-like representations of connectivity. Some methods evolve only weights. Others evolve architecture. NEAT evolves both weights and topology, customizing a network’s structure to a task. Recurrence is especially important in this view because it determines what a system remembers and how it uses the past.

But Miikkulainen separates that older, smaller-network tradition from another branch: using evolution to optimize architectures for deep learning systems that are then trained with gradient descent. This includes neural architecture search and meta-learning. Evolution may search over activation functions, loss functions, modular structure, layer layouts, channels, or other design choices. Once a candidate architecture is selected, gradient descent can still set the weights.

That makes neuroevolution less a rival school than a design and discovery mechanism that can be combined with transformers, convolutional networks, diffusion models, reinforcement learning, biology-inspired models, and large language models. Smith characterizes it as a “horizontal” technology rather than one school of AI, and Miikkulainen accepts the framing with an important qualification: it is not the solution to everything. If the task is modeling the statistics of a large dataset, deep learning is “perfectly fine” and often excellent. Neuroevolution is most compelling, he says, when there is an opportunity for creative discovery and little certainty about what the right solution should look like.

Search object	How Miikkulainen describes it	Typical role
Weights in a fixed network	A standard architecture is encoded as parameters that evolution mutates or optimizes.	Adjust behavior without changing the architecture.
Connectivity and topology	Methods such as NEAT evolve the structure of relatively small networks, including recurrence.	Discover task-specific behavioral circuitry.
Deep-learning architecture	Evolution searches over design choices such as layers, modules, losses, activations, or channels.	Use evolution for architecture design and gradient descent for weight training.
Programs or code	Evolutionary programming and genetic programming evolve code or code-like representations.	Discover executable strategies or algorithms.

Miikkulainen treats neuroevolution as a family of search methods, not a single model architecture.

The practical consequence is that neuroevolution can operate at multiple levels. It can optimize a small controller directly. It can search over a deep-learning system’s architecture. It can tune a large pretrained model for a specific behavior. It can evolve the code that generates a strategy. In each case, the common feature is not the representation but the evolutionary loop: generate variation, evaluate candidates, preserve or recombine what works, and continue searching.
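
The loop named above (generate variation, evaluate, preserve or recombine, continue) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a method from Miikkulainen's work: the genome is a plain list of floats, and the function names, the population size of 30, the mutation scale, and the `neg_sphere` test objective are all chosen here for the example.

```python
import random

def evolve(fitness, genome_len=10, pop_size=30, generations=50,
           mutation_std=0.1, elite_frac=0.2, seed=0):
    """Toy evolutionary loop over lists of floats (illustrative only)."""
    rng = random.Random(seed)
    # Spread the initial population widely across the search space.
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(genome_len)]
           for _ in range(pop_size)]
    n_elite = max(2, int(elite_frac * pop_size))
    for _ in range(generations):
        # Evaluate candidates and preserve the best ones unchanged.
        pop.sort(key=fitness, reverse=True)
        elites = pop[:n_elite]
        children = []
        while len(children) < pop_size - n_elite:
            a, b = rng.sample(elites, 2)
            # Recombine: each gene is inherited from one of two good parents.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            # Generate variation: small Gaussian mutation on every gene.
            child = [g + rng.gauss(0.0, mutation_std) for g in child]
            children.append(child)
        pop = elites + children
    return max(pop, key=fitness)

def neg_sphere(genome):
    """Hypothetical objective: higher is better, optimum at all zeros."""
    return -sum(g * g for g in genome)

best = evolve(neg_sphere)
```

Because the elites are carried over unchanged, the best fitness in the population never gets worse from one generation to the next. The same loop applies to any representation, as long as `fitness`, the crossover step, and the mutation step are swapped for ones that suit that encoding.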

The scaling problem has become a new opening

For years, neuroevolution faced an obvious scaling objection. Modern neural networks contain billions of parameters. Classical neuroevolution methods such as NEAT typically evolved networks with thousands or perhaps tens of thousands of values. Gradient descent scales because backpropagation supplies information about how each weight should change. Evolution, by contrast, appears to require mutations or crossovers that somehow get many parameters right without that direct gradient signal.

Miikkulainen says recent work has made that problem more interesting. His group at Cognizant AI Labs, and later groups including Oxford and Nvidia, have used a form of evolutionary optimization called evolution strategy to optimize billions of parameters. The method creates a cloud of candidate solutions around the current best solution, evaluates them, and moves in the direction indicated by the better candidates. It searches in parameter space, not action space, and can change every weight using evolutionary optimization.
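
The mechanism described here (a cloud of candidates around the current solution, evaluated and then shifted toward the better ones) can be illustrated with a basic Gaussian-perturbation evolution strategy. The sketch below is an assumption-laden toy, not the Cognizant, Oxford, or Nvidia implementation: it optimizes five parameters rather than billions, and `toy_reward`, `sigma`, `lr`, and the population size are hypothetical choices made for the example.

```python
import random

def evolution_strategy(reward, theta, pop_size=50, sigma=0.1, lr=0.05,
                       iterations=200, seed=0):
    """Basic evolution strategy in parameter space (illustrative only)."""
    rng = random.Random(seed)
    theta = list(theta)
    n = len(theta)
    for _ in range(iterations):
        # Cloud of random perturbations around the current solution.
        noises = [[rng.gauss(0.0, 1.0) for _ in range(n)]
                  for _ in range(pop_size)]
        # Each candidate gets only a scalar performance signal, no gradient.
        rewards = [reward([t + sigma * e for t, e in zip(theta, eps)])
                   for eps in noises]
        baseline = sum(rewards) / pop_size
        # Shift every parameter toward perturbations that beat the average.
        for i in range(n):
            direction = sum((r - baseline) * eps[i]
                            for r, eps in zip(rewards, noises))
            theta[i] += lr * direction / (pop_size * sigma)
    return theta

def toy_reward(params):
    """Hypothetical reward: highest when every parameter equals 3."""
    return -sum((p - 3.0) ** 2 for p in params)

tuned = evolution_strategy(toy_reward, [0.0] * 5)
```

Note that `reward` is treated as a black box: the update uses only which perturbations scored above or below the population average, which is why the same scheme can in principle be pointed at any measurable behavior.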

He describes this as both old and new. Evolution strategies date to the 1950s and 1960s, with variants such as CMA-ES using covariance information to infer useful directions of change. What is new is their application to very large models. Cognizant’s work applied the approach to fine-tuning open-source models such as Qwen and LLaMA after pretraining. Other groups, he says, have begun exploring even pretraining from scratch with evolution strategies, although he describes that as cutting edge and not yet settled.

Billions of model parameters: the scale Miikkulainen says evolution strategies can now optimize

In fine-tuning, the setup remains recognizable: a base model is evaluated on examples that express the desired behavior, and candidate parameter changes are rewarded when they perform better. The task might be a particular kind of math reasoning or producing shorter, more direct answers. The evolution strategy does not receive a gradient explaining how each weight should move toward the correct answer. It receives a performance signal: this solution did better on average than the others.

The comparison with gradient descent is not cleanly resolved. Miikkulainen concedes that there is still a landscape of rewards, and that the field does not yet fully understand why population-based search sometimes appears to find more principled changes. His explanation is that the landscape may be rough enough that following the gradient is misleading. A sufficiently broad cloud of candidates can jump over small jagged features and sense larger reward structures that local gradients obscure.

The distinction inside evolutionary computation matters. Genetic algorithms are closer to the intuitive Darwinian picture: many candidates are placed broadly in the search space; the better ones are selected; and their encodings can be recombined, allowing large jumps when two different parents have solved different aspects of a problem. Evolution strategies are more local and do not use crossover in the vanilla version Miikkulainen describes. They still use a population, but it is a cloud around the current best solution, evaluated and shifted toward better regions of parameter space.

He leaves open the possibility that future large-parameter evolutionary methods may combine the scalability of evolution strategies with mechanisms from other evolutionary approaches, including recombination. The point is not that one flavor has displaced the other. It is that different population-based mechanisms expose different kinds of search behavior, and the field does not yet know which combinations will work best at modern model scale.

The pandemic case shows both the promise and the bottleneck

The most concrete public-policy example is the pandemic decision-making system Miikkulainen’s group built for non-pharmaceutical interventions. The decisions included whether to close schools, stop buses, wear masks, or do contact tracing. The system used data on cases, deaths, hospitalizations, and government actions around the world. Because governments tried different interventions, there was diversity in the data, and the system could learn from those differences.

Miikkulainen says the group could train the system overnight and produce country-specific suggestions the next morning for any country where they had data. The output was not just a generic recommendation. It could run scenarios relevant to a country and evaluate intervention choices against projected consequences. In the decision-copilot framing, a health official could see both a recommended action and the expected effects, then adjust assumptions or policies and compare the predicted outcomes.

He contrasts the technical readiness of that system with the difficulty of getting governments to listen. Other groups faced similar communication barriers. At UT Austin, he mentions Lauren Meyers’s group, which had studied pandemics for decades and had very good understanding. That group had an audience in Austin city government; according to Miikkulainen, Austin did well compared with other cities because officials listened to science.

His group aimed globally, which was harder. Calling national leaders and getting attention was not realistic. The one success he identifies was Iceland. In fall 2021, Iceland was deciding what to do as schools opened after summer. Through an academic contact, Rúnar, the group had access to government channels reaching the health ministry and even the prime minister. They ran models relevant to Iceland, tested scenarios, and made suggestions. Miikkulainen says that, as far as he knows, those suggestions were communicated to ministers and some were followed.

The point he draws is not that AI can solve geopolitics or that governments will automatically adopt technical recommendations. When the possibility of using evolutionary decision systems for problems such as the Russia-Ukraine war comes up, Miikkulainen is cautious. He says the technology is most applicable when goals are clear and agreed upon. If a society decides what it wants to optimize—profit, production, scientific progress, equality, wealth distribution, environmental protection—then AI can search for decision strategies to get there. But conflicts with many personal, political, and institutional agendas are harder.

His larger claim is that the technology is ready for some well-matched problems, while communication and institutional adoption are not. The pandemic system was, in his view, technically ready. The hard part was making science legible and credible to the people with authority to act on it.

Diversity is the mechanism, not a side effect

A recent stock-trading competition supplies one example of why Miikkulainen thinks diversity matters. Smith refers to an Alpha Arena competition in which an unnamed “mystery model” outperformed other competitors, with what he calls forensic footprints pointing toward neuroevolutionary AI. He also mentions Julian Togelius’s work applying evolutionary methods to financial trading, including evolving the code that outputs a strategy rather than simply evolving the strategy itself.

Miikkulainen does not claim knowledge of that specific model. He treats the example as a natural fit for evolutionary computation. Trading rewards difference. If a strategy is the same as everyone else’s, it offers little edge. If an optimizer simply trains on the same historical data in the same way as other participants, it may discover similar behavior. Evolutionary systems, by design, maintain diversity and can search for strategies that others would not anticipate.

That is not merely a property of finance. Miikkulainen says evolutionary computation often needs explicit mechanisms to preserve diversity, because diverse candidates are what make discovery possible. One of the major developments of the past decade, in his account, is novelty search and quality diversity. Novelty search rewards candidates for being different from what has already been seen, even without directly rewarding performance. The surprise is that such novelty can produce stepping stones: unusual intermediate solutions from which better solutions later evolve.
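
A minimal sketch of the novelty idea follows, assuming a one-dimensional behavior and a novelty score defined as the mean distance to the nearest behaviors already archived. The function names, the choice of `k`, and the mutation scale are hypothetical, and no performance measure appears anywhere in the selection step.

```python
import random

def novelty(behavior, archive, k=3):
    """Mean distance from a behavior to its k nearest archived behaviors."""
    if not archive:
        return float("inf")  # nothing seen yet, so everything is novel
    nearest = sorted(abs(behavior - other) for other in archive)[:k]
    return sum(nearest) / len(nearest)

def novelty_search(behavior_of, pop_size=20, generations=30,
                   sigma=0.2, seed=0):
    """Select purely for being different from what was seen before."""
    rng = random.Random(seed)
    pop = [rng.uniform(-1.0, 1.0) for _ in range(pop_size)]
    archive = []
    for _ in range(generations):
        # Rank candidates by novelty only; task performance is never used.
        ranked = sorted(pop, key=lambda g: novelty(behavior_of(g), archive),
                        reverse=True)
        # Remember the most novel behavior of this generation.
        archive.append(behavior_of(ranked[0]))
        parents = ranked[: pop_size // 2]
        pop = [rng.choice(parents) + rng.gauss(0.0, sigma)
               for _ in range(pop_size)]
    return archive

# Hypothetical behavior characterization: the genome is its own behavior.
archive = novelty_search(lambda g: g)
```

The archive plays the role of the stepping-stone collection: it records unusual behaviors that a performance-only search would have had no reason to keep.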

Quality diversity combines that novelty pressure with performance pressure. A system can reward both being new and being good. Miikkulainen presents this as one of the ways evolution discovers genuinely surprising outcomes. It avoids narrowing the search too early around what already looks promising.
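
One common way to combine the two pressures is to divide behavior space into niches and keep only the best candidate found in each, in the style of the MAP-Elites algorithm. The source does not say which quality-diversity method Miikkulainen has in mind, so the sketch below is an assumed illustration: the two-gene genome, the `descriptor` and `fitness` functions, and the ten-bin grid are all hypothetical.

```python
import random

def quality_diversity(fitness, descriptor, n_bins=10, iterations=2000,
                      sigma=0.1, seed=0):
    """MAP-Elites-style sketch: one elite per behavior niche."""
    rng = random.Random(seed)
    elites = {}  # niche index -> (fitness, genome)
    for _ in range(iterations):
        if elites and rng.random() < 0.9:
            # Mutate a randomly chosen elite, clamped to [0, 1].
            _, parent = rng.choice(list(elites.values()))
            genome = [min(1.0, max(0.0, g + rng.gauss(0.0, sigma)))
                      for g in parent]
        else:
            # Occasionally inject a fresh random candidate.
            genome = [rng.random(), rng.random()]
        # Diversity: the descriptor (in [0, 1]) decides which niche this is.
        niche = min(n_bins - 1, int(descriptor(genome) * n_bins))
        # Quality: within a niche, only a better candidate replaces the elite.
        score = fitness(genome)
        if niche not in elites or score > elites[niche][0]:
            elites[niche] = (score, genome)
    return elites

elites = quality_diversity(
    fitness=lambda g: -(g[1] - 0.5) ** 2,   # hypothetical quality measure
    descriptor=lambda g: g[0],              # hypothetical behavior measure
)
```

The result is a collection of good solutions that differ in behavior, rather than a single winner, so the search does not narrow early around whatever looked promising first.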

If you want diversity, then that is where evolution actually steps in and can give you that.

Risto Miikkulainen

The same logic explains his view of evolving code. Genetic programming traditionally worked in languages such as Lisp because their structure made recombination easier. More recently, he says, there are languages designed to be evolved, built to be compositional so that parts of one program can be combined with parts of another. The recombination can be random, but he insists that this is not random search. The parents have already been selected because they are good. Random recombination supplies the creativity: it avoids inserting the designer’s assumptions about which pieces should go together.

The offspring must still be viable. Some representation is needed so that recombination produces functional code rather than unusable fragments. But the principle is the same as in neural networks or trading strategies: preserve what works, recombine without overconstraining the result, and evaluate many offspring so that occasional improvements can drive the population forward.

Research systems are another place to search

AlphaEvolve is one example of evolutionary computation reappearing outside its original research community. Risto Miikkulainen says he was not involved in the Google DeepMind project, but treats it as evidence that evolutionary computation is being discovered by people outside the field because it fills a missing niche. In his telling, that is a sign of usefulness rather than a matter of disciplinary ownership: researchers who were not primarily evolutionary-computation people found the technique because it solved a problem they had.

Sakana AI is discussed in similarly cautious terms. Several of Miikkulainen’s co-authors on the neuroevolution book are from Sakana, and Craig Smith describes a Sakana system that initiated a research problem, designed experiments, wrote a paper, and submitted it to a conference that Smith thinks may have been ICML or another major venue. Smith says the paper was accepted and then withdrawn, and characterizes it as “not a groundbreaking paper,” while adding that acceptance itself was interesting. Miikkulainen’s response is brief but clear: applying these methods to research is a “tremendous opportunity.”

The most ambitious research targets he describes are not merely automating paper generation. They include continual learning and metacognition: systems that know what they know. Miikkulainen doubts these will simply emerge from current architectures. Instead, he argues that researchers should look to biological neural networks, because they are the only examples we have of systems that handle memory, continual adaptation, and self-knowledge robustly.

Neuroevolution enters, in his proposal, as the mechanism for searching over architectures when researchers can specify constraints and primitives but not the right design. A researcher might set up an environment that requires continual adaptation, then evolve architectures that cope with it. Or one might require a system to answer questions about its own performance: whether it really knows a fact, whether an answer was correct, whether it made something up.

Miikkulainen’s proposed experimental style is deliberately modest at first. He tells students the first experiment should be so small they would be embarrassed to describe it. For metacognition, such an experiment might train a network to answer a question and then answer whether it was right. If a standard architecture does not handle that behavior, evolution could search for circuitry that does.

He says neuroscience may supply useful constraints: feedback loops, reverberating circuits, spiking neural networks, or other mechanisms. But he also suggests that evolution could explore beyond what neuroscience currently measures. Neuroscientific theory is constrained by available tools, such as electrodes and MRI. Evolutionary simulations might propose computational structures that give neuroscientists new hypotheses.

The hippocampus is his preferred starting point because its circuitry is relatively well studied and it is tied to spatial representation and memory. Rather than beginning with language, which he calls the top of human cognition, one could begin with navigation: remembering where objects are, returning to locations, distinguishing remembered knowledge from confabulation. Such a domain could support experiments on memory, continual learning, introspection, and metacognition without requiring an LLM interface.

From there, Miikkulainen’s longer-term interest extends to the evolution of language. Earlier work has evolved communication among virtual creatures, but he says it usually stops at signaling. Language requires grammar, flexible structure, roles, and other features beyond animal communication. With modern compute and simulated worlds, he thinks researchers can begin asking why language evolved, why it evolved only in humans, what other communication systems are possible, and whether human language is the best one.

World models matter because agents need places to act

World models enter Miikkulainen’s account because evolutionary systems need domains where candidate behaviors can be tested, and because AI is moving from imitation toward agency. Models that imitate known statistics can be trained from data. Agents change the world. They make decisions, receive consequences, interact with other agents, and must reason about future effects. For that, he says, they need a model of the world, whether learned from data, built as a simulator, or provided as an environment such as Minecraft.

Miikkulainen says Cognizant is not currently developing its own world model, but uses world models for agentic AI. Minecraft has been one of the most versatile such environments available, in his account, though he says the landscape has changed in recent months as more models have appeared. Smith mentions systems that can generate effectively unbounded visual or spatial worlds, and Miikkulainen calls that a “great opportunity” for evolutionary methods.

The evaluation requirement is central to Miikkulainen’s account of where neuroevolution can be used. Evolution needs a domain of interaction and a way to measure performance. That measurement does not have to come directly from historical data. It can come from simulation. In finance, one might backtest or trade and observe performance. In healthcare, one cannot simply try arbitrary treatment strategies on real patients; Miikkulainen says a surrogate model is needed. If data can train a surrogate model, evolution can search for decision strategies inside it.
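
The surrogate pattern can be sketched as two steps: fit a predictive model to historical records, then run the evolutionary search against the model instead of the real world. Everything concrete below is an assumption made for illustration: the synthetic `history` data, the nearest-neighbor surrogate, the single scalar decision, and the function names. A real system would use a learned model over far richer decisions and contexts.

```python
import random

def make_surrogate(history, k=5):
    """k-nearest-neighbor predictor built from (decision, outcome) records."""
    def predict(decision):
        nearest = sorted(history, key=lambda rec: abs(rec[0] - decision))[:k]
        return sum(outcome for _, outcome in nearest) / len(nearest)
    return predict

def search_in_surrogate(predict, lo, hi, pop_size=20, generations=40,
                        sigma=0.05, seed=0):
    """Evolve a decision using only the surrogate's predicted outcomes."""
    rng = random.Random(seed)
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=predict, reverse=True)
        parents = pop[: pop_size // 4]
        # Mutate good decisions, keeping them inside the allowed range.
        children = [min(hi, max(lo, rng.choice(parents) + rng.gauss(0.0, sigma)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=predict)

# Synthetic stand-in for historical data: noisy outcomes that peak
# when the decision is near 0.7 (a made-up relationship).
data_rng = random.Random(1)
history = []
for _ in range(200):
    d = data_rng.uniform(0.0, 1.0)
    history.append((d, -(d - 0.7) ** 2 + data_rng.gauss(0.0, 0.01)))

predict = make_surrogate(history)
best_decision = search_in_surrogate(predict, 0.0, 1.0)
```

No candidate decision is ever tried on the real system during the search. The quality of the result is therefore bounded by how well the surrogate reflects reality, which is why the human review step described in the article still matters.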

This is where he sees immediate applications: business, healthcare, medicine, science, transportation, marketing allocation, and clinical-trial design. The common structure, as Miikkulainen describes it, is not that these domains are “AI problems” in the abstract. It is that they contain decisions with measurable consequences, and that direct real-world experimentation is expensive, slow, risky, or impossible. A surrogate world lets evolutionary search run many alternatives before humans decide whether and how to act.

The near-term business case is decision support, not replacement

Miikkulainen describes Cognizant as a large company of about 350,000 people moving from staff augmentation and consulting toward AI transformation. Cognizant AI Labs’ role is to provide technology for that shift. Its current priorities include multi-agent systems and neuroevolutionary decision strategies for business.

He does not claim neuroevolution should replace all business decision-making. The most immediate applications he names are narrower: marketing budget allocation, transportation, clinical-trial design, and domains where current methods are relatively uninformed or where value can be shown against an existing baseline. He emphasizes small wins. AI is hard to deploy when it attempts to replace an entire established process. It is easier, he argues, to run alongside humans, replace a small piece, and gradually build confidence as value becomes clear.

That framing also shapes his view of decision copilots for executives or officials. A CEO might use such a system to work through possible decisions, but Miikkulainen says Cognizant already had a prototype of this general kind in pandemic decision-making, where the user was not a CEO but a health official. A key feature is not just that the AI recommends an action. It should immediately show the predicted effect of that action, and the human decision-maker should be able to alter the proposed decision and compare predicted consequences. That interactive loop matters because it lets the user test alternatives they believe might work and, if the model projects worse outcomes, potentially convince themselves that the recommendation is reasonable.

Cognizant’s commercialization path is still early. The lab has developed basic technology and is building teams that can implement it with customers. A product may come later, but current deployment still requires expert teams working with domain specialists. AI remains difficult to deploy, and Miikkulainen does not present neuroevolution as removing the need for domain knowledge.

Multi-agent systems are another part of the business direction. Miikkulainen describes a progression from big data to deep learning to LLMs to agentic AI, and then to multi-agent AI. Multiple agents can communicate, solve problems together, evaluate one another, and keep a system in check. Neuroevolution may have a role, he says, in creating creative agents that learn specialized decision strategies, or in fine-tuning models to reason expertly about particular domains such as medical knowledge.

The textbook is meant to give the field a common foundation

Miikkulainen and his co-authors have produced what Smith describes as an authoritative textbook for neuroevolution, analogous in ambition to Sutton and Barto’s role for reinforcement learning. Miikkulainen says there have been books on evolutionary computation, but not on neuroevolution specifically. The goal is to attract attention and give researchers enough shared background to extend the field.

The book covers the historical origins of the ideas and the current points of expansion: combinations of neuroevolution with reinforcement learning, deep learning, biology, generative AI, and LLMs. Miikkulainen frames the timing as important because many of these combinations are “breaking out” now. The effort also includes software and digital resources, including demos, student exercises, and a GitHub site for community contributions, paper pointers, and new developments.

The collaboration with Sakana AI was for the book rather than an ongoing Cognizant-Sakana project, though Miikkulainen leaves open future collaboration. He says the authors came together because they had different but compatible perspectives that, when combined, formed a useful foundation for the field.

His own research remains split between Cognizant and UT Austin. At UT, he says his group is currently focused largely on cognitive science: modeling human behavior, including patients with stroke or dementia, and earlier work on the visual cortex. He sees that work converging with neuroevolution because systems that exhibit behavior will eventually need cognitive aspects of behavior as well.

He also argues that AI’s current progress depends heavily on interaction between industry and academia. Industry supplies compute, data access, engineering resources, and systems that work. Academia can explore ideas that are further out, including architectures that are not yet feasible for industry deployment. Miikkulainen sees the relationship as synergistic: both publish, both attend the same conferences, and researchers increasingly keep one foot in each world.

That division of labor matters for neuroevolution because many of its most ambitious questions require scale. In the 1980s and 1990s, Miikkulainen says, researchers had good ideas in natural language processing but lacked the data and compute to know they would work. He sees evolutionary computation in a similar position: many ideas existed before the field could scale them. Modern compute, simulations, and large models make it possible to revisit those ideas at a different level.
