ChatGPT Lacks the Self-Generated Thought Required for Sentience

Terry Sejnowski | Eye on AI | Wednesday, May 27, 2026 | 15 min read

AI pioneer Terry Sejnowski argues that ChatGPT is neither a conscious mind nor a mere parrot, but an alien form of intelligence built from vast written knowledge and limited by the parts of biological intelligence it lacks. In a conversation with Craig Smith, the Salk Institute professor and Boltzmann machine co-inventor says current models can show creativity and a form of understanding, yet they have no organismic goals, no lived reinforcement, and no inner activity when not prompted. That absence of self-generated thought, he says, is the clearest reason ChatGPT is not sentient.

ChatGPT is not a human mind, and the comparison may be the wrong one

Terry Sejnowski treats ChatGPT less as a crude imitation of a person than as “an alien” that has arrived with language abilities humans recognize but an architecture and mode of existence they do not. The mistake, in his view, is to ask too quickly whether it is humanlike. It is “obviously not human.” The more useful question is what kind of artifact it is, what capacities it has, and which human words fail when applied to it.

The failure begins with “understanding.” Sejnowski argues that the term is too vague to bear the weight now being placed on it. Cognitive vocabulary inherited from psychology and everyday speech — intelligence, understanding, agency, consciousness — is not like the vocabulary of physics, where terms such as matter, mass, and energy have been tied down mathematically. AI has exposed that poverty. Humans do not even have settled scientific accounts of intelligence and consciousness across animals, let alone machines.

Craig Smith presses the point through AlphaGo’s move 37 against Lee Sedol: an unorthodox stone placement that no one initially understood and that later appeared to turn the game. Smith says that, at the time, it looked as if the system “understood,” even if that word remained unclear. Sejnowski’s answer is to separate several phenomena that are often collapsed together. One is creativity: AlphaGo’s move showed that machines could produce strategies not inherited from human play. Another is the method by which such creativity emerged in that setting: self-play, the system playing against itself and exploring possibilities unanchored from human convention.

Sejnowski says this was not entirely new. In the late 1980s and early 1990s, Gerald Tesauro applied learning ideas to backgammon, a game Sejnowski describes as in some ways harder than chess because dice make it probabilistic rather than fully deterministic. Tesauro’s system, by playing itself, reached championship-level performance and made critical decisions that experts judged better than human moves. Sejnowski uses the example to show that machine creativity had appeared well before today’s large language models: learning systems could become creative by escaping the narrow distribution of human examples.

The same general point appears in Sejnowski’s account of language. In the 1980s, he worked on NETtalk, a neural network trained to pronounce English text. English pronunciation is full of regularities, exceptions, and exceptions to exceptions; rule-based linguistics generated hundreds of pages of rules. NETtalk, with roughly 20,000 connections and 200 units, learned to pronounce words understandably by the end of a summer research project. At first it babbled; later it handled small words; eventually it generalized to new text. In retrospect, Sejnowski says, that was evidence that “networks love language.”

ChatGPT is, for Sejnowski, a much larger demonstration of that affinity for language. He calls it “tiny compared to the brain” in one respect — even a trillion weights is far below the brain’s “million billion” connections — but also “already super intelligent” in another: it has absorbed more explicit knowledge than any human brain could.

The system mirrors the prompt because persona is elicited

ChatGPT’s apparent personality is, for Sejnowski, one of the main reasons people misread it. It has been trained on thousands of books, novels, textbooks, articles, and other written material, and can adopt many personas because those patterns are represented somewhere in the model. But he does not treat those personas as evidence of a human self. When addressed, it responds within the frame the user gives it.

If the user says, “I want you to be a poet,” it produces one kind of answer. If the user says, “I want you to be a computer scientist,” it produces another. The prompt influences the persona, the quality of response, and the class of answer the user is likely to receive. Sejnowski’s conclusion is that using ChatGPT is “like looking into a mirror.” It is trying to respond as well as it can to the prompts and frame it receives.

This is why disagreements among academics about whether ChatGPT understands language can persist without resolution. Users encounter different behaviors because they elicit different behaviors. The model’s flexibility encourages one user to see a collaborator, another to see a parrot, another to see a confabulator, another to see a mind.

Sejnowski does not deny that next-token prediction is the surface mechanism. But he follows Geoffrey Hinton’s argument that the ability to predict the next word increasingly well requires an internal model of meaning. Predicting the appropriate next word is not just a syntactic act. A system must model how a word fits into a sentence, how the sentence fits into context, and what is being communicated. If the training continues and performance improves, the internal model becomes better. At some point, Sejnowski says, it produces reasonable words for the meaning of the sentence, and that is “a form of understanding.”

That formulation leaves open depth and kind. Sejnowski stresses that even among humans, understanding is not one thing. A physicist may understand the composition of material better than a layperson, while a carpenter may understand wood better than the physicist because he knows grain, cutting, varieties, and practical behavior. Understanding varies by purpose, domain, embodiment, and experience. ChatGPT’s understanding may be real at some level without being human understanding.

Smith frames the opposing intuition through the “stochastic parrot” argument: if the model is choosing tokens based on probabilities in training data, what we call understanding may simply be the cases where those probabilities align with what humans judge correct. Sejnowski’s reply is not that probability is irrelevant, but that successful prediction may depend on a richer internal model than the phrase “predicting the next token” suggests. Humans also rely on learned weights, expectations, and prior experience. The question is what internal model supports successful prediction and how much semantic structure that model contains.
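To make the "probabilities in training data" picture concrete, here is a minimal sketch of the shallowest possible next-word predictor: a bigram counter. It is an illustration only, not how ChatGPT works; the function names and the toy corpus are invented for this example. It shows what pure frequency matching looks like, which is the baseline the debate is about.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prev):
    """Turn raw follow-up counts into a probability distribution."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()} if total else {}

def predict(counts, prev):
    """Return the most probable next word, or None if the word was never seen."""
    probs = next_word_probs(counts, prev)
    return max(probs, key=probs.get) if probs else None

# Hypothetical toy corpus, chosen only so the counts are easy to check by hand.
corpus = "the fly can fly . the fly can find food . the computer can not fly ."
model = train_bigram(corpus)

print(next_word_probs(model, "the"))  # "fly" seen twice after "the", "computer" once
print(predict(model, "the"))          # most frequent follower of "the"
print(predict(model, "sandwich"))     # never seen in training, so no prediction
```

A predictor like this has no representation of context beyond the previous word and cannot say anything about words it has not seen. The disagreement Smith and Sejnowski describe is over how much more than this a large model must build internally in order to keep improving its predictions.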

Language, in Sejnowski’s account, is not primarily syntax. Syntax helps carry meaning, but meaning depends on far more: context, emphasis, tone, facial expression, and other dimensions humans often process subconsciously. The written text on which large language models are trained captures only part of that larger communication system, but it captures enough accumulated meaning to make the models unusually capable.

Hallucination is the other side of creativity

Sejnowski resists treating hallucination as merely a defect. His point is that the same generative capacity that lets a model produce novel, plausible continuations also lets it produce plausible things that never happened.

When ChatGPT hallucinates, he says, it does not simply emit nonsense. It invents things that could exist or could have happened. That plausibility is what makes hallucination dangerous, but it is also what connects it to creativity. A system constrained only to reproduce what it has already seen would not generate the surprising move, the unexpected analogy, or the useful new formulation.

Humans do something similar. Sejnowski uses the medical term “confabulate” for cases where people make things up while believing them to be true. He mentions Korsakoff’s syndrome, where patients may lack the ability to judge their own statements against reality. Smith adds a more ordinary example: he might remember clearly what shirt he wore at a fifth birthday party, say it was blue, and then find a family photograph showing it was orange. The internal image can be vivid and still false.

The analogy is not meant to erase differences between human memory and model output. It is meant to unsettle the idea that hallucination alone proves the absence of intelligence or understanding. Human cognition also blends reconstruction, plausibility, and error.

For Sejnowski, current models remain sharply limited because they do not receive the kind of continual corrective feedback humans do. Children are corrected by parents, peers, teachers, consequences, and social signals: don’t say that, don’t do that, that is dangerous, that is good, that is bad. ChatGPT, as described here, does not get that kind of lived reinforcement simply by being used. Once trained, it remains the same model unless changed by its builders.

Writing made language accumulative, and models train on that accumulation

Sejnowski agrees with Smith’s suggestion, drawn from James Gleick’s account of information, that written language transformed thought by making speech inspectable. Spoken words vanish quickly; memory mutates them, and retellings change details. Writing allows humans to place thoughts outside themselves, return to them, analyze them, correct them, and build on them.

For Sejnowski, the decisive human breakthrough was not only spoken language but written language. Speech evolved; writing was invented. Humans had to adapt the visual system for reading, and they must go to school for years to master it. Scientific reading adds more layers: jargon, specialized vocabulary, papers, and concepts that take long practice to parse.

Writing did two things that matter for AI. First, it allowed knowledge to accumulate. Second, it allowed knowledge to be corrected. A claim could be recorded, challenged, investigated, and revised. Someone could return to what “Mr. X said about Y,” test it, and explain why it was wrong. In science, writing transmitted not only thoughts but observations, experiments, and replicable procedures. Other people could do the same experiment or build the same thing.

Large language models are trained on this written accumulation. They do not merely learn from ephemeral speech; they absorb patterns from humanity’s preserved textual record, including the forms by which people state, contest, revise, and transmit knowledge. That helps explain why their abilities feel discontinuous. A model trained on vast written corpora has access to patterns no individual human could absorb directly.

But the same point sharpens the distinction between model and person. Written language made human thought more powerful because humans can read, reflect, test, act, and update themselves over a lifetime. A trained model, in Sejnowski’s account, does not continue learning from the user by default. The human brain is changed by what it reads. ChatGPT, after a session ends, has not learned from the exchange in that same way.

The case against sentience is not that the model is stupid; it is that nothing happens when it stops

Sejnowski’s argument against ChatGPT sentience is categorical. It is not based on poor performance, hallucination, or lack of humanlike embodiment alone. It is based on the absence of self-generated activity.

ChatGPT, he says, is primarily a model of the cerebral cortex: a large knowledge base. But the cortex is only one of roughly a hundred brain parts essential to human function. “Most of what’s in the brain is missing from ChatGPT.” His list of missing ingredients begins with goals. Nature gave animals goals: survival, reproduction, eating, sleeping, social belonging, and other evolved drives. Children are then shaped by corrective social feedback about what is acceptable, dangerous, good, or bad.

ChatGPT was not trained with those lived goals. Its basic training objective was to predict the next word. Sejnowski calls that a “superficial” goal compared with biological goals. Human brains also use prediction, including reward prediction error: they predict a reward, compare it with the reward received, and adjust weights to make better decisions next time. The basal ganglia implement value functions for future rewards, and Sejnowski notes that AlphaGo used a value function in an analogous way.
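The predict, compare, adjust loop described above can be sketched as a simple delta-rule update. This is a deliberately simplified illustration, assuming a single scalar value estimate and a fixed reward. The names `update_value` and `learning_rate` are hypothetical, and the sketch is not a model of the basal ganglia or of AlphaGo's value function.

```python
def update_value(value, reward, learning_rate=0.1):
    """One learning step: nudge the predicted reward toward the reward received."""
    prediction_error = reward - value  # reward received minus reward predicted
    new_value = value + learning_rate * prediction_error
    return new_value, prediction_error

value = 0.0   # initial predicted reward (assumed starting point)
errors = []
for _ in range(50):
    # The same reward arrives every trial, so the prediction should converge on it.
    value, err = update_value(value, reward=1.0)
    errors.append(err)

print(f"first error: {errors[0]:.3f}, last error: {errors[-1]:.3f}")
print(f"learned value: {value:.3f}")
```

The prediction error is large at first and shrinks as the estimate approaches the actual reward, which is the sense in which the error signal drives learning. What the sketch leaves out is exactly what Sejnowski emphasizes: in an animal, the rewards themselves come from evolved goals, not from a number supplied by a programmer.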

But the “showstopper” for sentience is what happens after an answer ends. A user prompts ChatGPT; it generates word after word; it finishes politely; then it stops. What happens inside the network then? Sejnowski’s answer: “Nothing. Nothing is going on in that network.” The only time anything happens, in his account, is when the user provides a question or prompt.

Humans are different in the absence of input. Put a person in a room with no sensory stimulation, and thoughts continue. They plan, remember, worry, imagine, feel hungry, think about errands, replay a movie, or dwell on something heard the day before. Much of this ongoing activity is emotionally charged; sometimes it becomes pathological, as in anxiety or other mental disorders. But for Sejnowski it is central to the biological condition: humans have a constant flow of thoughts without direct external prompting.

This is why he calls the sentience question “a no-brainer.” In his view, ChatGPT is not sentient “in any sense of the word” that applies to humans. It lacks autonomous, self-generating inner activity. It lacks organismic goals. It lacks the embodied regulation that living systems require.

That does not make current models trivial. Sejnowski’s claim is narrower and stronger: a system can display impressive language ability, creativity, and some form of understanding while still lacking sentience.

Agents add goals, but not necessarily agency in the biological sense

Craig Smith raises the obvious next question: if current models lack goals, what happens when they are wrapped in agent systems that are given goals, execute tasks continuously, and hand outputs to other agents? Could the combination of language-model understanding, goals, and collaboration lead to unexpected behavior?

Terry Sejnowski does not dismiss the uncertainty. His answer begins with humility: “Nobody can predict what will happen.” ChatGPT itself was a surprise; language translation breakthroughs were surprises; more surprises are likely. He expects unintended consequences, and he does not claim to know what the real threat will be.

But he cautions against overreading agentic scaffolding. Two ChatGPTs talking to each other, he says, are mathematically like “one giant GPT.” Externalizing steps or chaining systems may add capabilities, but it does not automatically create human agency. The agents are not behaving like human agents, because the underlying missing pieces remain missing.

Agency, like understanding, is a spectrum word. Sejnowski allows that there may be “a little bit of agency” in such systems, and that it might be amplified. But measured against humans and other autonomous species, current systems remain far along the non-biological end of the spectrum. They are not organisms maintaining themselves in a world. They do not need to eat, avoid predators, reproduce, repair damage, navigate terrain, or regulate internal states.

The risk, then, is not simply the science-fiction image of a fully conscious machine deciding to take over. Sejnowski’s concern is more open-ended: powerful systems will be deployed in complex social and technical environments, and some unintended consequence may emerge that was not the one people expected.

The fly exposes what digital computers still do badly

Terry Sejnowski’s most concrete argument for nature-inspired AI comes from an encounter at MIT during the early neural-network period. He and others working on neural networks in the 1980s were not taken seriously by the dominant AI community. He describes them as “the furry little mammals under the feet of the dinosaurs.” Invited to give a distinguished lecture at MIT’s AI lab, he was told just before a lunch discussion that the faculty “hate what you do.”

With only minutes to decide how to begin, he noticed a fly circling the sandwiches. That became the argument. The source foregrounds the contrast visually as well: a macro image of a fly with the text “It can fly,” “It can find food,” and “It could reproduce,” followed by a server-rack image labeled “SUPER COMPUTER $100M.”

The fly, Sejnowski said, could fly, find food, and reproduce, and it was doing all of that with roughly 100,000 neurons. The point of comparison was the Cray X-MP supercomputer in the basement, which cost $100 million. It could not fly, could not see, and could not reproduce. “What’s wrong with this picture?” Sejnowski asked. The room went silent.

The responses clarified the old AI worldview. One senior faculty member said they had not written the vision program yet. Sejnowski replied that DARPA had put billions into computer vision and the program still had not been written; the problem was growing combinatorially and not converging. Another faculty member appealed to Turing and universal computation: digital computers can in principle run any program, so eventually the program would be found. Sejnowski answered that the issue is not only whether a computation is possible, but how fast the answer arrives. In nature, speed can determine whether an animal is eaten.

The decisive response came from a student at the back of the room: the digital computer can run any program, but not efficiently; the fly brain can run only its program, and it runs it very well. Sejnowski says the student had the point exactly right. Nature builds special-purpose systems. Digital computers are general. Their generality is powerful, but the architecture itself gives no clue about how to solve a particular biological problem efficiently.

In the fly, by contrast, “the algorithm is literally the hardware.” The structure of connections embodies the solution. Reverse-engineering the fly brain could reveal how vision, decision-making, and other capacities are implemented in a compact, efficient system.

This is the core of Sejnowski’s nature-inspired direction. Current language models are “completely helpless by themselves.” They depend on humans for power, inputs, improvement, and the technological cocoon that sustains them. Even a field mouse, by comparison, is vastly more autonomous: it finds food, moves through tunnels, interacts with the physical world, and survives amid material complexity.

Animals must act quickly and efficiently amid materials, hazards, food sources, and other organisms. Sejnowski’s claim is not that AI should copy biology superficially, but that biology contains working solutions to autonomy, perception, action, and decision under constraint. Current generative AI has captured part of cortical knowledge processing; it has not captured the broader architecture of living intelligence.

Jobs will change because the tool changes

Asked for the forward-looking conclusion of his book, Terry Sejnowski turns first to jobs. He says people are worried because of what they read about obsolescence, superintelligence, and machines taking over. His advice is direct: “You’re not going to lose your job, but your job is going to change.”

The change, in his framing, is the introduction of a new tool. His analogy is ditch digging. If the job is digging with bare hands, the work is difficult, slow, and unpleasant. Then someone invents the shovel. At first the worker may not know how to use it; skill must be learned. Once mastered, the shovel lets the worker dig faster, deeper, and more efficiently. It does not mean the shovel has replaced the worker. The worker is still using the tool, not the other way around.

ChatGPT, in this analogy, is a tool that can make many recurring tasks easier and faster. Sejnowski does not present it as replacing the person who knows what the work is for. He presents it as a capability that changes how the person works. It will require users to learn new skills, especially prompting. He devotes a chapter of his book to what he calls the “power of the prompt,” because interacting with ChatGPT is not like using a typewriter. The user must frame the task, steer the persona, correct direction, and learn how to get useful work from the system.

His second message is more expansive: AI can make people smarter. The internet already made him, in his phrase, “for all practical purposes” omniscient, because he could search the world’s information. ChatGPT makes that access more efficient by answering English questions rather than keyword queries. It gathers material, exposes the user to more knowledge, and routes that knowledge into the brain, where the human continues learning.

This is also where the human-machine difference reappears. Humans can learn throughout life. Their brains absorb new information and change. ChatGPT, once trained, does not simply become a different model because it had a conversation with a user. If future AI agents are to absorb information from interaction and use it later, Sejnowski says that will have to change.

He also expects interfaces to change. Keyboards and typewriters belong to an older interaction style; spoken interaction with digital computers is already becoming possible. He mentions a former postdoc whose company, Soft Eye, argues that cell phones may become obsolete and be replaced by AI-mediated interaction. Sejnowski does not insist that this prediction will be right, but he treats it as part of a broader transition: the tools through which humans access computation and knowledge are still changing.
