AI Automation Is Expanding the Human Work Layer

Dan Shipper · Lenny's Podcast · Sunday, May 24, 2026 · 29 min read

Dan Shipper, co-founder and CEO of Every, argues that the next phase of AI at work will not be a simple substitution of machines for people. Drawing on Every’s use of agents across a 30-person media and software company, he says better automation is creating more human work around framing, supervising, integrating, and judging AI output. His forecast is that agents will become shared company infrastructure and daily work surfaces, while SaaS, product managers, designers, and forward-deployed engineers remain central because someone still has to decide what should be built and trusted.

Automation creates a larger human layer, not a smaller one

Dan Shipper’s central claim is deliberately paradoxical: better automation is not making his company smaller or his own workload lighter. Every, the media and software company he runs, has roughly doubled from about 15 people to almost 30 in the past year while becoming more AI-forward. Its employees include engineers, designers, writers, editors, salespeople, and customer-service people, and Shipper describes all of them as AI early adopters.

~30 people

Every’s current size, up from about 15 a year earlier, according to Shipper

That matters because Shipper is not presenting AI adoption as a distant management theory. He says Every has tried to create “a little pocket of the future,” where the company’s job is not to prognosticate but to “live in it together.” The company tries new releases from model companies, sometimes in alpha or beta, and then writes about what it notices. That was the context for his earlier call that non-engineers were “sleeping on Claude Code,” a view Lenny Rachitsky says proved “unbelievably right” after Anthropic and OpenAI moved toward broader nontechnical uses of coding agents.

Shipper’s new forecast begins from a tension he thinks benchmark discourse misses. He accepts that model capability is improving quickly. He cites a METR benchmark in which, according to his description, an Anthropic model preview could do 17-hour tasks at 50% accuracy, and says model progress is “going up exponentially.” But his lived experience contradicts the “everyone gets replaced” conclusion: when agents do more work, humans take on more work managing, framing, supervising, and integrating that output, and raising the standard for what counts as good.

Automation is a lie. Every agent needs a human. We have so much automation, so much AI, and I also work way more.

Dan Shipper

His explanation is that automation does not remove the human layer; it changes what the human layer does. Every automated system needs someone “on top of it” making sure it works well. Shipper compares AI work to management: managers are not typically “on the beach”; they are checking in, assessing output, resolving ambiguity, improving systems, and making judgment calls. Managing models differs from managing people, but in his view it still consumes time and attention.

That premise sits underneath most of his predictions. The work does not disappear. It changes shape, and often expands. More people can create software, analyses, documents, and campaigns. That creates more review, more coordination, more integration, and more demand for people who can decide what should exist, what should be deleted, and what should be trusted.

Prediction	Mechanism Shipper describes	Implication for work
Company super-agents become common	Personal agents are still too fiddly; a shared agent can be maintained by someone who cares about it	Companies need operators or forward-deployed engineers responsible for agent reliability
Codex, Claude Code, and Cowork become work surfaces	An agent on the user’s computer can see files, browse, research, and act in parallel with the user	Knowledge work moves into environments where the agent and human work on the same object
SaaS survives agent adoption	Users bring their own AI into SaaS products rather than relying only on embedded product agents	SaaS companies build for humans and agents together, with new load, pricing, and support patterns
CLIs lose their status as the main paradigm	The terminal mattered because of access and tool use, not because text-only interfaces are ideal	The power of the CLI moves into graphical work surfaces
PMs and full-stack designers gain leverage	Building is cheaper, while taste, judgment, and product sense remain scarce	People who know what should be built can ship more directly
The job apocalypse is not the base case	Models commoditize yesterday’s competence while creating new integration and judgment work	Workers still need to change how they operate, but Shipper does not expect mass replacement

Shipper’s main forecasts, with the mechanism and practical implication he attaches to each

Work splits between the company agent and the local AI work surface

The future of knowledge work, in Shipper’s view, splits into two dominant modes.

The first is a shared company agent, probably in Slack, that employees can delegate work to. Shipper initially thought the model would be personal agents: every employee with their own assistant, a kind of parallel org chart in which each agent became a reflection of its owner. He compares that world to the daemons in The Golden Compass: “a little part of your soul.”

He has since changed his view. For now, he thinks the workable model is one “super-agent” for the company, with possible team-level agents underneath it. He says Shopify and Ramp have internal examples that point in this direction. The reason is not that personal agents are conceptually wrong; he still believes they are coming. The reason is operational. Current agents are “fiddly.” They break, need maintenance, require setup, and often demand technical skill that many employees do not have or do not want to spend time acquiring.

The mechanism, as Shipper puts it, is simple: “agents need people who care about them.” Once the human responsible for maintaining an agent stops caring enough to keep it working, the agent becomes less useful. That pushes companies toward a centralized model: appoint a forward-deployed engineer or similar operator whose job is to make the company agent reliable, useful, and broadly adopted. Over time, as models become more independent and the harnesses become less fragile, Shipper expects agents to “trickle down” into more specialized team and personal forms.

For work, he thinks this will mostly happen in Slack. He distinguishes it from what Every’s COO Brandon Gell calls “computer errands” — personal-agent tasks like ordering groceries — which may live elsewhere and remain separate from work agents. In companies, Slack is where people already coordinate and where a shared agent can sit in the flow of work.

The second mode is more radical: Shipper believes most work will happen inside local AI work surfaces such as Codex, Claude Code, or Claude Cowork. These tools will become, in his phrase, an “operating system” for knowledge work. Email, documents, research, app usage, and software work all move into an environment where an agent can see what the user is doing and act on the same computer.

Shipper’s account of how this happened starts with Claude Code. Anthropic put an agent on the user’s computer and gave it terminal access. Because models understand the terminal well, and because the terminal gives broad access to files and tools, this created a powerful coding paradigm. But users then discovered that a coding agent on a computer was useful for much more than code. Anthropic later built Cowork as a more polished wrapper around the same underlying idea.

OpenAI, in Shipper’s view, initially lagged on this paradigm. Earlier Codex versions were technically strong but too literal: “super smart” but hard to use for broad knowledge work. He thinks that changed with later Codex releases and the launch of the Codex desktop app. Codex is now his daily driver. He still uses Claude, but says OpenAI is currently “getting the paradigm right.” He also frames this as a “horse race,” with leadership likely to shift over time.

What makes the paradigm different is not merely that the agent can answer prompts. It is that the agent can watch, browse, research, manipulate files, and work in parallel with the user. Shipper describes writing inside Proof, an online Markdown editor he built, within Codex’s in-app browser. Codex can see what he is doing in the document; he can see what Codex is doing; the agent can write, research, use the computer, and return with results.

He gives the same pattern for email. He says he had been at inbox zero for 10 straight days, which he calls unusual for him, because Codex gathers his emails using Cora, Every’s email agent, renders a page, and lets him “monologue” instructions into it. For each email, he might tell the agent to research a question, gather documents from several years, produce a report, and send it. Tasks he would otherwise procrastinate on are less likely to stall.

The larger product prediction is that the industry may have had the browser-agent relationship backwards. For a long time, Shipper thought the optimal AI experience would be putting AI inside the browser. Now he thinks the stronger pattern is putting the browser inside the AI agent. The agent becomes the primary work environment; the SaaS tools run inside it.

That also changes how to interpret the command-line interface boom. Shipper is careful not to say CLIs will vanish. CLIs have been around for decades and will continue to be used. But he thinks the industry misread why Claude Code worked. Many people concluded the key was the CLI itself. Shipper thinks the real key was that the agent lived on the user’s computer, had broad access, and could operate tools well.

Once those benefits move into a graphical interface, he expects many users to prefer the GUI. “We made GUIs for a reason,” he says. A visual interface can preserve the same agentic power while becoming easier for non-programmer work. At Every, he estimates that most technical people are no longer using CLIs as their main work surface. Programmers may still flip into a terminal, but their primary environments are increasingly Codex, Claude Code, Cursor, and similar tools.

Cursor, in Shipper’s view, sees much of the same landscape, and he praises its cloud implementation as more advanced than OpenAI’s or Anthropic’s in some respects. He thinks Cursor has more deliberately chosen to serve programmers, and because the definition of “programmer” is expanding, that market can still be large. He is unsure, however, whether Cursor will move toward general work such as slide decks and broader productivity tasks. He also says Cursor was “essentially acquired by SpaceX,” while immediately qualifying that it was “not like a full acquisition, but it’s close”; the point he attaches to that claim is that companies are realizing the model alone is not enough. The model needs a harness.

The important strategic pattern, for Shipper, is that the model needs an environment in which it can run on a computer, use tools, maintain context, and complete tasks. He says the major platforms are moving beyond prompt-and-response toward model calls that run on cloud computers controlled by the provider and return results. He also points to Anthropic’s managed-agent direction and expects OpenAI to have its own response.

For Shipper, the “ultimate form” of that harness is not just coding. It is the ability to do any knowledge work. The CLI mattered because it revealed how powerful an agent can become when it has real access to the user’s tools. The next phase, he argues, is taking that access and making it usable in a broader interface.

Rachitsky highlights how substantial that shift would be: most people do not currently talk to a company AI agent in Slack every day, and most people do not currently spend the bulk of their work inside Codex, Claude Code, or Cowork. Shipper’s forecast is not just that AI becomes another feature. It is that the surface of work changes: the agent becomes the place where other tools are accessed.

SaaS does not disappear when agents become users

Shipper rejects the “SaaS apocalypse” view outright. His contrarian line is blunt: “I would buy SaaS stocks right now.” He frames that as a prediction, not investment advice, but the reasoning is specific. Agents do not eliminate SaaS; in his view, they increase the number of SaaS users by adding nonhuman users that operate at high volume.

This turns a common AI-product assumption upside down. Many SaaS companies are rushing to add their own embedded agents, assuming that the AI surface inside the product will become the main way users interact with them. Shipper thinks that may be less important if users bring their own agent — Codex, Cowork, Claude Code, or another work-surface agent — into the product. In that case, the SaaS company may not need to pay for the user’s tokens. The customer’s agent pays that cost through the customer’s model subscription or API usage.

That has a direct margin implication. If a SaaS company builds software that both humans and agents can use together, the company may avoid turning every interaction into its own token expense. Shipper says Proof has this dynamic: users bring their AI to Proof, so Every does not pay for their tokens.

He says Every’s own behavior supports his bullishness. Despite being heavily agent-based internally, Every still pays for “a ton of SaaS,” and its SaaS spend is up year over year. The company is not “vibe-coding every single little thing.” Shipper expects many companies to behave similarly: use agents to do more work, but continue paying for software systems where the work happens.

The change is what SaaS companies need to build for. Traditional productivity tools are built mostly for humans. New command-line interfaces are often built for agents operating independently. Shipper thinks the next paradigm is collaborative: the human and the agent are on the same piece of work together. The agent may use a CLI or API; the human may use the web interface; both need to stay in sync.

That synchronization is not a minor implementation detail. If the human edits through the product’s visual interface while the agent edits through a CLI, both sides need to see the same state quickly enough to collaborate. A command the agent runs should show up for the user. A change the user makes should be available to the agent. If the tool has separate surfaces for humans and agents, they cannot behave like disconnected products bolted onto the same backend. Shipper’s practical advice to builders is to assume the human and agent are working together, not that the agent is off in a separate lane.
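The synchronization requirement can be illustrated with a minimal sketch, assuming a single versioned document that both the web interface and the agent's CLI write to. The class and method names here (`SharedDocument`, `apply`, `subscribe`) are hypothetical and are not taken from Proof or any other product; the sketch only shows one common way to keep two writers on the same state, by rejecting edits made against a stale version and notifying every listener of each accepted change.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SharedDocument:
    """Hypothetical single source of truth shared by a human UI and an agent."""
    text: str = ""
    version: int = 0
    listeners: List[Callable[[str, int, str], None]] = field(default_factory=list)

    def subscribe(self, listener: Callable[[str, int, str], None]) -> None:
        # Both the web UI and the agent would register here to see every change.
        self.listeners.append(listener)

    def apply(self, actor: str, base_version: int, new_text: str) -> int:
        # Reject edits that were made against an out-of-date copy.
        if base_version != self.version:
            raise ValueError(
                f"{actor} edited version {base_version}, current is {self.version}"
            )
        self.text = new_text
        self.version += 1
        for notify in self.listeners:
            notify(actor, self.version, self.text)
        return self.version


seen = []
doc = SharedDocument()
doc.subscribe(lambda actor, version, text: seen.append((actor, version)))

v1 = doc.apply("human", 0, "Draft intro")              # edit from the web UI
v2 = doc.apply("agent", v1, "Draft intro, tightened")  # edit from the CLI
```

In this sketch an agent that tries to write against version 0 after the human has already saved would get an error rather than silently overwriting the human's work, which is the failure mode the paragraph above warns about.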

That affects product design in a concrete way. Shipper says some features in legacy editors may become less necessary because the agent can handle them. Proof, for example, does not need to reproduce every Word-like formatting affordance if the agent can do formatting. A product can be simpler and faster to start if the agent can do the mechanical manipulation that used to require dense interface controls.

But other affordances become more important. The user needs visibility into what the agent is doing. The agent needs visibility into what the user is doing. The product needs to handle proposed changes, completed changes, approvals, logs, and rollback. If an agent can make many changes at once to a document, deck, or codebase, a product needs a way to show those changes to a human in a coherent, reviewable form. A conventional multiplayer cursor is not enough when the collaborator can transform the entire object in seconds.

Approval flows become part of the product surface rather than an afterthought. Shipper describes the need for “a sort of inbox” that summarizes what is going to happen or what has happened. For a document, that may mean grouping many proposed edits into a reviewable set. For a product configuration, it may mean showing the user the downstream effects before they accept. For a codebase or data workflow, it may mean logs that make the agent’s actions reconstructable and rollback mechanisms that can undo bad changes quickly.
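One way to picture the “inbox” is as a queue of grouped change sets with a log and a snapshot-based undo. The following is a minimal sketch under that assumption; `ChangeSet`, `ReviewInbox`, and their fields are illustrative names, not an API Shipper describes, and a real product would need per-edit review, permissions, and durable storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ChangeSet:
    """A group of agent edits presented to the human as one reviewable unit."""
    summary: str
    edits: Dict[str, str]  # field name -> proposed new value


@dataclass
class ReviewInbox:
    state: Dict[str, str]
    pending: List[ChangeSet] = field(default_factory=list)
    log: List[Tuple[str, str]] = field(default_factory=list)
    snapshots: List[Dict[str, str]] = field(default_factory=list)

    def propose(self, change: ChangeSet) -> None:
        # The agent queues its edits instead of applying them directly.
        self.pending.append(change)
        self.log.append(("proposed", change.summary))

    def approve(self, change: ChangeSet) -> None:
        # Snapshot first so the whole change set can be undone in one step.
        self.snapshots.append(dict(self.state))
        self.state.update(change.edits)
        self.pending.remove(change)
        self.log.append(("applied", change.summary))

    def rollback(self) -> None:
        # Restore the state as it was before the most recent approval.
        self.state = self.snapshots.pop()
        self.log.append(("rolled_back", "restored previous snapshot"))


inbox = ReviewInbox(state={"title": "Q3 plan", "body": "old text"})
change = ChangeSet(
    summary="Retitle and rewrite body",
    edits={"title": "Q3 strategy", "body": "new text"},
)
inbox.propose(change)
inbox.approve(change)
inbox.rollback()
```

The log keeps the sequence of proposals, approvals, and rollbacks, so the agent's actions stay reconstructable even after a change is undone.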

It also affects infrastructure. Agents can make “a billion requests in like three seconds,” Shipper says. He attributes current GitHub pressure partly to agents using GitHub on behalf of people, describing usage as rising because “it’s really just people’s agents using GitHub.” The broader point is that agent-driven load creates pricing, reliability, rate-limit, and support questions that do not look like ordinary human usage. If the agent is both a user and an amplifier of human intent, then usage volume may rise even when employee headcount does not.
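Bursty agent traffic is the kind of load a rate limiter is meant to absorb. As a hedged illustration, here is a minimal token-bucket sketch; the token bucket is a standard technique chosen for this example, not something Shipper or GitHub is described as using, and the capacity and refill numbers are arbitrary. Time is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Allows short bursts up to `capacity`, then throttles to the refill rate."""
    capacity: float
    refill_per_second: float
    tokens: float = 0.0
    last_refill: float = 0.0

    def __post_init__(self) -> None:
        # Start with a full bucket so an initial burst is permitted.
        self.tokens = self.capacity

    def allow(self, now: float) -> bool:
        # Add tokens for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per agent credential: 5-request burst, then 1 request per second.
bucket = TokenBucket(capacity=5, refill_per_second=1)
burst = [bucket.allow(now=0.0) for _ in range(8)]  # 8 requests at the same instant
later = bucket.allow(now=2.0)                      # one more request 2 seconds later
```

Here the first five simultaneous requests pass, the next three are rejected, and a request two seconds later passes again. A human clicking through a web interface rarely hits such a limit, while an agent can hit it immediately.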

One support example from Every illustrates the loop he expects. In Proof and some other Every products, when a user has a problem, their agent can send a bug report. Shipper says an agent bug report is “way better than a human bug report” because it can include exact repro steps, what the agent did, and even a theory of what is happening in the open-source codebase. That report can become a GitHub issue, and then another agent can be sent to fix it. He sees the beginnings of a fast closed loop: a user’s agent finds a bug or paper cut, talks to the company’s agent, and the company’s agent fixes it.
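The contents Shipper lists for an agent bug report (exact repro steps, what the agent did, and a theory of the cause) map naturally onto a structured record. The sketch below assumes a hypothetical `AgentBugReport` type and a plain-text issue body; it does not reflect Every's actual report format, and the example values are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AgentBugReport:
    """Hypothetical structure for a bug report filed by a user's agent."""
    title: str
    repro_steps: List[str]
    agent_actions: List[str]
    hypothesis: str

    def to_issue_body(self) -> str:
        # Render the report as plain text suitable for an issue tracker.
        lines = ["Steps to reproduce:"]
        lines += [f"{i}. {step}" for i, step in enumerate(self.repro_steps, start=1)]
        lines.append("")
        lines.append("What the agent did:")
        lines += [f"- {action}" for action in self.agent_actions]
        lines.append("")
        lines.append(f"Hypothesis (unverified): {self.hypothesis}")
        return "\n".join(lines)


# Invented example values, for illustration only.
report = AgentBugReport(
    title="Document fails to save after bulk edit",
    repro_steps=["Open a document", "Apply 200 edits in one request", "Reload the page"],
    agent_actions=["Sent the bulk edit through the API", "Observed a timeout response"],
    hypothesis="The save handler may time out on large change sets.",
)
body = report.to_issue_body()
```

A second agent assigned to the resulting issue would start from the same fields, which is what makes the closed loop Shipper describes plausible.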

Shipper also argues that “two agents are better than one.” If a user’s Codex or Cowork agent interacts with a product’s agent, it can provide far more context than the user would type into an onboarding form. Instead of building a traditional setup flow that asks who the user is, what they want, and what outcome they are seeking, a product could assume the user arrives with an agent that already knows their projects, preferences, and constraints.

That assumption changes onboarding. In Shipper’s example, a technical product that might normally ask a user to fill out a workflow or web form could instead give them a prompt to paste into Codex or Cowork. The user’s agent then talks to the app or the app’s agent, explains who the user is, describes relevant work, configures the setup, and brings the result back. If something goes wrong, the user can ask their own agent to diagnose it by talking to the product. The product does not need to extract every bit of context from the human through forms because the human’s agent can carry that context into the interaction.

The result is not a SaaS product disappearing. It is a SaaS product becoming a collaborative surface for humans and agents. Shipper’s view is that the winners will not simply graft chatbots onto old interfaces. They will decide where humans need judgment, where agents need direct access, how both stay synchronized, and how the product absorbs high-volume agent usage without becoming expensive or unreliable.

Benchmarks miss the work of framing the problem

Shipper uses his own failed launch of Proof to explain why he remains bullish on engineers even as coding models improve.

Proof was “vibe coded” on the side while he was running Every. It worked internally and with beta testers, but after launch it began going down repeatedly. Shipper says he could not fix it. Codex would claim to know the problem and fix it, but would create new errors, leaving him in a loop. He jokes that he “vibe coded so hard” he got bursitis in his elbow.

To turn the experience into a benchmark, he had two senior engineers independently rewrite the codebase. Their work became the human comparison. He then gives new models the same codebase and asks: this is vibe-coded slop; if you wanted to rewrite it from first principles, how would you do it?

According to Shipper, earlier models scored around 30 out of 100. Human senior engineers scored in the high 80s or low 90s. He says GPT-5.5, using what he calls an Opus 4.7 plan, scored around 62. What distinguished GPT-5.5, in his telling, was “agency and confidence”: it was willing to rip out old code and rewrite from first principles, while other coding models tended to patch around the edges even when told not to.

62/100

Score Shipper says GPT-5.5 reached on his senior-engineer benchmark

Shipper thinks it is clear that models will soon reach “senior engineer level” on that benchmark. But that does not mean they replace senior engineers. He says he could change the benchmark and “zero out” the current model by testing a higher-level skill: not whether it can execute a well-framed rewrite task, but whether it can decide on its own that the right move is to reject the prompt’s frame.

His original production prompt was not “rewrite this from first principles.” It was more like: we had several reported issues yesterday; go through them, make a plan, and fix them. Shipper predicts every coding model on the market will still, a year from now, take that instruction seriously and try to fix the listed issues. A human senior engineer, by contrast, may inspect the codebase and say: this is fundamentally broken; we need a risky rewrite; you may not want to hear that, but it is necessary.

That distinction is central to his skepticism about simple benchmark interpretations. Benchmarks rise on problems humans have framed, articulated, and scored. But a large part of human work is deciding what problem should be framed in the first place, what the score should be, and when the stated task is the wrong task. Once that human work is written down, it can be benchmarked. Before it is written down, it is harder to measure.

Shipper also rejects a clean “AI versus human” framing because in real work the human is usually using AI too. He clarifies that the senior engineers in his benchmark were not writing every line “artisanal” by hand. They used AI. The difference was that they used it with expertise he did not have: understanding the codebase, knowing when to rewrite, and applying judgment. In practice, he says, the comparison is one human using AI versus another human using AI. “AI doesn’t use itself” in most real use cases; there is almost always a human close to it.

This is why he says benchmarks make AI look more autonomous than it is. A benchmark can show that a model performs a task once the task has been isolated. It does not necessarily show that the model would know which task matters, how to reframe it, when to refuse the assigned frame, or how to bear responsibility for the system after the patch lands. For Shipper, those are not marginal details. They are much of the work.

The new bottleneck is coherence, not capacity

The number of pull requests at Every and at large model companies is skyrocketing, according to Shipper. Nontechnical employees — people in consulting, ops, or editorial roles — can now make pull requests. That is a meaningful expansion of who can do technical work.

But increasing capacity in one part of a system creates pressure elsewhere. If more people can build, the scarce question becomes not “can we build it?” but “should this go into the product, and how does it fit with the rest of what we have built?” Shipper says technical people, and often product people, become responsible for maintaining a coherent whole. They decide which PRs to merge, how to integrate them, and what to delete. He points to Anthropic’s handling of Claude Code as an example of deletion discipline: in his telling, they remove things to keep the product from becoming bloated.

Rachitsky raises a related confusion he hears from workers: if engineers can design, PMs can code, and marketers can ship changes, what is anyone’s job anymore? Shipper agrees the confusion is real. Every has a culture of generalists who like having “their fingers in a lot of different pots,” but he expects role boundaries to settle into a new normal. Marketers will still do marketing, even if touching the website becomes part of marketing. Generalists can get further than before, especially in smaller companies, but that does not eliminate specialization.

One role Shipper thinks becomes newly essential is the forward-deployed engineer. His argument returns to the same premise: agents need humans. Model companies, he says, have internal agents and teams of people who run them. He does not expect those teams to disappear. Models will become more powerful, agents will proliferate, and people will still manage the systems around them.

At Every, he says one AI engineer, Nitesh, fits this forward-deployed profile. He spends much of his time talking in Slack to Claudi, an internal agent that runs Every’s consulting practice. He uses code and tools like Claude Code, but much of the job is conversational and operational: asking the agent why it did something, correcting it, improving its behavior, and making sure it serves the organization. Shipper says some engineers love this because it keeps them close to the latest tools and asks them to build a “being” inside a workspace rather than traditional software alone.

Rachitsky describes this as “babysitting” agents. Shipper pushes back on the framing. Babysitting makes the work sound passive: waiting for the agent to fail and fixing it. He sees the more interesting version as an engineering challenge: building systems that let less technical people safely do work that used to require technical expertise.

The data-science example makes the point. Rachitsky says he has heard from a data scientist whose team no longer primarily does analysis; everyone else now runs analyses and sends results, leaving data scientists to review bad work. Shipper says that is a real problem, but also a sign that the organization lacks the right agent system. He says at least one large model company has a data-science bot hooked into the data warehouse, aware of permissions, and maintained by a team. Employees can ask basic questions through the bot, and the responsible team continually improves its accuracy. Without that team, data scientists “would hate their lives.” With it, the basic requests are filtered away.

The pattern is not fewer experts. It is experts plus systems that absorb shallow work, while experts move toward harder and more generative work. The forward-deployed engineer is important because someone has to build that system, not just wait for individual agents to fail. In Shipper’s version, the role combines technical judgment, organizational context, agent operations, and the ability to translate messy employee needs into reliable AI-enabled workflows.

Rachitsky asks which product or technology role has changed least so far. Shipper’s first answer is CEOs and investors, or possibly middle managers, because AI use can still appear optional for people who mostly direct work rather than produce artifacts. But he argues that this is misleading. In his experience, a company “only goes as far as your CEO goes in AI,” because the leader needs hands-on intuition and cannot simply delegate the transformation. Sales is the other candidate he names, because so much of it remains human and relational, even though AI is already useful for research, sourcing, and qualification.

His recruiting example shows the kind of leverage he means. When Every was hiring a head of learning and development, Shipper had an intuition that someone from General Assembly in New York who had moved into AI could be a strong fit. He typed that into Codex, went off to do something else, and Codex returned someone he described as nearly perfect: a former General Assembly instructor, AI-oriented, and already following him on Twitter. He sent a direct message and had dinner with the person. The work of recruiting did not vanish, but a research step that would have taken much longer became a query he could delegate.

AI-generated writing becomes normal when the sender stands behind it

Shipper expects people to read far more AI-generated writing in documents and emails — and to like it. The important distinction, for him, is not whether AI touched the prose. It is whether the person sending it understands and stands behind it.

He says this is already accepted in coding. Plan documents are often AI-written, and he does not want engineers hand-writing them from scratch if the model can produce a better first pass. He expects the same pattern to spread across strategy documents, planning memos, email, and guides.

Every’s quarterly planning process is his example. At the end of 2025, the company used Notion agents. Employees talked to an agent about the prior year, goals, metrics, future plans, and how their work related to the company strategy. The agent pushed back and produced strategy reports or quarterly plans for each part of each team. Shipper says those AI-generated documents helped him see which teams needed to talk to one another, which plans were high or low quality, and where alignment was missing.

Slop, in his view, is writing that took the sender less time to make than it takes the reader to read, or that the sender cannot defend line by line. If someone sends an AI-generated document and then clearly does not know what is in it, that is unacceptable. If the person used the model to produce a strong document under their direction and can discuss it fluently, Shipper sees no problem.

That standard is stricter than a disclosure rule and more practical than a ban. The sender is accountable for the content, not for the keystrokes. A strategy memo generated through a careful back-and-forth with an agent may be better than a human-only memo from someone who is not a strong writer. But if the writer cannot explain the assumptions, defend the claims, or answer questions about the implications, the document fails regardless of how polished it looks.

He says much of his own email is written by GPT-5.5 and Codex. In one case, he asked Codex to send an email to an investor; it sent the email without asking him first. He panicked, checked the sent message, and found it was exactly what he would have sent. Email, he says, is often rote enough that he wants to decide what it should say but does not always care about composing the exact sentences.

He does not extend that claim to all writing. As a writer and publisher, he says human writing remains important. Every publishes a mix of human and AI writing and labels it. Sometimes an AI co-author is useful. But he thinks reflexive aversion to AI writing is “silly,” especially for internal business documents where most people are not strong writers and the baseline is low.

For external guides, Shipper adds another reason AI-assisted writing can be useful: the audience may include both humans and agents. A human reader needs the story, core ideas, and judgment. An agent can ingest thousands of pages and later operationalize the details when the user is doing pricing, planning, or another task. In that world, long informational work becomes more useful if it is structured so agents can absorb and apply it.

The publishing implication is not simply “write more with AI.” It is that documents may need to serve two audiences at once. Humans need narrative, priorities, and the reason something matters. Agents need detail, structure, and enough context to apply the material later. Shipper’s view is that AI-generated or AI-assisted work can be more valuable when it helps ideas move from text into action.

PMs and full-stack designers gain leverage because taste and judgment become scarcer

Shipper is “super, super bullish” on product managers. His evidence is internal: Marcus, the PM who runs Spiral, Every’s writing app. Marcus had previously run Axios’s writing product, took time off, became highly AI-oriented, and learned to use Cursor and later Claude Code. Shipper describes him as “lightly technical” — able to understand concepts like database migrations and inspect code when necessary, but not someone Every could have hired for the same building role a year earlier.

Coding models changed that. Marcus can combine enough technical understanding with strong product sense, writing judgment, and user empathy. Shipper says Marcus ships faster than almost anyone on the team because he can interpret user conversations, form a product story, identify issues, and build changes without organizing a large team around each step. Marcus feels “liberated” because he can just do the thing.

Lenny Rachitsky connects this to a product-management thesis he has long held: if building becomes cheaper, the scarce skills are deciding what to build, identifying the problem, and judging whether the solution is good. Shipper agrees.

Full-stack designers are the second group Shipper expects to become unusually powerful. Designers often know exactly how an interaction should look and feel, but historically depended on engineers to implement it. With AI coding tools, they can build more of it themselves. That matters because default AI-generated design often looks samey. Rachitsky notes that even with AI design tools, outputs can be recognizable as “definitely Claude Design.” Shipper’s point is that designers can make work that looks different and then ship it directly.

At Every, he says designers now make pull requests. Sometimes they still hand work off, but often “the thing is built and that’s it.” This can improve company workflow and create entrepreneurial opportunities for designers who previously had ideas but could not produce software independently.

The deeper reason PMs and designers gain leverage is the same reason Shipper rejects the job-apocalypse story. Models make “yesterday’s human competence” cheap. They ingest what has already been done and make it easy for anyone to deploy. Landing pages, tweets, code, and basic analyses proliferate. Because many people use the same models in default ways, the output starts to look the same and becomes commoditized.

Human value shifts to using that frozen competence to create something new, specific, and interesting. Shipper thinks models will structurally trail the people doing that, partly because of how models are trained and partly because model companies have incentives to make them compliant and aligned. New human expertise eventually gets absorbed into models, but that creates the next frontier rather than ending the human role.

For PMs, that means product sense becomes more direct leverage. A PM who understands users, has taste, and can operate the tools can move from insight to implementation without waiting for every step of a traditional team process. For designers, it means aesthetic and interaction judgment become more valuable precisely because the default outputs are easier to produce. When everyone can generate something acceptable, the person who can make something distinctive has more room, not less.

The AI job apocalypse is not Shipper’s base case

Shipper does not deny that companies are reorganizing. He says some AI-related reorganizations make sense. But he suspects that some companies attributing layoffs to AI are using it as a convenient explanation for overhiring or poor performance. The mass unemployment scenario discussed by some AI CEOs is, in his view, unlikely.

His argument is empirical and structural. Empirically, Every adopted AI aggressively and hired more people. Structurally, new model capabilities create more work at higher levels of abstraction. If everyone can generate code, the demand for engineers does not disappear; engineers are needed to decide how code should enter the codebase, what is slop, what should be rewritten, and how systems should be designed. If everyone can run analyses, data scientists are needed to build systems that make basic analysis safe and to answer harder questions. If everyone can produce documents, leaders and PMs need to judge quality, alignment, and implications.

He also sees continuity beneath the transformation. SaaS continues. Slack continues. Email continues. People still coordinate with each other. But every role is being changed at the edges and sometimes at the core: engineers write less raw code, PMs may produce software, designers ship pull requests, and internal documents may be AI-generated.

Rachitsky summarizes the tension as “so much is not changing” and yet “every role transformed.” Shipper agrees. He says the future often looks, from a distance, like either dragons beyond the horizon or utopia beyond the horizon. Once people arrive there, they find “some really cool things, some not cool things, and it’s just another horizon.” His stance is that AI changes a great deal, but not always in the apocalyptic or utopian ways people imagine before they live through it.

That is not a complacent forecast. Rachitsky presses the point that workers may hear “no job apocalypse” as permission to keep operating the same way. Shipper’s answer is that people still need to change. The risk is not that every job vanishes at once. The risk is that a person keeps working as though the models do not exist while the work surface, the expected output, and the competence baseline move around them.

Shipper’s formulation is that models commoditize yesterday’s competence. They make what humans already did cheaper and easier to reproduce. That can be threatening to people whose work is defined entirely by yesterday’s repeatable outputs. But it also creates a new layer of work: deciding what the cheap competence should be used for, how it should be combined, where it is wrong, and what new thing should be made from it.

The practical advice is to ride the models

Shipper’s advice for staying relevant is concise: “ride the models.” By that he means using the latest models for whatever work a person actually does, trying new releases as they come out, and looking for newly possible workflows instead of avoiding the tools out of fear.

The only thing you need to do is ride the models. And that means use them for whatever it is that you do.

Dan Shipper

He acknowledges that fear is rational. AI threatens established competence and makes people wonder whether their job will remain. But he argues that using the models extends a person’s powers and keeps them part of the changing workflow. Avoiding them increases the risk of being left behind.

For a PM at a large company, Shipper says the challenge may be organizational: some companies prevent employees from using the newest models. In that case, people may need to experiment on their own time. His method with new models is play. He repeatedly tests tasks models could not previously do — “turn the rock over again” — because a new release may change the answer. The senior-engineer benchmark is one example: a task that produced weak results became much stronger after a model jump.

He emphasizes that the edge of AI is not only in San Francisco. Model companies build the systems, but they do not know every use case. The edge appears “wherever AI meets a real human doing something.” Each new model gives practitioners a chance to discover uses in their own domain. Every is in Brooklyn, and Shipper argues it can still be ahead of many people in San Francisco because the company applies models across real work.

Rachitsky adds that one unusual feature of the current AI wave is broad access: outside the model labs, many people can use the newest systems soon after release, if they can pay for access. Shipper agrees and says this matters. If AI had been invented in a different enterprise-software culture, he suggests, the most advanced intelligence might have been expensive and restricted to large companies using it in uninteresting ways. Instead, the Silicon Valley impulse to make intelligence broadly available has given many workers the chance to experiment near the frontier.

Shipper’s concrete recommendations are to try workflows in Codex or Cowork, use AI work surfaces as the place where apps run, experiment with agent products such as OpenClaw and Hermes, and get comfortable with both modes: the shared delegated agent and the local work-surface agent. Most importantly, he says people should try to have fun. FOMO and fear are poor discovery mechanisms. Enjoyment leads people to explore more deeply and find useful applications.

Rachitsky adds a related formulation from a prior guest: find a “moment of joy” with AI — a moment where the tool does something surprising and useful enough that the person wants to keep building. Shipper agrees. His final request of listeners is not to argue abstractly about AI, but to “put their hands in it,” find good uses in their own lives, and share what they learn.
