Orply.

Block’s Autonomous Engineering Push Moved AI From Code Generation to Delegation

AI Engineer · Sunday, June 28, 2026 · 10 min read

Angie Jones of the Agentic AI Foundation argues that turning a conventional engineering organization into an autonomous one requires far more than buying coding agents or frontier models. Drawing on her work leading AI enablement at Block, she says the shift depended on changing repositories, delegation workflows, review systems, compute environments, and organizational adoption strategy so engineers could move from AI-assisted coding to agent-led delivery. Her account ends with an unresolved question: whether the same transformation that made software work more autonomous also helped make workers more replaceable.

Autonomous engineering did not begin with better code generation

Jones describes the central problem bluntly: Block had widespread AI tool usage, large token bills, and no corresponding acceleration in shipping. By her account, roughly 90% of Block’s 3,500 engineers were regularly using tools such as Goose and Claude Code to generate code. On paper, that looked like deep AI adoption. To the CEO, it looked like engineering “wasn’t using AI at all,” because features were not reaching customers any faster.

Jones says both views were true. Engineering was using AI. But usage was still mostly inside the IDE: questions, autocomplete, boilerplate, and isolated assistance. Code generation had improved, but code was not the bottleneck.

That diagnosis shaped the rest of the work. Jones separates AI enablement into three phases: experimentation, adoption, and impact. Block had moved past experimentation because most engineers were already using AI. It had not reached impact because AI had not been integrated into the way software was designed, delegated, reviewed, and shipped.

Her working definition of an “agentic engineering org” was therefore not an organization where engineers occasionally used AI tools. It was an organization where engineers used AI agents as their primary means of producing engineering outcomes. That meant treating agents as core collaborators: decomposing problems, delegating work, reviewing and verifying output, and directing agents as the default mode of operation.

Code is not the bottleneck.

Angie Jones

Jones had spent the first half of 2025 leading AI enablement for Block’s entire company: 12,000 employees across marketing, design, finance, legal, and other functions. When the CTO tasked her with building an agentic engineering organization, she says there was no playbook. She went looking for one in other companies’ blogs and found mostly admissions that everyone was making it up as they went along.

Block’s early position mattered. Jones says the company had been building Goose, its internal coding agent, even before LLMs supported tool calling. Block also worked with Anthropic as design partners for the initial release of MCP, and Goose became the reference implementation for the MCP client. Internally, some of Block’s most curious engineers were among the earliest users of this class of coding agent.

But early access did not produce organizational transformation by itself. The harder work was changing the engineering system around the agents.

The maturity model measured delegation, not tool usage

To make the transformation operational, Jones built a maturity model for how engineers relate to AI agents. She says an earlier version existed in Q3 of the prior year, and that Steve Yegge’s “Gas Town” article helped her reorganize it.

The model has six stages:

| Stage | Label | Engineer-agent relationship |
|---|---|---|
| 0 | Unengaged | The engineer does not use AI tools in the workflow. |
| 1 | Assisted | The engineer uses AI for autocomplete or similar assistance, but not agent mode. |
| 2 | Conversational | The engineer chats with agents, but does not use them to produce pull requests. |
| 3 | Directed | The engineer delegates tasks to agents and reviews the output. |
| 4 | Parallel | The engineer runs multiple agents in parallel. |
| 5 | Autonomous | The engineer delegates complete tasks, and the agent can produce shippable results without ongoing human guidance. |

Jones’s AI maturity model for engineering work, adapted from Steve Yegge’s Gas Town article.
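Because the stages are ordered by how much work is delegated, they lend themselves to an ordered type. The sketch below is illustrative only: the `WorkflowSignals` fields and the classification rules are assumptions read off the stage descriptions above, not instrumentation Jones describes Block using.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stage(IntEnum):
    """The six stages of the maturity model, ordered by degree of delegation."""
    UNENGAGED = 0
    ASSISTED = 1
    CONVERSATIONAL = 2
    DIRECTED = 3
    PARALLEL = 4
    AUTONOMOUS = 5


@dataclass
class WorkflowSignals:
    """Hypothetical observations about one engineer's workflow."""
    uses_ai_tools: bool = False
    chats_with_agents: bool = False
    agent_authored_prs: bool = False
    concurrent_agents: int = 0
    ships_without_guidance: bool = False


def classify(s: WorkflowSignals) -> Stage:
    """Return the highest stage whose description the signals satisfy."""
    if not s.uses_ai_tools:
        return Stage.UNENGAGED
    if s.ships_without_guidance:
        return Stage.AUTONOMOUS
    if s.concurrent_agents > 1:
        return Stage.PARALLEL
    if s.agent_authored_prs:
        return Stage.DIRECTED
    if s.chats_with_agents:
        return Stage.CONVERSATIONAL
    return Stage.ASSISTED
```

Using an `IntEnum` keeps the stages comparable, so a check such as `classify(signals) >= Stage.DIRECTED` expresses "this engineer is delegating" directly.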

By Jones’s assessment, most of Block’s engineers were between stages one and two by the end of the first half of 2025. They were using AI, but primarily as assistance or conversation. The goal was stage five: complete-task delegation with shippable results and minimal hand-holding.

The difficulty was not simply training. Jones emphasizes three constraints. First, the work was highly experimental; there was no established playbook. Second, tools and models were changing so quickly that a useful practice one week could be obsolete the next. Third, many engineers were already fatigued by top-down pressure from leadership to “AI or die.”

Her response was not to try to level up all 3,500 engineers individually. She used the “1/9/90 rule” from digital communities: about 1% create, 9% interact, and 90% passively consume. Jones argues that the pattern maps closely to AI adoption in engineering. A small group will deeply explore agentic patterns and discover effective techniques. A somewhat larger group will tinker. Most people will not spend extra cycles figuring it out for themselves.

That led to a strategic choice: focus on forming the 1% power users, not on mass self-improvement. Jones created an AI Champions program with about 50 engineers across the company — “actually ~1.43%,” as her slide noted — selected from critical teams and repositories.
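The arithmetic behind the slide note is easy to check. The snippet below applies the 1/9/90 rule of thumb to the 3,500-engineer headcount and computes the Champions' share; the function name and the rounding choice are illustrative assumptions.

```python
ENGINEERS = 3_500   # engineering headcount cited in the talk
CHAMPIONS = 50      # approximate size of the AI Champions program


def split_1_9_90(population: int) -> dict[str, int]:
    """Apply the 1/9/90 rule of thumb to a population size."""
    creators = round(population * 0.01)
    tinkerers = round(population * 0.09)
    return {
        "creators": creators,
        "tinkerers": tinkerers,
        "consumers": population - creators - tinkerers,
    }


champion_share = CHAMPIONS / ENGINEERS * 100

print(split_1_9_90(ENGINEERS))   # {'creators': 35, 'tinkerers': 315, 'consumers': 3150}
print(f"{champion_share:.2f}%")  # 1.43%
```

A strict 1% would have been 35 engineers, so a cohort of about 50 sits slightly above the rule's "creator" tier.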

This was not a volunteer program. Jones says she needed engineers who could dedicate at least 30% of their time to AI enablement, who would not abandon the work when AI behaved non-deterministically, and who represented the most important codebases. She spent a week talking with tech leads and managers to identify them.

The first scaling move was to make repositories agent-readable

The first target for the Champions was not a new model or a new IDE workflow. It was the repository.

Jones says that in June 2025, models were good enough to write a feature, but not reliable enough to follow a team’s conventions and standards. Developers did not trust agents enough to delegate work. Her theory was that if engineers embedded AI guidance directly into repos, agents would perform better and the benefits would extend beyond the Champions.

That made the repo the leverage point. Repositories are already the shared reference for engineers contributing code. If the repo contains the context, rules, and workflows agents need, every engineer who works there benefits from the preparation done by the power users.

Jones describes this as turning repos into AI-friendly systems. The standard components included:

  • context files such as agents.md or claude.md, which give agents guidance on the repository;
  • rules files, which define guardrails;
  • repeatable AI workflows, including slash commands and later agent skills;
  • AI code review, preferably configured with instructions about what matters in that codebase;
  • AI attribution on pull requests.

The implementation was intentionally not one-size-fits-all. Block’s Champions came from Square, Cash App, Afterpay, Tidal, and across frontend, backend, mobile, data, and infrastructure. They worked in large legacy monorepos, smaller services, and mobile apps.

Jones says that variety was important because it pressure-tested patterns across different engineering realities. Monorepos created distinct challenges. For JVM developers already accustomed to inheritance patterns, the team put shared context and rules at the root and layered more specific guidance at the service level. Web approaches did not necessarily work for mobile. Android and iOS sometimes needed different approaches from each other.
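The root-plus-service layering described for monorepos can be sketched as a walk from the repository root down to the directory an agent is working in, collecting guidance along the way. This is a minimal illustration, not a description of how Goose or any other agent actually resolves context: the function name is hypothetical, and it assumes every level uses the same `agents.md` file name.

```python
from pathlib import Path

CONTEXT_FILENAME = "agents.md"  # one of the context file names mentioned above


def layered_context(repo_root: Path, working_dir: Path) -> str:
    """Concatenate context files from the repo root down to working_dir.

    Root guidance comes first and more specific guidance comes last, so a
    service-level file can refine what the root says.
    """
    root = repo_root.resolve()
    target = working_dir.resolve()
    relative = target.relative_to(root)  # raises ValueError if outside the repo

    # Build the chain of directories: root, root/a, root/a/b, ...
    chain = [root]
    for part in relative.parts:
        chain.append(chain[-1] / part)

    sections = []
    for directory in chain:
        candidate = directory / CONTEXT_FILENAME
        if candidate.is_file():
            label = candidate.relative_to(root).as_posix()
            text = candidate.read_text(encoding="utf-8")
            sections.append(f"# From {label}\n{text}")
    return "\n\n".join(sections)
```

Ordering the sections from general to specific mirrors the inheritance intuition: shared rules are stated once at the root, and a service only writes down what differs.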

Jones let each Champion determine what worked for their repo, while allowing teams with similar shapes and constraints to converge naturally on similar tools and patterns. She says engineers liked this because it avoided a top-down mandate and preserved local judgment.

This moved the Champions toward stage three: directed delegation. Agents were beginning to write pull requests, and engineers were reviewing the output. But the effect still had limits. Jones says not enough engineers outside the Champions had reached that level, and even the Champions were still “babysitting” the agents.

Delegation worked when it happened inside existing work systems

The next scaling problem was interface. Engineers receive work in issue trackers, GitHub issues, and Slack. Jones wanted agents to accept delegation from all three places, so that using them would feel native rather than like a separate skill.

The Champions implemented delegation flows from Jira or Linear, GitHub issues, and Slack. Jones’s Slack example is the clearest demonstration of what changed.

An engineer noticed a product bug and asked in Slack whether others had seen it. A second engineer had not. A third engineer mentioned Goose directly in the Slack thread and asked whether it had seen the bug before and whether it could check. Goose went to the repository, pulled files, confirmed the bug, identified where it was, and returned three possible implementation options with code snippets in Slack. The engineers selected option one. The third engineer asked Goose to implement it. Goose returned with a pull request link.

Jones says the full cycle — discussion, diagnosis, issue creation, alignment, and fix — took about five minutes, all inside Slack.

Agents had become part of the sprint. Engineers could assign Linear and Jira tickets, as well as GitHub issues, to an agent and have it implement the work end to end. In one early use, Jones says the team ran out of work and had to pull in more tickets twice.
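One way to picture delegation from several surfaces is as a set of small adapters that normalize each request into the same task shape before an agent sees it. The sketch below is an assumption about structure, not Block's implementation: `DelegatedTask`, the adapter names, and the input dictionaries are all hypothetical, and the dictionaries are simplified stand-ins rather than the real Slack, Jira, Linear, or GitHub payload formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelegatedTask:
    """A surface-independent unit of work handed to an agent."""
    source: str        # "slack", "tracker", or "github"
    reference: str     # where the agent should report back
    repo: str
    instructions: str


def from_slack(event: dict) -> DelegatedTask:
    # Hypothetical shape: a mention in a thread, with the repo already resolved.
    return DelegatedTask("slack", event["thread"], event["repo"], event["text"])


def from_ticket(ticket: dict) -> DelegatedTask:
    # Hypothetical shape for a Jira- or Linear-style ticket assigned to an agent.
    body = f'{ticket["title"]}\n\n{ticket["description"]}'
    return DelegatedTask("tracker", ticket["key"], ticket["repo"], body)


def from_github_issue(issue: dict) -> DelegatedTask:
    # Hypothetical shape for a GitHub issue assigned to an agent.
    reference = f'{issue["repo"]}#{issue["number"]}'
    return DelegatedTask("github", reference, issue["repo"], issue["body"])


ADAPTERS = {"slack": from_slack, "tracker": from_ticket, "github": from_github_issue}


def delegate(surface: str, payload: dict) -> DelegatedTask:
    """Normalize a request from any supported surface into a DelegatedTask."""
    try:
        adapter = ADAPTERS[surface]
    except KeyError:
        raise ValueError(f"unsupported surface: {surface}") from None
    return adapter(payload)
```

The point of the shared shape is that everything downstream of `delegate` is identical regardless of where the request originated, which is what lets delegation feel native in each tool.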

The reason this did not require every engineer to become an AI expert was the repo preparation. The Champions had already laid the foundation that made agents effective inside those codebases. Delegation became accessible because it happened where engineers, product teams, and managers already worked.

After three months of the Champions program, Jones reported measurable movement:

| Measure | Reported change |
|---|---|
| AI-authored code | Up 69% |
| Reported time savings | Up 37% |
| Automated pull requests | Up 21x |

Results Jones reported three months after launching the AI Champions program.

At that point, she says she was comfortable saying Block was truly delegating work to agents. The organization was ready to pursue stage four: multi-agent parallelism.

Parallel agents shifted the bottleneck to review, machines, and coordination

Moving from delegation to parallelism was, in Jones’s words, “almost free” because the organization had become good at assigning work to agents. The new problems appeared immediately downstream.

The first was code review. Jones says engineers were tripling or quadrupling the number of pull requests they produced, but those PRs then sat waiting for review. Code review was already difficult to keep up with before AI increased PR volume.

Jones does not present this as fully solved. Her phrasing is narrower: Block “stopped the bleeding a bit.” The company had to use bots to help review bot-produced PRs.

Earlier in the process, AI code review had been optional because, in Jones’s assessment, the AI reviewers were poor enough to anger engineers. After the repo-readiness work, and with better models and tools, she says the results improved. She specifically gives a “shout out” to CodeRabbit and says Block enabled CodeRabbit on all repos.

Block also created an auto-fix loop. If CodeRabbit identified issues, another agent would automatically fix those issues and commit them to the PR. The practical benefit, Jones says, was that human reviewers were no longer complaining about sloppy bot PRs; by the time PRs reached them, they were in better shape.
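The shape of such a loop is simple: review, fix, review again, and stop when the reviewer is satisfied or a round limit is hit. The sketch below is a generic illustration under stated assumptions. It does not use CodeRabbit's API; the reviewer and fixer are passed in as plain callables, the round limit is an assumption, and the toy functions exist only to show the control flow.

```python
from typing import Callable, List, Tuple

Review = Callable[[str], List[str]]      # returns findings for a diff
Fix = Callable[[str, List[str]], str]    # returns a revised diff


def auto_fix_loop(
    diff: str, review: Review, fix: Fix, max_rounds: int = 3
) -> Tuple[str, List[str]]:
    """Alternate review and fix until no findings remain or rounds run out.

    Returns the final diff and any findings still open, which are what a
    human reviewer would then see.
    """
    findings = review(diff)
    rounds = 0
    while findings and rounds < max_rounds:
        diff = fix(diff, findings)
        findings = review(diff)
        rounds += 1
    return diff, findings


# Toy stand-ins for the reviewing and fixing agents.
def toy_review(diff: str) -> List[str]:
    return ["unresolved TODO"] if "TODO" in diff else []


def toy_fix(diff: str, findings: List[str]) -> str:
    return diff.replace("TODO", "done")
```

Bounding the rounds matters: if the fixing agent cannot satisfy the reviewer, the loop ends and the open findings are surfaced rather than retried indefinitely.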

The second problem was compute environment. Multiple agents working in parallel bumped into each other, and engineers’ laptops could not handle the load. Jones describes machines running out of memory and CPUs choking. Block moved to dedicated cloud workspaces, where each agent ran in its own isolated environment. That allowed agents to run in parallel and from anywhere.

As engineers began running four or five agents at a time, and as that number grew, a small group including many AI Champions began building an internal orchestrator called BuilderBot. Jones says BuilderBot was needed to reach an autonomous engineering org because individual agents were no longer the whole system; the work now required coordination among multiple agents operating across different environments and codebases.

The third problem was company-scale context. To build anything close to autonomy, Jones says agents had to understand where things lived and what depended on what. Block built what she calls a company world model based on the entirety of its 25,000-repository codebase: a machine-readable view of every service and how they connect.

That world model allowed BuilderBot and the agents it delegated to, in Jones’s description, to pull context as needed and understand the system landscape while implementing. Multiple agents could explore different parts of the system in parallel, build their own understanding, and return to the orchestrator. BuilderBot could then assemble a plan spanning multiple codebases. Jones says this was especially useful for offerings that crossed multiple products.
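A toy version of this pattern treats the world model as a dependency graph, fans exploration out in parallel, and orders the resulting plan so that dependencies come first. Everything named here is hypothetical: the service names, the `explore` stand-in for a sub-agent, and the choice of a thread pool and topological sort are assumptions for illustration, not a description of BuilderBot.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical world model: each service maps to the services it depends on.
WORLD_MODEL = {
    "checkout-web": {"payments-api", "catalog-api"},
    "payments-api": {"ledger"},
    "catalog-api": set(),
    "ledger": set(),
}


def affected_services(world, targets):
    """Return the targets plus everything they transitively depend on."""
    seen, stack = set(), list(targets)
    while stack:
        service = stack.pop()
        if service not in seen:
            seen.add(service)
            stack.extend(world.get(service, ()))
    return seen


def explore(service):
    """Stand-in for a sub-agent that inspects one codebase and reports back."""
    return f"notes on {service}"


def plan(world, targets):
    """Explore affected services in parallel, then order work by dependency."""
    scope = sorted(affected_services(world, targets))
    with ThreadPoolExecutor(max_workers=4) as pool:
        notes = dict(zip(scope, pool.map(explore, scope)))
    # Restrict the graph to the services in scope; dependencies sort first.
    subgraph = {s: world.get(s, set()) & set(scope) for s in scope}
    order = TopologicalSorter(subgraph).static_order()
    return [(service, notes[service]) for service in order]
```

Calling `plan(WORLD_MODEL, ["checkout-web"])` yields one step per affected service, with `ledger` ahead of `payments-api` and both dependencies ahead of `checkout-web`; the relative order of unrelated services is not guaranteed.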

With that, she says Block reached stage five. Engineers could delegate complete tasks, and agents could produce shippable results without human hand-holding.

The boundary also moved beyond engineering. Jones says anyone at the company could mention BuilderBot in Slack and have it fix a bug or implement a feature. They did not need GitHub access. For a moment, she says, “this felt like a dream.”

The achievement ended in unresolved questions about where autonomy leads

After describing the arrival at stage five, Jones shows a screenshot of a CNN Business article, attributed on-screen to Ramishah Maruf. The visible headline reads: “Block lays off nearly half its staff because of AI. Its CEO said most companies will do the same.” The article image shows Jack Dorsey.

Jones says that although all layoffs are tough, this one felt different. It made her ask whether the work she had led had contributed to the outcome: whether enabling employees to do “the most incredible work of their careers” ultimately resulted in their dismissal. The day before, she says, she had felt proud and in awe of the way the organization was working, believing she had successfully built an autonomous engineering org. Then the question became: “To what end?”

What are we doing? Where are we heading? And are we sure that it's where we want to end up?

Angie Jones

The tension is left unresolved. Jones’s account is operationally specific about how Block moved from AI usage to agentic delegation: power users, AI-friendly repositories, native delegation surfaces, automated review loops, isolated cloud workspaces, orchestration, and a company-scale world model. It is also explicit that technical capability did not answer the organizational and human questions that followed.
