AI-Native Services Firms Can Turn Labor Markets Into Software-Margin Businesses

Charlie Warren | Y Combinator | Wednesday, June 3, 2026 | 8 min read

YC’s Charlie Warren argues that AI-native services companies are not copilots for existing firms but services businesses rebuilt so AI performs much of the work and customers buy the outcome directly. In his Startup School talk, Warren says the venture-scale opportunity is in outsourced, outcome-oriented markets such as legal services, tax, insurance, audit, regulatory support and healthcare, where AI operating leverage could push services margins toward software-like levels. His test is whether founders can control variance, reduce COGS, price on value and design the process itself as the product.

The test is whether services margins can move toward software margins

AI-native services companies are not copilots sold into existing firms. Charlie Warren defines the category as services businesses rebuilt from scratch so that AI does most of the work and the customer buys the outcome directly. The relevant markets are work now performed by people-heavy firms: legal services, tax, audit, insurance, FDA regulatory support, mortgages, parts of healthcare, logistics, and similar categories.

The financial premise is what makes the category venture-scale. Traditional services firms, Warren says, top out around 30% margins. Pure software and agent companies can have higher margins, but often in smaller markets. The bet for AI-native services is that “AI operating leverage” can push a services company closer to software-like margins — “say 50% plus” — while operating in markets that may be two to three times larger than software markets.

50%+: the margin level Warren says AI operating leverage should plausibly approach

That leverage has to show up in the profit and loss statement. Warren reduces the structure to the basics: revenue minus cost of goods sold equals gross profit; gross profit minus operating expenses equals operating income. Put another way, operating income is revenue minus COGS minus OPEX. But for these companies, the details determine whether the business is a startup with compounding leverage or simply a labor business with AI tooling.

Line item	Warren’s emphasis
Revenue	Contracts may be easy to sign; repeated delivery is the hard part.
COGS	Model costs, hosting costs, and humans in the loop each need an owner, a number, and a trendline.
OPEX	R&D, sales, and general administrative costs still matter in the usual way.
Operating income	AI services companies will be judged on it faster than founders may expect.

Warren’s operating lens for AI-native services companies

COGS is the line to obsess over from day one. Model costs, hosting costs, and human labor all need explicit measurement. Zero-margin or negative-margin pilots can be useful for learning, but Warren warns founders not to get hooked on them. The core bet is that as the product improves, COGS falls and gross margin rises. Founders do not need to reach software-like margins immediately, but the trajectory has to be believable.
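The P&L arithmetic above can be sketched in a few lines of Python. This is only an illustration: the `Period` class, its field names, and every dollar figure are hypothetical and do not come from Warren's talk. It shows COGS split into the three components he names and checks whether gross margin is rising from period to period.

```python
from dataclasses import dataclass


@dataclass
class Period:
    """One reporting period. All figures are hypothetical, in dollars."""
    revenue: float
    model_cost: float    # model inference or API spend
    hosting_cost: float  # infrastructure
    human_cost: float    # humans in the loop
    opex: float          # R&D, sales, general and administrative

    @property
    def cogs(self) -> float:
        # COGS is the sum of the three components tracked separately.
        return self.model_cost + self.hosting_cost + self.human_cost

    @property
    def gross_profit(self) -> float:
        return self.revenue - self.cogs

    @property
    def gross_margin(self) -> float:
        return self.gross_profit / self.revenue

    @property
    def operating_income(self) -> float:
        # Equivalent to revenue - COGS - OPEX.
        return self.gross_profit - self.opex


# Hypothetical quarters: revenue grows faster than human cost.
quarters = [
    Period(revenue=100_000, model_cost=10_000, hosting_cost=5_000,
           human_cost=75_000, opex=60_000),
    Period(revenue=200_000, model_cost=18_000, hosting_cost=8_000,
           human_cost=110_000, opex=80_000),
    Period(revenue=400_000, model_cost=30_000, hosting_cost=12_000,
           human_cost=150_000, opex=120_000),
]

margins = [q.gross_margin for q in quarters]
# The trajectory test: each period's gross margin beats the previous one.
margin_is_rising = all(a < b for a, b in zip(margins, margins[1:]))

for q in quarters:
    print(f"margin={q.gross_margin:.0%} operating_income={q.operating_income:,.0f}")
print("gross margin rising:", margin_is_rising)
```

In this made-up series the first quarter is barely above break-even on gross margin, which is tolerable only because the later quarters show the trend moving in the right direction.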

The right market is already buying the outcome

The market-selection rule starts with minimizing behavior change. The preferred markets are ones where the work is already outsourced and the budget already exists. The startup displaces a vendor rather than persuading a customer to adopt a new workflow.

That is what Warren means by “low trust”: not that the work is unimportant, but that the buyer cares about the final product more than the vendor’s internal method. The customer wants the outcome. The AI-native company rebuilds how that outcome is produced.

The best markets share four traits. First, the work is already outsourced and outcome-oriented. Second, task-level judgment is low in most places: the workflow can be decomposed so that many steps are automatable, with human judgment concentrated at specific points. Third, the overall service still has a high intelligence threshold, meaning models plus humans are required to produce work the customer will accept. Fourth, regulation can help rather than hurt, because legal accountability and higher expectations raise the bar for competitors.

His example is Panacea, a YC company providing FDA regulatory services for biotech and medtech customers. The company’s homepage, as shown on screen, says it combines an “AI FDA regulatory toolset” with experienced in-house 510(k) experts, and describes support for pathways including 510(k), PMA, IND, NDA, and 513(g). Warren’s point is not that regulation disappears. It is that regulated services can be rebuilt around AI-assisted experts while still delivering the accountable outcome the customer needs.

Better models should strengthen the service, not erase it

Warren’s market test is whether better models make the company stronger or make the company unnecessary. He calls this the “Sam Altman test”: as models improve, does the service gain leverage, or does the model itself commoditize the service?

The dangerous case is a company that exists only because current models are not yet good enough. If the model alone can soon do the job, the service has no durable role. The desirable case is one where the company becomes more capable as frontier models improve.

That distinction is also why founders cannot use humans merely to cover for product gaps. Human-in-the-loop work can be part of a massive technology business, but founders need to be honest about why the humans are there. If they are exercising necessary judgment at critical points, the model-human system may be the product. If they are compensating for weak automation, the company is using labor to mask unfinished product work.

Markets involving equipment and onsite labor are a separate warning. Those can be good businesses, but the “software margin math” does not apply when a company owns and operates physical things. In Warren’s view, that makes real leverage much harder and puts the opportunity closer to robotics than to AI-native services.

The product is an operation

AI-native services require a different founding-team mix from ordinary software startups. Warren says the usual advice still applies: founders should work with people they know and trust, and solo founders should recruit the best people they have worked with before. But the best teams in this category combine domain fluency, model fluency, and operational rigor.

Domain fluency matters because these companies often sell into skeptical, regulated, professional-services markets. Direct experience is best, Warren says, but learned expertise can work if the founders can “bleed credibility.” Model fluency matters because founders need to understand what frontier models can do today and design the company to benefit as models improve. Warren is explicit that “there is no substitute for great tech here.”

The third requirement is less glamorous: operational rigor. These businesses run on variance, throughput, cycle times, standard operating procedures, staffing models, and process design.

“The product is an operation.” (Charlie Warren)

General Legal, another YC-backed example, illustrates the mix Warren wants. The company describes itself on screen as “The AI-Native Law Firm for the Growth Stage,” offering “BigLaw quality contract review” with turnarounds “in hours, not days,” flat-fee pricing, and service “inside your Slack.” Warren points to the founders’ law firm experience at Cooley and Fenwick and technical leadership at Casetext, but he emphasizes their operating model: they use shift work to reduce cycle times and attract strong lawyers, treating staffing design as part of the service architecture.

Variance kills faster than slowness or price

For AI-native services, the human is the interface to the customer, while the internal product helps that human scale work nonlinearly. Warren says that inversion changes what counts as a product metric. Throughput and cycle time have to be tracked with the seriousness a software company might give to daily active users.

The central operating risk is variance: non-uniform outputs from the actual service. Warren argues that customers will fire an AI-native services company for inconsistency faster than they will fire it for being somewhat slower or more expensive than incumbents. A buyer can tolerate some cost or speed disadvantage if the output is reliable. Inconsistent work destroys trust, and trust loss causes churn.
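One rough way to put a number on variance is the coefficient of variation of a delivery metric such as turnaround time. The sketch below is an assumption about how a team might measure it, not a method Warren describes, and the two turnaround series are invented. It shows that a service can be faster on average and still far less consistent.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (unitless spread)."""
    return pstdev(values) / mean(values)


# Hypothetical turnaround times in hours for five jobs each.
steady = [10, 11, 9, 10, 10]   # slower on average, but predictable
erratic = [2, 3, 20, 2, 3]     # faster on average, with one bad outlier

cv_steady = coefficient_of_variation(steady)
cv_erratic = coefficient_of_variation(erratic)

print(f"steady:  mean={mean(steady):.1f}h  cv={cv_steady:.2f}")
print(f"erratic: mean={mean(erratic):.1f}h  cv={cv_erratic:.2f}")
```

The same calculation could be applied to a quality score or an error rate; turnaround time is used here only because it is easy to picture.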

That is why “humans in the loop” cannot simply scale linearly with revenue. If every additional dollar of revenue requires a proportional increase in staff, the company has not created the leverage the model depends on. The humans also have to like the software, because they are its users. Early unscalable work is acceptable for learning, but the endpoint is clear: automating the process is the product.
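The linear-scaling concern can be expressed as a simple ratio: staff growth divided by revenue growth over the same period. The function name `staffing_elasticity` and all figures below are hypothetical, chosen only to illustrate the idea. A value near 1 means headcount is growing in step with revenue; a value well below 1 suggests the software is carrying more of the work.

```python
def staffing_elasticity(rev_before, rev_after, staff_before, staff_after):
    """Percent growth in staff divided by percent growth in revenue."""
    revenue_growth = (rev_after - rev_before) / rev_before
    staff_growth = (staff_after - staff_before) / staff_before
    return staff_growth / revenue_growth


# Hypothetical year: revenue doubles in both scenarios.
linear = staffing_elasticity(1_000_000, 2_000_000, 10, 20)   # staff also doubles
levered = staffing_elasticity(1_000_000, 2_000_000, 10, 13)  # staff grows 30%

print(f"linear scaling:  {linear:.2f}")
print(f"levered scaling: {levered:.2f}")
```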

This operating reality changes early sales. Warren warns against the “early demand trap”: it can be easy to sign many pilot customers before the product exists, but serving them can consume the team and prevent the product from being built. His advice is to cap the first pilots at a small handful.

In those early accounts, the pilot itself is the product. Founders should avoid standardizing too early and instead use pilots to find where AI creates unusual leverage versus where the company is merely automating an obvious step. The lesson from those pilots should feed product development quickly.

Pricing should follow value, not the founder’s cost structure

Because AI-native services compete with labor rather than with other software tools, pricing is harder than in traditional SaaS. The buyer is comparing the startup against internal staff, outsourced vendors, or professional-services firms.

Warren presents per-unit pricing as the cleanest option: per return, per claim, per loan, or another unit of work. It is simple to understand and maps naturally to service delivery. Outcome-based pricing can align incentives well, but it is harder for the company to forecast. Panacea, for example, prices on the completed consultant study rather than the industry norm of hourly work.

The pricing strategies to reject are cost-plus pricing and pure undercutting. Cost-plus pricing permanently caps upside because the company prices from its own cost base rather than the value delivered. Straight-line undercutting can make the work look cheap or lower-quality. The instruction is simple: price on value.
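A small worked example shows why cost-plus pricing caps upside. Everything here is an illustrative assumption: the 30% markup, the idea of pricing at a fixed share of customer value, and all dollar amounts are hypothetical rather than figures from the talk. As unit cost falls with better automation, the cost-plus profit per unit shrinks with it, while the value-based profit per unit grows.

```python
def cost_plus_price(unit_cost, markup=0.30):
    """Price set from the seller's own cost base (hypothetical 30% markup)."""
    return unit_cost * (1 + markup)


def value_price(customer_value, share=0.6):
    """Price set as a share of what the outcome is worth to the buyer."""
    return customer_value * share


unit_costs = [400, 250, 100]  # hypothetical COGS per unit, falling over time
customer_value = 1_000        # hypothetical value of one delivered outcome

cost_plus_profits = [cost_plus_price(c) - c for c in unit_costs]
value_profits = [value_price(customer_value) - c for c in unit_costs]

for c, cp, vp in zip(unit_costs, cost_plus_profits, value_profits):
    print(f"unit cost {c}: cost-plus profit {cp:.0f}, value-based profit {vp:.0f}")
```

Under these assumptions, every dollar of cost reduction is kept by the company under value-based pricing, while most of it is passed to the customer under cost-plus pricing.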

Buying a services firm usually brings the wrong assumptions

Warren cautions founders not to buy an existing services business as a shortcut to revenue and then add AI on top. He sees the temptation, especially for founders with operating backgrounds, but calls it generally a trap.

There is one exception he considers reasonable: buying may make sense if the company needs a regulatory moat quickly, such as insurance licensing. Otherwise, the objection is that founders cannot acquire product-market fit. Legacy services firms come with legacy expectations around metrics, hiring, performance, and customer delivery. Adding AI does not immediately change those realities.

The build-versus-buy preference follows from the rest of his argument. An AI-native services company is not just a services firm with a model attached. It is a company whose process, staffing, software, quality control, pricing, and P&L are designed together from the beginning. In Warren’s closing formulation, founders need to treat “the process as the product and the product as the process.”
