Stripe Says Agent Payments Need Deterministic Controls, Not Browser Automation

Steve Kaliski | AI Engineer | Saturday, June 6, 2026 | 10 min read

Stripe’s Steve Kaliski argues that autonomous agents can use probabilistic reasoning to discover products, services and tools, but payments should move through deterministic infrastructure. In his talk, he presents Stripe’s approach to agent commerce: scoped payment credentials, HTTP-based paid tool calls and structured checkout APIs designed to prevent agents from paying the wrong merchant, buying the wrong item, authorizing the wrong amount or exposing the wrong credential.

Agents can discover probabilistically, but they should not pay probabilistically

Steve Kaliski drew the central line between the part of agent behavior where non-determinism is useful and the part where it becomes dangerous. Discovery and exploration benefit from LLM behavior: a model can search across a large corpus, recommend code, identify products, or find businesses. Credentials, payments, and checkout are different. They “require determinism.”

“Discovery and exploration benefit from non-determinism. Credentials, payments, and checkout require determinism.” (Steve Kaliski)

That distinction drives the payment architecture he described. The problem is not simply that agents need to spend money. Kaliski’s premise is that they already do. When a developer uses Claude Code, Cursor, or another LLM-backed application, the agent is consuming tokens, and those tokens are economically meaningful. The spend may be mediated through a subscription or translated from input and output tokens into dollars, but the agent is already acting within an economic system. The narrower question is how to extend that beyond the LLM provider: other currencies, payment methods, merchants, and spend patterns.

Kaliski reduced the agent loop to two operations: call an LLM and call tools. Both can involve spend. In the payment context, the relevant tools are search, credential management, and payment. Search can remain probabilistic. Credential handling and payment execution cannot.

The failure cases follow from letting a probabilistic system operate payment surfaces meant for humans. Kaliski grouped them into four categories: the wrong place, the wrong thing, the wrong amount, and the wrong credential.

“Wrong place” is a domain and identity problem. An agent may land on a site that looks like a legitimate merchant but is not the merchant it intended to use. “Wrong thing” is a product-selection problem: an agent trying to buy a purple T-shirt may buy an orange one, or may buy something ten times more expensive than intended. “Wrong amount” covers drift and interpretation: prices vary by region, taxes and currencies complicate totals, and the number an agent extracts from a page may not be the amount the user intended to authorize. “Wrong credential” is the risk of handing over payment credentials in a way that is either unsafe or incompatible with many payment methods. A credit card can be pasted into a form, Kaliski said, but that is “not good,” and many other payment methods are hard or impossible for an agent to relay that way.

The base approach is to let an agent operate a browser like a human: take a card number, browse a site, fill forms, click pay, and observe the result. Kaliski treated that as the wrong abstraction for money movement. It is finicky, slow, hard to observe, and carries monetary risk. His comparison was to web automation more generally: this is why MCP and APIs exist. Stripe’s own product surface illustrates the split. A dashboard is for humans; a CLI command creating a payment intent is closer to what “robots prefer”: code.

The ideal flow binds the transaction to a merchant, enforces spend policy, uses APIs rather than browser manipulation, and relies on verifiable identities.

Delegated credentials need a smaller blast radius

Shared Payment Tokens, or SPTs, are meant to let an agent collect a payment credential and share it with a seller without simply handing over an unrestricted card number or assuming all payment methods behave like cards. Kaliski described them as a way to encode a mandate around how a credential may be used: by whom, for how much, in what currency, and for how long.

The example on screen showed a Shared Payment Token as a credential object with an ID, creation timestamp, deactivation fields, usage limits, and seller details. The usage limits included currency, expiration, and maximum amount; the seller details included a network identifier. The important property is that the token carries constraints enforced outside the agent’s probabilistic reasoning. Without such a token, the agent or user must trust the seller to charge the amount the agent parsed from a web page. With an SPT, Stripe enforces the limit. If the agent was duped by a domain or misread an amount, the credential can still be bounded by the amount and seller the agent intended to target.

Risk	Base browser approach	SPT approach described by Kaliski
Wrong place	Agent may paste or submit credentials to the wrong domain.	Token can be scoped to a particular seller.
Wrong amount	Seller may charge more than the amount the agent intended or parsed.	Usage limits can cap amount and currency.
Credential exposure	A card number can be handed to a page directly.	The seller receives a granted token rather than unrestricted credential use.
Auditability	Outcome depends on observing a browser flow.	Stripe enforces limits and provides an auditable flow.

How Shared Payment Tokens address the payment failures Kaliski identified

The demonstration used two Stripe accounts: one for the seller and one for the agent. The seller had a standard Stripe integration that created a payment intent for $50. In the agent flow, the agent had already collected a Visa card from a human operator, a subscription-backed harness, or another source, then provisioned a Shared Payment Token for that card.

The underlying Visa card could have a much higher credit limit, but the token granted to the seller was limited to $25, set to expire in 30 days, and scoped to a particular seller account. When the seller attempted to use the token for the original $50 payment intent, Stripe rejected it. The terminal error said the requested amount was greater than the remaining amount capturable with the shared payment granted token. When the amount was lowered, the payment went through.

That failure was the point. The agent collected a credential, shared it, and applied a limit; the seller then tried to charge more, and Stripe enforced the limitation.
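The enforcement in the demo can be sketched in a few lines of Python. This is a minimal illustration, not Stripe's API: `ScopedToken`, `capture`, and `TokenRejected` are hypothetical names, the seller ID is a placeholder, and amounts are in cents. It assumes the limit is tracked as a remaining capturable amount, as the terminal error suggested, and that the lowered amount was $25, which the talk summary does not state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


class TokenRejected(Exception):
    """Raised when a capture falls outside the token's mandate."""


@dataclass
class ScopedToken:
    """Hypothetical stand-in for a seller-scoped, amount-limited token."""
    seller_id: str
    currency: str
    max_amount: int        # minor units (cents)
    expires_at: datetime
    captured: int = 0      # total already captured against this token

    def remaining(self) -> int:
        return self.max_amount - self.captured


def capture(token, seller_id, amount, currency, now):
    """Enforce the mandate outside the agent: seller, currency, expiry, amount."""
    if seller_id != token.seller_id:
        raise TokenRejected("token is not scoped to this seller")
    if currency != token.currency:
        raise TokenRejected("currency is not permitted by this token")
    if now >= token.expires_at:
        raise TokenRejected("token has expired")
    if amount > token.remaining():
        raise TokenRejected("requested amount exceeds remaining capturable amount")
    token.captured += amount
    return token.remaining()


now = datetime(2026, 6, 6, tzinfo=timezone.utc)
token = ScopedToken("seller_a", "usd", 2500, now + timedelta(days=30))

# The seller first tries the original $50 charge, then a lowered $25 charge.
for amount in (5000, 2500):
    try:
        left = capture(token, "seller_a", amount, "usd", now)
        print(f"captured {amount}, remaining {left}")
    except TokenRejected as err:
        print(f"rejected {amount}: {err}")
```

The check lives in `capture`, not in the agent's reasoning. A misread price or a spoofed seller can still waste an attempt, but it cannot exceed the mandate.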

Kaliski also stressed that the seller is not meant to be blind in this setup. In the same way a seller that collects a card may know the card brand, last four digits, and credit type, the seller still receives relevant information for risk analysis. Stripe’s intent, as he described it, is not to make the transaction secret from the merchant. It is to let the seller use its existing risk systems while preventing the credential from being usable outside the delegated policy.

The developer implication is direct: the credential should not be the policy. A card number handed to an agent or pasted into a checkout page is too broad. A seller-scoped, amount-limited, time-bound token turns the credential into a constrained instrument that can be relayed across payment methods while preserving merchant risk signals.

A paid tool call should carry the terms of payment in the protocol

Credential sharing solves only one part of the problem. The next gap is how an agent associates a payment with the thing it is trying to obtain. Kaliski introduced the Machine Payments Protocol, or MPP, which Stripe built with Tempo, as a way to put payment terms directly into the request-response flow.

Kaliski framed tool calls as HTTP requests. If agents are making HTTP requests to tools, and some of those tools require payment, then the payment requirement should be represented directly in the protocol rather than handled through improvised browser interaction or a prearranged API key.

The MPP sequence shown on screen began with a client requesting a resource. The server responded with HTTP 402 Payment Required and a challenge. The client fulfilled the payment challenge, retried the request with a credential, the server verified and settled the payment, and the response returned 200 OK with a receipt. The slide summarized the approach as “MPP = api call w/ money.”
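The request, challenge, pay, and retry sequence can be simulated in memory to make the control flow concrete. The sketch below does not reproduce the MPP wire format or any real settlement: `PaidEndpoint`, `fetch_with_payment`, the dictionary shapes, and the recipient placeholder are all assumptions for illustration. The amount and asset values mirror the demo described later in the talk.

```python
import secrets
from decimal import Decimal

# Payment terms the seller advertises (recipient is a placeholder).
TERMS = {"amount": "0.01", "asset": "pathUSD", "recipient": "seller-address-placeholder"}


class PaidEndpoint:
    """Hypothetical server: answers 402 with a challenge until paid."""

    def __init__(self, terms):
        self.terms = terms
        self.open_challenges = set()

    def handle(self, credential=None):
        """Return (status, body). A missing or invalid credential yields a 402."""
        if credential is not None:
            cid = credential.get("challenge_id")
            if cid in self.open_challenges and credential.get("terms") == self.terms:
                self.open_challenges.remove(cid)  # settle once; replays fail
                receipt = {"challenge_id": cid, **self.terms}
                return 200, {"resource": "protected data", "receipt": receipt}
        cid = secrets.token_hex(8)
        self.open_challenges.add(cid)
        return 402, {"challenge": {"id": cid, **self.terms}}


def fetch_with_payment(endpoint, max_amount, allowed_assets):
    """Client: request, check the challenge against policy, pay, then retry."""
    status, body = endpoint.handle()
    if status != 402:
        return status, body
    challenge = body["challenge"]
    # Deterministic policy checks happen before any payment is attempted.
    if challenge["asset"] not in allowed_assets:
        raise PermissionError("asset not allowed by policy")
    if Decimal(challenge["amount"]) > max_amount:
        raise PermissionError("amount exceeds spend limit")
    terms = {key: challenge[key] for key in ("amount", "asset", "recipient")}
    credential = {"challenge_id": challenge["id"], "terms": terms}
    return endpoint.handle(credential)


endpoint = PaidEndpoint(TERMS)
status, body = fetch_with_payment(endpoint, Decimal("0.05"), {"pathUSD"})
print(status, body["receipt"]["amount"], body["receipt"]["asset"])
```

Because the terms arrive as structured data, the client can compare them with a spend policy before paying instead of inferring a price from a page.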

402: the HTTP status code used in the Machine Payments Protocol flow to signal that payment is required

The seller can tell the agent what payment is required, who the recipient is, what asset or rail is involved, and what resource the payment unlocks. The agent can then satisfy that requirement and retry the request.

In the demo, a request to a protected endpoint failed with a 402 response. The response included a WWW-Authenticate payment challenge and a problem JSON body. The terminal displayed the available payment option: protocol mpp, asset pathUSD, amount 0.01, and a recipient address. After approval, the payment settled, a receipt was returned, and the protected resource became available. Kaliski noted that the transaction landed on the blockchain.

The blockchain detail appeared again in the Q&A, when an audience member asked whether the base was hosted by Tempo or internal to Stripe. Kaliski answered that Stripe supports multiple protocols and networks, including Base and Tempo. The transaction data lives natively on those chains, while Stripe replicates a product view of that data in its own systems.

MPP extends the deterministic boundary. The agent does not merely infer that some payment may be necessary. The server communicates the requirement in a structured way; the payment is tied to the resource request; and the result includes a receipt. For Kaliski, that closes another part of the gap between probabilistic planning and safe execution.

Checkout needs structured negotiation, not screen scraping

Not every purchase is an API call. Ecommerce checkouts carry details that matter: product identity, quantity, shipping, tax, VAT, fulfillment options, restrictions, discounts, totals, and payment method. Kaliski described the risk as an agent sitting between a buyer and a merchant and incorrectly relaying some of those details, increasing disputes and chargebacks.

Stripe and OpenAI built the Agentic Commerce Protocol, or ACP, to provide a standard set of APIs and objects for expressing checkout state across the web. The slide’s checkout object included an ID, status, currency, line items, base amount, discount, subtotal, tax, and total. The emphasis was not on the novelty of those fields but on their structure. Instead of an agent trying to read a checkout page visually or scrape a UI, the seller can expose the cart state directly.

ACP establishes a back-and-forth among the agent, the seller, and the payment service provider. Every time the agent creates a checkout, updates quantity, selects shipping, changes a payment method, or proceeds to payment, the seller can respond with the latest checkout state. The agent can then act on structured data rather than inferred page state.
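A rough sketch of that exchange, using the field names from the slide, might look like the following. It is not the ACP schema: the function names, the `open` status value, the SKU, the prices, the discount, and the tax rate are hypothetical, and amounts are in cents.

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    sku: str
    unit_price: int  # minor units (cents)
    quantity: int


def checkout_state(checkout_id, currency, items, discount=0, tax_bps=0):
    """Seller side: recompute the full checkout state after any change."""
    base = sum(item.unit_price * item.quantity for item in items)
    subtotal = max(base - discount, 0)
    tax = subtotal * tax_bps // 10_000  # integer math in basis points, rounds down
    return {
        "id": checkout_id,
        "status": "open",  # hypothetical status value
        "currency": currency,
        "line_items": [vars(item).copy() for item in items],
        "base_amount": base,
        "discount": discount,
        "subtotal": subtotal,
        "tax": tax,
        "total": subtotal + tax,
    }


def agent_approves(state, wanted_sku, wanted_qty, max_total, currency):
    """Agent side: compare the seller's structured facts with the user's intent."""
    items = state["line_items"]
    return (
        len(items) == 1
        and items[0]["sku"] == wanted_sku
        and items[0]["quantity"] == wanted_qty
        and state["currency"] == currency
        and state["total"] <= max_total
    )


cart = [LineItem("book-001", 1500, 1)]
state = checkout_state("chk_1", "usd", cart, discount=500, tax_bps=800)

cart[0].quantity = 2  # the agent updates quantity; the seller replies with new state
state = checkout_state("chk_1", "usd", cart, discount=500, tax_bps=800)

print(state["subtotal"], state["tax"], state["total"])      # 2500 200 2700
print(agent_approves(state, "book-001", 2, 3000, "usd"))   # True
```

The agent never computes tax or totals itself. It only checks the seller's returned state against what the user asked for and the amount it is permitted to spend.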

Kaliski demonstrated the idea with Stripe Press, Stripe’s book store. He contrasted the human-friendly storefront with a robot-friendly equivalent. ACP can express a product catalog in JSON, including images, descriptions, and pricing. A chat-like interface asked for book recommendations about AI; the visible store showed titles including “The Scaling Era” and “Smash,” with a total due of $25. On the right side, API requests and responses showed buyer information, line items, quantities, and the seller’s returned cart state.

The point was to preserve determinism after discovery. An agent may use non-deterministic exploration to find or recommend a product. But once it has selected something to buy, the interaction should shift into programmatic negotiation. The agent should receive line items, base prices, applicable tax, fulfillment options, and totals as structured facts from the seller. Payment can then proceed using a Shared Payment Token or another credential mechanism.

For merchants, the implication is that “agent friendly” does not mean surrendering the checkout to an agent’s browser automation. Kaliski described the end state as API-driven commerce flows that are flexible across payment methods, including cards, crypto, and other payment methods Stripe supports. Just as important, the seller remains in control. The seller maintains the expected customer relationship and receives the signals and risk data needed to interact safely with agents.

The architecture is designed to bound risk when agents buy

Kaliski’s advice to sellers was to make applications “agent friendly.” Exposing only human-oriented web UIs increases the chance that non-determinism will interact directly with the business. Structured APIs reduce that chance by giving agents deterministic paths for product discovery, checkout state, and payment execution.

For agents, his advice was to use Shared Payment Tokens, wallets, and related credential-management technologies to keep autonomous spending bounded. The goal is not to remove agency from the agent, but to separate planning from authorization and settlement. A non-deterministic planner can decide what it wants to do; constraints, verifiable parties, and structured negotiation determine what it is allowed to do and how far the damage can spread if something goes wrong.

Kaliski’s final formula was explicit: non-deterministic planner plus constraints, verifiable parties, and structured negotiation equals a small radius of risk. In the architecture he laid out, each primitive corresponds to part of that formula.

Primitive	Where it fits	Failure mode it narrows	Deterministic control
Shared Payment Tokens	Credential delegation	Wrong place, wrong credential, and wrong amount	Seller-scoped, time-bound, amount-limited credential enforced by Stripe
Machine Payments Protocol	Paid tool and resource access	Unclear paid access and loosely associated payment	HTTP 402 challenge, payment fulfillment, verification, settlement, and receipt
Agentic Commerce Protocol	Ecommerce checkout	Wrong thing and wrong amount	Structured product catalog, cart state, tax, fulfillment, totals, and payment-state negotiation

How Kaliski mapped Stripe's primitives to safer agent payments

The Q&A clarified how the same primitives might extend beyond one-off purchases. Asked about giving Claude, for example, $25 a week for a particular model, Kaliski separated subscriptions from more enduring policies. For subscriptions, he compared the model to giving a business a credit card and permitting periodic charges against the same credential, with something analogous to an OAuth access-and-refresh flow for subsequent usage. For larger or ongoing budgets, he said the same mechanism can work with higher limits, while still scoping tokens to individual sellers; users could also create many such scoped tokens.
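One way to read the $25-a-week question is as a per-seller budget that resets each week. The sketch below illustrates that idea only. It is not the access-and-refresh mechanism Kaliski alluded to, and `WeeklyBudget`, the seller names, and the choice of ISO calendar weeks as the reset window are assumptions. Amounts are in cents.

```python
from collections import defaultdict
from datetime import date


class WeeklyBudget:
    """Hypothetical per-seller spend cap that resets every ISO week."""

    def __init__(self, limit):
        self.limit = limit
        self.spent = defaultdict(int)  # (seller, iso_year, iso_week) -> amount

    def charge(self, seller, amount, day):
        """Record the charge and return True only if it fits this week's cap."""
        year, week, _ = day.isocalendar()
        key = (seller, year, week)
        if self.spent[key] + amount > self.limit:
            return False
        self.spent[key] += amount
        return True


budget = WeeklyBudget(limit=2500)  # $25 per seller per week

print(budget.charge("model-provider", 1500, date(2026, 6, 1)))  # True
print(budget.charge("model-provider", 1500, date(2026, 6, 3)))  # False: would reach $30
print(budget.charge("model-provider", 1000, date(2026, 6, 3)))  # True: exactly $25
print(budget.charge("model-provider", 1500, date(2026, 6, 8)))  # True: new week
print(budget.charge("other-seller", 2500, date(2026, 6, 3)))    # True: separate scope
```

Keying the budget by seller mirrors the point that tokens stay scoped to individual sellers even when the limits are larger or recurring.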

Asked whether Stripe Projects is effectively a wrapper around the primitives described in the talk, Kaliski said yes. It is built on Shared Payment Tokens, uses the same idea for how a seller or SaaS business expresses products, and applies the recurring-payment concept to monthly usage. In that answer, Stripe Projects functioned less as a separate layer than as an example of the architecture’s intended composition: scoped credentials, structured product expression, and recurring authorization built on deterministic payment controls.
