AI Compresses Years of Software Vulnerability Discovery Into Weeks

Chamath Palihapitiya

Jason Calacanis

David Sacks

David Friedberg

Nikesh Arora

All-In Podcast · Monday, June 8, 2026 · 14 min read

Palo Alto Networks chief executive Nikesh Arora told the All-In podcast that AI has changed cybersecurity by making years of latent software vulnerabilities discoverable in weeks. After testing Anthropic’s Claude Mythos against Palo Alto’s own code, Arora said the company found flaws that would normally have taken five to seven years to identify, raising the stakes for enterprises with weaker defenses. His broader argument was that AI will erode analytical SaaS while increasing the value of data infrastructure, workflow redesign and security systems that can make model outputs reliable enough for production.

AI has turned old code into newly exposed attack surface

Nikesh Arora framed the immediate cybersecurity problem bluntly: “humans have been writing bad code for a very long time.” The change is not that software suddenly became vulnerable. It is that AI can now inspect old code at a speed and depth that changes the economics of finding those vulnerabilities.

Palo Alto Networks tested Anthropic’s Claude Mythos for six weeks against its own code base, according to Arora. The result, he said, was the discovery of vulnerabilities that would normally have taken the company “five to seven years” to find.

6 weeks: time Arora said Palo Alto tested Claude Mythos against its own code

5–7 years: the vulnerability-discovery work Arora said the AI compressed into that test

That claim mattered because Arora was not describing a weak target. Palo Alto Networks is itself a cybersecurity company, and Arora said it considers itself in the “top percentile” of companies when it comes to testing their own code. His concern was what the same capability implies when applied to the broader universe of enterprise software, open-source dependencies, and “10 million developers who write code.”

The capability is not limited to spotting isolated bugs. Arora said that when the model is put in “ultra mode,” which he described as persistent thinking that keeps trying until it gets an answer, it can “daisy chain vulnerabilities” by finding new attack paths through combinations of flaws. The concern, in his formulation, is that AI can reason from individual defects toward exploitable sequences.
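The idea of chaining flaws into an attack path can be pictured as a graph search. The sketch below is only an illustration of that reasoning, not a description of how Mythos or any Palo Alto tool works: the flaw list, the access levels, and the `find_chain` helper are all hypothetical. Each flaw is treated as an edge from one level of access to another, and a breadth-first search looks for the shortest sequence that reaches a target.

```python
from collections import deque

# Hypothetical findings, invented for illustration: each flaw lets an
# attacker move from one level of access (src) to another (dst).
FLAWS = [
    ("internet", "web_app", "input-validation bug"),
    ("web_app", "app_server", "unsafe deserialization"),
    ("app_server", "database", "hard-coded credential"),
    ("internet", "vpn_portal", "outdated TLS library"),
]


def find_chain(flaws, start, target):
    """Return the shortest list of flaw names leading from start to target, or None."""
    # Build an adjacency map: access level -> [(next level, flaw name), ...]
    edges = {}
    for src, dst, name in flaws:
        edges.setdefault(src, []).append((dst, name))

    # Standard breadth-first search, carrying the path of flaws used so far.
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for dst, name in edges.get(node, []):
            if dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [name]))
    return None


chain = find_chain(FLAWS, "internet", "database")
print(chain)
```

In this toy data no single flaw exposes the database, but three of them in sequence do, which is the sense in which isolated defects become an exploitable path.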

Jason Calacanis pressed on cost, asking whether the experiment had required $10 million or $100 million of tokens. Arora said the cost was “in the low millions,” and added that the cost curve is already moving down as OpenAI, Anthropic, and others release cheaper and more consistent models.

When Chamath Palihapitiya asked whether Arora “buys the hype,” Arora rejected the framing.

“It’s not hype. It’s true,” Arora said.

David Sacks narrowed the issue to capability rather than valuation or rhetoric: “The capabilities are real.” Arora agreed, and pointed to a broader consequence: if AI can reveal years of latent vulnerabilities in a well-tested cybersecurity vendor’s code base, the burden across ordinary companies and open-source software is much larger.

Arora also said he believed these capabilities were “three months away, if not already there,” from being available in the wild. Sacks compared that with an earlier expectation that Mythos-level capability might be six months away in Chinese or open models. Arora’s answer was that attackers do not need to crack the hardest possible target. They need to find a few exploitable flaws in the many systems already running, including older operational-technology code at the edge.

That, Sacks said, creates a race: cyber defenders must find and patch vulnerabilities before attackers use the same tools to find and exploit them. Arora’s assessment of how defenders are doing was not reassuring. “Not as well as we should be doing,” he said, adding that the situation is good for Palo Alto’s business but problematic for enterprises.

The burden on CIOs, in Arora’s description, is simultaneous and compounding. Vendors are asking customers to patch hardware and software products because the vendors have found vulnerabilities. CIOs are also trying to inspect and fix their own code. Meanwhile, open source remains a large unresolved exposure.

The defense problem is enterprise memory, not just model access

Arora did not present AI-enabled defense as a matter of simply pointing a better model at the problem. Enterprises, he said, lack the memory and context required to distinguish normal from abnormal behavior across their own operations.

Organizations, he said, “don’t have memory and context of everything they do every day.” In cybersecurity, that means companies need to collect much more enterprise data if they want to understand what good and bad behavior look like. His estimate was large: enterprises need to collect “ten times the data” from a cyber perspective to defend against AI attackers.

10×: additional enterprise cyber data Arora said companies will need to collect

Asked by Chamath Palihapitiya whether rising model capability increases systemic business risk for large enterprises, Arora said yes “on the cyber side.” He said there are antidotes being built by Palo Alto and others, including capabilities that may reduce the need to patch everything. But the central requirement remains a richer enterprise data layer.

He was less worried about elite national-security targets than about ordinary economic infrastructure. Palihapitiya raised the prospect of foreign state actors using AI to create havoc. Arora responded that a few thousand breaches or attacks already occur annually, and that many happen for “pretty rudimentary reasons.” He said 89% of attacks happen because credentials are stolen or breached.

89%: share of attacks Arora attributed to stolen or breached credentials

His concern was not that every adversary suddenly needs frontier AI to compromise systems. It was that vulnerable institutions and small businesses are already weakly defended. Arora cited the Change Healthcare breach as the kind of event to worry about: physician offices shut down because of ransomware, and UnitedHealth had to provide billions of dollars of credits to physicians to help them run their businesses, according to Arora and Palihapitiya’s exchange.

Palihapitiya summarized the risk as less about “cracking some PG&E power generation facility” and more about economic chaos. Arora agreed. He did not offer a silver bullet. Systems will have to be upgraded, renewed, and fixed over time, he said, and that increases the “terminal value” of the cybersecurity industry.

Analytical SaaS loses pricing power when the model can read the data

Nikesh Arora separated software into categories that AI affects differently. His sharpest claim was about analytical SaaS: “If you’re an analytical SaaS company, it’s over.”

By analytical SaaS, he meant software whose value proposition is collecting data and analyzing it for the customer. If a company can run large language models directly against its own data, it no longer needs incremental SaaS modules that exist mainly to interpret data already stored elsewhere.

He gave the example of marketplace apps attached to systems like Salesforce: a company has Salesforce data, buys an app from a marketplace, and that app helps analyze the data. Arora’s view was that this layer is no longer necessary. “I can just go run an LLM against the data,” he said.

“The entire incrementality that has been sold as incremental software modules to all of us doesn’t need to be sold to us because I’d much rather have LLMs run against that data,” Arora said.

Jason Calacanis described a concrete version from his own company. A SaaS product had 20 seats, but almost nobody was logging in. The data remained useful, so they reduced the license count to three seats, connected the product to Slack and Claude, and allowed people to interact with it through natural language. Calacanis said the bill fell by 90%.

Arora pushed the example one step further: pull data from multiple systems into one place and analyze across it. Sales-rep data, productivity data, and inventory data from SAP could be combined to answer questions that previously required several SaaS products. His example was identifying who is selling well, where inventory is low, and where to build inventory in regions with productive salespeople.
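The kind of cross-system question Arora describes reduces to a join over data that usually sits in separate products. The following is a minimal sketch under stated assumptions: the records, field names, and the `regions_to_restock` and `top_rep` helpers are invented for illustration, and a real pipeline would pull these extracts from the systems themselves.

```python
# Hypothetical extracts; all names and figures are made up for illustration.
sales = [  # per-rep results, e.g. exported from a CRM
    {"rep": "A", "region": "West", "units_sold": 120},
    {"rep": "B", "region": "West", "units_sold": 95},
    {"rep": "C", "region": "East", "units_sold": 40},
]
inventory = {"West": 50, "East": 400}  # units on hand, e.g. exported from an ERP


def top_rep(sales):
    """Return the name of the rep with the most units sold."""
    return max(sales, key=lambda row: row["units_sold"])["rep"]


def regions_to_restock(sales, inventory, cover_ratio=1.0):
    """Flag regions where units on hand fall below recent units sold."""
    sold_by_region = {}
    for row in sales:
        region = row["region"]
        sold_by_region[region] = sold_by_region.get(region, 0) + row["units_sold"]
    return sorted(
        region
        for region, sold in sold_by_region.items()
        if inventory.get(region, 0) < sold * cover_ratio
    )


print(top_rep(sales))                        # who is selling well
print(regions_to_restock(sales, inventory))  # where to build inventory
```

With this toy data the West region sells far more than it holds in stock, so it is the one flagged. The point of the sketch is that once the data is in one place, the question takes a few lines rather than a separate analytics product.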

That was the boundary of his “SaaS is dead” claim. It was not a blanket dismissal of all software. Arora called infrastructure software “undervalued.” Databases, core storage, data infrastructure, and systems that help companies collect and manage data become more important, not less. He named Databricks, Snowflake, MongoDB, and Oracle as examples of companies or products in that infrastructure category.

The reason is the same one he gave in cybersecurity: enterprises will need vastly more stored and usable data. Arora said companies will need “10 times the data stored in the enterprise” over the next three years. If analytical applications lose pricing power because models can analyze data directly, the infrastructure that stores and organizes that data becomes more valuable.

Between analytical SaaS and infrastructure, Arora placed systems of work or systems of record: software deeply embedded in business processes. These are not simply dead, in his account. They are due for reinvention.

His first step in that reinvention is removing the user interface. Arora called enterprise and consumer software UI “the worst thing we did as technologists,” because it forced humans to interact manually with data behind applications. If agents can listen to a sales call, identify the key points, and post them into Salesforce, Oracle, or another tracking system, then the human should not have to enter data at all.

The deeper efficiency comes not from a better screen, but from changing how work happens. Arora imagined a salesperson finishing a sales call and telling an agent to handle the paperwork and back-office steps. If that becomes reliable, he said, “five people become one” inside a company, and systems of work must be re-engineered over the next five years.

Calacanis added that this is already happening passively: email, Zoom transcripts, summaries, decks, and sales notes are being captured automatically. Arora said that kind of workflow can improve the audit trail because humans are not touching the data; agents manage it consistently.

The application layer is still open, but the model alone is not the product

Arora disagreed with the idea that the model itself will capture most of the economic value. Models are likely to become a utility layer, he said: companies will buy intelligence on demand, choosing a lower-cost or higher-cost capability depending on the task. He used an IQ analogy: a business might not need a “180 IQ” model for a simple customer-service task, but might pay for “250 IQ” on something more demanding.

That distinction matters because most companies do not know how to use models directly. They need application companies that arbitrage between models and solve specific business problems.
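Buying intelligence on demand can be sketched as a routing decision: send each task to the cheapest model that is capable enough for it. Everything in the block below is an assumption made for illustration, including the tier names, the abstract capability scores, and the prices; none of it reflects real products or pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str             # hypothetical tier label, not a real product
    capability: int       # abstract score; higher means more capable
    cost_per_call: float  # illustrative price, not a real quote


TIERS = [
    ModelTier("small", capability=1, cost_per_call=0.01),
    ModelTier("medium", capability=2, cost_per_call=0.10),
    ModelTier("frontier", capability=3, cost_per_call=1.00),
]


def cheapest_sufficient(tiers, required_capability):
    """Pick the lowest-cost tier that meets the task's capability requirement."""
    eligible = [t for t in tiers if t.capability >= required_capability]
    if not eligible:
        raise ValueError("no tier meets the requirement")
    return min(eligible, key=lambda t: t.cost_per_call)


print(cheapest_sufficient(TIERS, 1).name)  # simple customer-service task
print(cheapest_sufficient(TIERS, 3).name)  # something more demanding
```

In practice the hard part is estimating the required capability for a given task, which is part of the work Arora assigns to the application layer.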

Jason Calacanis asked whether OpenAI and Anthropic might become the new Microsoft Office, moving into all application and productivity software. Arora’s answer was more constrained. He does not want every company to rewrite every piece of software internally using OpenAI directly. He wants HR systems, sales-management systems, and other business applications delivered by companies that have embedded AI and agents into the workflow.

The identity of those companies is open. It could be existing vendors like Salesforce, or it could be new AI-native application companies. Arora said the relevant layer is “still not fully formed.” The application layer must build the harnesses and memory that make AI useful in business contexts.

That does not mean incumbents are safe. David Sacks said he is seeing a playbook in accelerators: startups target expensive SaaS seats, rebuild the function with AI, charge based on consumption, and remove 80% or 90% of the cost. Arora agreed with the underlying economics. Replacement markets are attractive because the customer already has a budget. If a new product replaces something worse with something better, revenue is easier to capture.

He identified two fast paths to revenue. One is replacement TAM: replacing an existing product and taking the existing budget. The other is consumer revenue, where it can be easier to collect a small monthly amount from many users. He pointed to subscription spending broadly, saying people likely pay more in monthly subscriptions than ever before.

Cybersecurity, coding, infrastructure, and application software all appeared as major profit pools. Sacks called coding the breakout application of the past year and cybersecurity a clear growth area because the threat and patching cycle has become more dynamic. Arora agreed that model companies are making cyber-capable models available because CISOs want to test code. But he emphasized that the larger onslaught against application software has not yet fully arrived. There are “tens and tens of billions of dollars” in application software waiting to be reinvented, he said.

False positives are the constraint that separates demos from deployment

Nikesh Arora said the under-discussed issue in frontier-model adoption is false positives. Asked whether Palo Alto might build its own model because of its proprietary security data, he redirected the discussion: “What is the false positive rate on 4.8 and 5.5?”

He said Mythos had a 30% false positive rate. In practical terms, that means it sometimes reported a vulnerability where one did not exist.

30%: false positive rate Arora said Palo Alto observed with Mythos

For attackers, that may be tolerable. A model can try many leads, discard the bad ones, and exploit a true vulnerability when it finds one. For defenders, false positives create operational cost and risk. If a model says there is a hole and the company tries to plug it, the team may be fixing something that was never broken.

Arora extended the issue beyond security. An enterprise using a model for insurance claims, payments, or other business processes cannot casually accept a 10% or 20% false-positive rate. “I just lost money,” he said of a faulty claims-payment model. In cybersecurity, he said, he wants 0%.

Calacanis noted the parallel concern of false negatives: reducing false positives should not cause the system to miss real problems. Arora agreed. His point was that post-model work matters. Harnesses, training, validation, and business-specific controls are what turn a powerful model into something reliable enough for production.
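The tension between false positives and false negatives can be made concrete with a small sketch. The scored findings below are invented, and the confidence threshold is a hypothetical control, not something the discussion attributes to Mythos. The sketch also assumes that "false positive rate" means the share of reported findings that turn out not to be real; the conversation did not define the term.

```python
# Hypothetical scored findings: (model confidence, is it a real vulnerability?)
FINDINGS = [
    (0.95, True), (0.90, True), (0.85, False), (0.80, True),
    (0.70, True), (0.60, False), (0.55, True), (0.40, False),
    (0.30, False), (0.20, True),
]


def triage_stats(findings, threshold):
    """Count outcomes when only findings at or above threshold are reported."""
    reported = [real for score, real in findings if score >= threshold]
    suppressed = [real for score, real in findings if score < threshold]
    false_pos = reported.count(False)   # reported, but not real
    missed_real = suppressed.count(True)  # real, but never reported
    share_false = false_pos / len(reported) if reported else 0.0
    return {
        "reported": len(reported),
        "false_positives": false_pos,
        "missed_real": missed_real,
        "false_share_of_reports": share_false,
    }


for threshold in (0.5, 0.9):
    print(threshold, triage_stats(FINDINGS, threshold))
```

On this toy data, raising the threshold from 0.5 to 0.9 removes both false positives but leaves four real vulnerabilities unreported instead of one. That is the trade Calacanis was pointing at, and it is why a simple filter is not a substitute for the validation work Arora describes.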

His self-driving-car analogy made the risk concrete. If Mercedes said it would use a frontier model to drive a car, Arora said he would not put his kids in it with a 10% false-positive rate. The same logic applies to enterprise deployment: the model may be impressive, but the system around it determines whether it can be trusted.

This also shaped his view of model regulation and control. Chamath Palihapitiya asked whether some powerful models should be restricted to national-security agencies or tightly controlled. Arora said higher-powered models may need vetting as their capabilities increase, but he doubted that holding back U.S. models for three to six months would help in a global race. Someone else may release comparable capabilities in open source.

He offered one reason for skepticism about containment: a model-company CEO, he said, had told him that the full weights of the company’s most recent model could fit on a USB stick. “That’s the IP,” Arora said. If the valuable asset is that portable, he asked, how realistic is it to assume it can be held back for six months?

Distribution still matters when models become utilities

Arora treated model capability as only one part of the competitive equation. A company still has to package, sell, and distribute that capability into places customers will actually use.

He avoided commenting on Uber because he sits on its board, saying only that Dara Khosrowshahi is a great CEO. On Waymo, his assessment was concise: the cars work, and the company should be in many more cities around the world, faster.

On Google, where Arora spent years as chief business officer, he said the company is underrated and predicted it will be the first $10 trillion company “in our lifetime.” His reason was not only model capability. Google has the assets needed to succeed, he said, including distribution and sales capacity. A model company still needs a sales force capable of convincing customers to adopt and buy the models, and Arora argued that the three hyperscalers have the largest enterprise sales forces.

On OpenAI, the criticism was that it should “sell faster.” Chamath Palihapitiya had suggested that Anthropic appeared to improve annual recurring revenue faster than OpenAI, and David Sacks attributed that to Anthropic going all in on enterprise and coding. Arora connected this back to profit pools. If model companies require tens of billions of dollars a year to build compute capacity, they need revenue engines that reach the application layer and enterprise budgets quickly.

A brief exchange put rough numbers around that capital intensity: Arora referred to the idea that one gigawatt corresponds to $10 billion of revenue, while Palihapitiya interjected that building it costs $50 billion. The point was not developed in detail. It served Arora’s broader argument that model companies with massive capital needs must capture large, durable profit pools rather than rely on raw model usage alone.

Hardware remains part of that picture. Arora said hardware is still the cheapest way to manage low-latency, high-throughput bits. Data centers, in his description, are doing exactly that. Financial services companies are reluctant to move fully to the cloud because increased latency can reduce profit, he said, so hardware will still be made and still be needed.

David Friedberg asked whether AI is changing hardware development cycles through generative design or silicon workflows. Arora said design is not the long pole; production is. Components are back ordered, factories are constrained, and the attempt to build GPU-based cards for data centers has made everything expensive. Asked whether the U.S. can fill that supply-chain need domestically, Arora answered: “In 10 years.”

Palo Alto’s next edge may be operating leverage, not just distribution

Chamath Palihapitiya asked how Palo Alto should think about acquisitions after eight years of organic growth and an active M&A strategy. Nikesh Arora described two phases and a possible third.

Until roughly a year and a half ago, Palo Alto bought product companies and put them into its go-to-market engine. The logic was simple: if the company was already selling $10 million to a customer, returning two years later and selling $20 million was an efficient way to amortize the cost of sales. Arora said that playbook helped Palo Alto run to more than $150 billion in value.

Then the company saw an inflection in identity, which matters for both agentic AI and security. Arora said Palo Alto bought a $25 billion company in that category, in a deal that closed three months earlier. He did not name the company in the discussion.

The next opportunity, in his telling, is broader. If Palo Alto can become the best at using AI to run “the most efficient enterprise business in the world,” its operating margin could exceed the industry. Palihapitiya translated that into gross margins in the 90s and net margins in the 40s or 50s. Arora agreed with the direction. If a company can crack that operating model, he said, “then it doesn’t matter what you buy.”

That does not mean Palo Alto is ready to buy anything. Arora said the next six to 12 months are needed to understand how AI settles down and how to use it effectively in enterprises. He also pushed back against the common assumption that AI simply means fewer people. At Palo Alto, he expects more technical people than ever before because AI is causing everything to require transformation.

The implication was not a near-term promise of horizontal empire-building. It was that AI may change the acquisition math. Historically, Palo Alto’s advantage was distribution: buy a product and sell it through a stronger go-to-market system. The next advantage, if Arora’s view proves right, could be operational: buy a sub-scale company and run it at margins it could not reach alone.
