Orply.

Rust’s Compiler Turns AI Coding Errors Into Pre-Production Feedback

Daniel Szoke · AI Engineer · Wednesday, May 27, 2026 · 9 min read

Daniel Szoke, the Rust SDK maintainer at Sentry, argues that Rust is better suited to agentic or “vibe” coding than languages that let models produce runnable code quickly. His case is that TypeScript, Python and JavaScript impose too few constraints, allowing some model-generated bugs to compile, run and fail only intermittently. Rust, by contrast, turns classes of type, memory and concurrency errors into compiler feedback that an agent can use to repair code before it reaches production.

The easy language may be the wrong optimization

The usual case for Python, JavaScript, and TypeScript in agentic coding is that models can produce runnable code in them quickly. Daniel Szoke argues that this is the wrong success condition. A language that accepts a model’s first draft too easily may also accept mistakes that should have been rejected before production.

The conventional view, as Szoke presented it, is that Python, JavaScript, and TypeScript are the natural languages for agentic coding or “vibe coding.” When he asked ChatGPT for the best programming language for vibe coding in 2026, it answered that there was no single winner, but if forced to choose, Python was closest to number one, with JavaScript and TypeScript as a strong second. Szoke said that matched his experience in broad terms, though he would flip the order: TypeScript, in his view, has lately become the top choice for agentic coding.

GitHub’s Octoverse material connected TypeScript’s rise to AI-assisted development. The GitHub-attributed quote said that, by contributor counts, August 2025 was the first time TypeScript became the most-used language on GitHub, surpassing Python by roughly 42,000 contributors. Szoke’s reading was that GitHub “strongly suspect[s]” AI-assisted development helped push TypeScript into that position.

~42k more GitHub contributors for TypeScript than Python in August 2025, according to the GitHub-attributed material

The reasons these languages look attractive are straightforward. They are common and familiar, often among the first languages new programmers learn. They have large ecosystems of frameworks, libraries, and examples. They are fast to scaffold and run. JavaScript and Python are interpreted; TypeScript adds a light compilation step down to JavaScript, but the loop is still fast enough to run code, observe behavior, and iterate.

TypeScript’s typing support also helps models avoid some misuse of values. But Szoke emphasized the weakness in that protection: TypeScript and typed Python still have escape hatches, including any, that can undermine the type system. His broader point was that these languages are easy for LLMs because they impose fewer constraints.

That ease is the problem. The same dynamic, flexible qualities that make Python, JavaScript, and TypeScript simple for models to emit also make it easy for them to write code with subtle mistakes. Adding typing helps, but in Szoke’s framing it adds only one category of constraint, and not always a strong one.

The classic vibe coding languages are easy for the models to write, but is that even a good thing?

Daniel Szoke

Szoke’s proposed shift is to stop optimizing for first-try runnable output and start optimizing for constraints that catch mistakes during the development loop. A language that rejects more programs up front can be better for agents, not worse, if the agent can read the failure, repair the code, and loop.

Tests and review agents do not close the gap

Szoke’s argument depends on a premise that applies to both humans and models: mistakes are inevitable. LLMs are nondeterministic systems, and although he expects them to become better, he does not expect their errors to disappear. “Just like the smartest humans make mistakes,” he said, engineering processes will need to guard against LLM error.

Tests are an important guardrail, but he argued that they are insufficient on their own, especially when the agent is also writing the tests. Without careful prompting, the agent may write tests after the implementation and end up testing implementation details rather than the intended behavior. Even with test-driven development, a failing test shows that something is wrong, but a passing suite rarely proves correctness across every possible input combination. In many real programs, it is impractical to test every path and every input.

The same concern applies to code review agents. A review agent is another LLM-based system, subject to its own mistakes. Szoke did not argue against tests or review agents; he argued against relying on them as the only protection when the language itself permits broad classes of mistakes to compile and run.

To explain why model failures can be hard to anticipate, Szoke borrowed a framing from Yuval Noah Harari’s book Nexus. Harari, as Szoke summarized him, dislikes the “artificial” in artificial intelligence because it understates how different these systems are from human thinking. Harari prefers “alien intelligence”: LLMs produce human language, but their internal mechanism is not human reasoning; they predict streams of tokens.

Szoke used that idea to make a practical engineering point. AI-generated code may look polished: sensible variable names, good comments, plausible structure. Yet it can still contain a subtle bug or rely on a heuristic when the program could have checked the real condition directly. The code can be legible to a human and still fail in a way that does not match the reviewer’s expectations.

Murphy’s Law, in his framing, is the operational consequence. Anything that can go wrong eventually will. If a language lacks deterministic guardrails, then human review, agentic review, and tests reduce risk but do not eliminate it. In languages such as JavaScript, Python, and TypeScript, Szoke argued, the absence of those guardrails means failures will occur more often.

Rust turns some runtime failures into compile errors

Rust is attractive to Daniel Szoke precisely because it is constrained. He described it as a compiled language designed for safety and performance: it aims to be as fast as C and C++ while preserving memory safety, type safety, and other invariants through a strict compiler. The promise is not that compiling proves a program is correct in every respect. The promise is narrower and still valuable: if the code compiles, engineers can be reasonably confident that several important classes of bugs are absent.

The compiler enforces type safety, memory safety, and concurrency rules. Szoke also stressed that Rust tries to be beginner-friendly in the way it reports violations. Rust may have a reputation as advanced or difficult, but its compiler errors are designed to provide context about what went wrong and, often, how to fix it. That matters for AI agents because an agent can compile code, receive a structured error, and use the message as the next instruction in its repair loop.

The examples Szoke emphasized are the ones where flexibility in other languages produces production uncertainty. Rust’s type safety is strict: safe Rust offers no any type or unchecked cast to bypass it, unlike the escape hatches Szoke described for TypeScript-like workflows. Rust also has null safety. There is no universal null value. If a value may be absent, the type must say so explicitly, such as through an Option; the compiler then forces the code to check for presence before accessing the inner value.
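The null-safety point can be illustrated with a minimal sketch. The function names `find_port` and `describe_port` and the sample configuration are hypothetical and do not come from the talk; they only show that a possibly absent value is carried as an Option and has to be unpacked before use.

```rust
// Hypothetical lookup: returns Some(port) only when the key exists
// and its value parses as a u16. Absence is part of the return type.
fn find_port(config: &[(&str, &str)], key: &str) -> Option<u16> {
    config
        .iter()
        .find(|(k, _)| *k == key)
        .and_then(|(_, v)| v.parse::<u16>().ok())
}

// The caller cannot use the port as a plain u16 without handling None;
// the match below must cover both cases or the code does not compile.
fn describe_port(config: &[(&str, &str)], key: &str) -> String {
    match find_port(config, key) {
        Some(port) => format!("listening on {port}"),
        None => String::from("no port configured"),
    }
}

fn main() {
    let config = [("http", "8080"), ("admin", "not-a-number")];
    println!("{}", describe_port(&config, "http")); // listening on 8080
    println!("{}", describe_port(&config, "admin")); // no port configured
    println!("{}", describe_port(&config, "missing")); // no port configured
}
```

Removing the `None` arm from the match makes the compiler reject the program for a non-exhaustive pattern, which is the kind of feedback an agent can act on.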

His central example was Rust’s “fearless concurrency.” In Rust, the compiler checks that data shared across threads is handled in a thread-safe way. That can turn certain concurrency mistakes from intermittent runtime failures into immediate compile errors.

The concurrency example used a shared counter. The code initialized a counter at zero using Rc<RefCell<i32>>, then created 100 async tasks, each of which cloned the counter and incremented the inner value. If all tasks completed, the expected result was 100. The problem was that Rc and RefCell are designed for sharing mutable data within a single thread; they are not synchronized for safe multi-threaded access.

| Step | What the code does | Why it matters |
| --- | --- | --- |
| Initial state | Creates a shared counter with `Rc<RefCell<i32>>` starting at 0. | `Rc` and `RefCell` support mutable sharing within a single thread, not synchronized multi-threaded access. |
| Concurrent work | Spawns 100 async tasks, each cloning the counter and incrementing it. | The intended result is 100 after all tasks finish. |
| Failure mode in a flexible language | Conceptually similar code may compile and run. | The bug may appear only intermittently, such as returning a value other than 100. |
| Rust compiler response | Rejects the program with `future cannot be sent between threads safely`. | The diagnostic identifies that `Rc<RefCell<i32>>` is not `Send`, giving the agent a concrete cause to address. |

Szoke’s concurrency example shows how Rust surfaces unsynchronized multi-threaded access during compilation.
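For contrast, here is a sketch of the use that Rc and RefCell are designed for: sharing mutable data within one thread. It is not the code from the talk, and the function name `count_on_one_thread` is hypothetical. Every clone of the counter is created and used on the same thread, so the compiler accepts it.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical helper: increments a shared counter `tasks` times,
// entirely on the calling thread.
fn count_on_one_thread(tasks: usize) -> i32 {
    let counter = Rc::new(RefCell::new(0_i32));
    for _ in 0..tasks {
        // Each clone is another handle to the same inner value.
        let handle = Rc::clone(&counter);
        *handle.borrow_mut() += 1;
    }
    // Copy the value out before the function returns.
    let total = *counter.borrow();
    total
}

fn main() {
    println!("{}", count_on_one_thread(100)); // 100
}
```

Rc uses a non-atomic reference count and RefCell tracks borrows without synchronization, which is why neither is allowed to cross a thread boundary.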

In a language such as TypeScript, Szoke said, code with the same conceptual bug might compile and run. The failure may only appear occasionally, when the final value is something other than 100. If that counter is a small part of a larger application, the race can be difficult to locate.

Rust rejects the program. The compiler reports: “future cannot be sent between threads safely.” The async block is not Send, where Send means safe to transfer between threads. Szoke acknowledged that this top-level error alone is not maximally helpful. But further down, the compiler explains the captured value causing the problem: counter has type Rc<RefCell<i32>>, which is not Send.

That diagnostic is the feedback loop Szoke wants agents to use. The compiler names the type and the violated concurrency property. An AI agent compiling the project can see that Rc<RefCell<i32>> is the problem and, in Szoke’s words, “immediately go and change this to a thread-safe type,” of which Rust has many.
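One possible repair can be sketched as follows. The talk’s example used async tasks; this version assumes standard-library threads instead so that it runs without an external runtime, and it assumes `Arc<Mutex<i32>>` as the thread-safe replacement. The function name `count_across_threads` is hypothetical, and this is only one of several types Rust offers for the job.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: spawns `tasks` threads that each increment a
// shared counter once, then returns the final value.
fn count_across_threads(tasks: usize) -> i32 {
    // Arc is the atomically reference-counted counterpart of Rc, and
    // Mutex provides synchronized interior mutability in place of RefCell.
    let counter = Arc::new(Mutex::new(0_i32));

    // Writing `Rc::new(RefCell::new(0_i32))` above instead would be
    // rejected at compile time, because Rc<RefCell<i32>> is not Send
    // and the closure passed to thread::spawn must be.

    let handles: Vec<_> = (0..tasks)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                let mut value = counter.lock().expect("counter mutex poisoned");
                *value += 1;
            })
        })
        .collect();

    // Wait for every worker before reading the result.
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    let total = *counter.lock().expect("counter mutex poisoned");
    total
}

fn main() {
    println!("{}", count_across_threads(100)); // 100
}
```

For a plain integer counter, an atomic type such as `AtomicI32` would also work and avoids locking; the mutex version is shown because it maps most directly onto the original Rc and RefCell pairing.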

And every compile error is potentially a bug that you avoid in your production code.

Daniel Szoke

The point is not that Rust makes LLMs better at first-pass code generation. Szoke conceded the opposite: Rust’s constraints make it harder for models to get code right on the first try because there are more rules to follow. His claim is that this is good for agentic coding because agents are not limited to a single shot. They can compile, read failures, and fix them.

Compiler feedback catches what another model may miss

The tradeoff in Daniel Szoke’s argument is direct. Rust makes the model’s first draft harder. But it also converts some important classes of errors into compiler feedback. For an agent in a loop, that feedback is not friction in the same way it is for a one-shot code generator. It is a concrete correction signal from the compiler.

Szoke contrasted that with asking another AI system to review the code. He said he still thinks review agents should be used, but the Rust compiler provides a different kind of safety. It is built into the language workflow, and for the classes of errors Rust checks, it does not depend on whether a reviewer model notices the issue.

He also addressed a common complaint: Rust compile times can be slow. His answer was comparative. Even if compilation takes time, he said, it is faster than letting an AI agent review the code for the same category of issue, and the review agent may miss errors the compiler is guaranteed to find.

This is the core inversion in Szoke’s case for Rust as a vibe-coding language. A language that makes models look good by accepting more code may also accept more mistakes. A language that makes models struggle visibly can be useful if each struggle becomes a specific, actionable error before the code ships.

The Sentry connection is agent monitoring

Szoke noted that he works at Sentry, where he is the Rust SDK maintainer, and that the talk was sponsored. The product link he made was brief: Sentry has agent monitoring features, and attendees were invited to try Sentry or visit its booth for a demo.

The technical case he had laid out rested on the language/compiler loop: LLMs are fallible, tests and review agents are incomplete, and Rust’s constraints can force important failures to surface during compilation rather than intermittently in production.
