ElevenLabs Says Dubbing v2 Preserves Performance Across 90 Languages
ElevenLabs is introducing Dubbing v2 alpha as an AI dubbing model built around preserving the original speaker’s performance, not just translating a transcript. The company says the system conditions directly on source audio so tone, pacing, emphasis and emotional delivery can carry across more than 90 languages, with sync-aware translation adapting phrasing to fit the timing of the original. ElevenLabs is positioning the launch for creators, marketers and studios that want automated localization without building a separate dubbing pipeline.

Dubbing v2 is pitched as performance transfer, not transcript replacement
ElevenLabs introduces Dubbing v2 alpha as an “end-to-end dubbing model” whose central claim is not simply multilingual conversion, but preservation of performance. The company says the model conditions directly on the original audio performance, rather than relying on a transcript-first workflow. The promised result is that tone, emotion, delivery, pacing, and emphasis carry into the dubbed output.
That distinction matters because the examples are built around material where delivery is part of the asset. In a clip attributed on screen to “MrBeast,” Jimmy Donaldson opens with the premise of a survival video: “We just landed on this deserted island in the middle of the ocean. And we’re gonna be stranded here for the next seven days.” The same passage then moves across Spanish and Portuguese while preserving the cadence of a high-energy YouTube challenge: “Y ese bote que se acaba de ir, era nuestra única salida,” followed by the joking question, “¿Por qué hacemos estas cosas?” and the reply, “Creí que sería divertido.”
The clip is used to show more than a translated script. It carries the setup, the escalation, the comic aside, and the countdown pressure before nightfall. The visual context reinforces that: Donaldson and friends are shown on a tropical island beach with supplies, a yellow raft, and on-screen timing text reading “3 DAYS” and “7 DAYS.”
ElevenLabs’ own formulation is concise: “Dubbing v2 conditions directly on the original performance, ensuring tone, emotion, and delivery travel intact into every language.” The product claim is therefore narrower and more concrete than “AI translation.” ElevenLabs is presenting Dubbing v2 as a model for localizing video while preserving the expressive shape of the source, and says earlier models “fell flat” because they did not carry that performance across languages.

Sync-aware translation is the mechanism ElevenLabs wants buyers to notice
The second major claim is that translations are “sync-aware.” On screen, ElevenLabs presents a title card with the text “Translations are sync-aware,” followed by a second text card: “Phrasing adapts naturally.” The narration states that Dubbing v2 adapts phrasing for each language and produces “professional quality outputs, fully automated, end-to-end.”
The practical implication, as ElevenLabs describes it, is that the model is not merely choosing translated words. It is supposed to shape phrasing around delivery constraints: the starts and stops of the original, the timing of the line, and the way a sentence can be said naturally in the target language. ElevenLabs says the system aligns starts and stops “out of the box” and reduces manual editing.
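To make the timing constraint concrete, here is a minimal sketch of what fitting a phrasing into the original line's slot could look like. It is not ElevenLabs' method, which the company has not detailed. The `Segment` type, the `pick_phrasing` helper, and the fixed characters-per-second speaking rate are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed average speaking rate (hypothetical). A real system would
# estimate this per speaker and per language, not use a constant.
CHARS_PER_SECOND = 15.0


@dataclass
class Segment:
    start: float           # start of the original line, in seconds
    end: float             # end of the original line, in seconds
    candidates: list[str]  # alternative phrasings in the target language


def estimated_duration(text: str, rate: float = CHARS_PER_SECOND) -> float:
    """Rough spoken duration of `text` in seconds, based on length alone."""
    return len(text) / rate


def pick_phrasing(segment: Segment) -> str:
    """Pick the longest candidate that fits the original time slot.

    If no candidate fits, fall back to the shortest one, which
    overruns the slot by the least.
    """
    slot = segment.end - segment.start
    fitting = [c for c in segment.candidates if estimated_duration(c) <= slot]
    if fitting:
        return max(fitting, key=len)
    return min(segment.candidates, key=len)
```

The point of the sketch is only that phrasing choice and timing are decided together. A production system would presumably work from the audio itself, not from character counts.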
The frog animation attributed to “Wonder.” gives a compact example in a children’s-story register. The narration begins in English: “Finn the frog loved to brag about being the biggest, strongest frog in the swamp.” It then continues in Japanese, describing him puffing out his chest, jumping onto a rock, and declaring himself king of the amphibians. The same story moves into French: “Un jour, il tomba sur un ballon brillant flottant dans les roseaux, qui avait été laissé par un enfant.”
That sample presents a different use case from the MrBeast clip. The island video foregrounds an energetic creator voice; the frog sequence foregrounds narration and characterful storytelling, paired with stop-motion-style animation of a small green frog on a lily pad among clay-like animal characters. Both examples support the same product thesis: the dub should sound adapted to the target language without losing the source’s performance shape.

The commercial example turns dubbing into a production workflow claim
A longer ad-style example, attributed on screen to “ramp,” moves the claim from creator content and animation into marketing localization. The visual shows office workers overwhelmed by stacks of paperwork, with the visible line “Give Your Best Smile.” The voiceover begins in English (“The finance department. Wait, did we pay this already?”), then shifts into Polish across expense categories, receipts, reports, invoices, and purchase orders.
The sequence continues through product-interface shots: a corporate card dashboard with blocked categories including “Alcohol and Bars” and “Gambling”; a phone message asking for a receipt and marking the expense complete; and an auto-coding screen with the visible text “Auto-coding Complete.” The audio moves between Polish and German lines, including “Ausgaben reichen sich von selbst ein,” “Schauen Sie sich das mal an,” and the comic interruption, “Okay, aber programmiert das auch automat... Oh.”
The commercial ends on a German version of Ramp’s positioning line while the screen shows two people shaking hands in a modern office: “Wir kümmern uns um Aspekte des Finanzwesens, die Sie hassen.” The visible English text reads, “We’re fixing the parts of finance you hate.”
This example presents automation as a production claim, not only a translation claim. A localized ad has more constraints than a translated script: product screenshots, comedic timing, brand tone, fast cuts, and lines that must land inside fixed scenes. ElevenLabs presents Dubbing v2 as a way to automate that workflow without building a separate dubbing pipeline.

The launch is packaged for creators, marketers, and studios
ElevenLabs says Dubbing v2 is available in two product contexts. ElevenCreative is presented as the place for “one-click dubs.” ElevenProductions is described as a service offering for “professional-grade localization.” ElevenLabs identifies the target users as creators localizing YouTube videos in ElevenCreative, marketers scaling ads across markets, and studios producing broadcast-grade dubs through ElevenProductions.
The company also positions the model against the economics of traditional dubbing. It says professional dubbing can cost hundreds of dollars per minute, while Dubbing v2 brings that quality level to automated workflows “with no pipeline to build.”
The launch offer is specific. For seven days, ElevenLabs says every plan includes free Dubbing v2 usage: 1 minute on Free, 15 minutes on Starter, and 30 minutes on Creator and above. The company also points viewers to three routes: trying Dubbing v2, applying to the Creator Dubbing Partner Program, and exploring ElevenProductions.
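For anyone planning a test clip, the quoted allowances reduce to a simple lookup. The sketch below assumes the free minutes are counted against the duration of the source video, which the announcement does not state. The dictionary keys and the `fits_launch_allowance` helper are illustrative names, not official plan identifiers or part of any ElevenLabs API.

```python
# Launch-offer allowances as quoted in the announcement, in minutes.
# Keys are illustrative labels, not official plan identifiers.
LAUNCH_FREE_MINUTES = {
    "free": 1,
    "starter": 15,
    "creator_and_above": 30,
}


def fits_launch_allowance(plan: str, video_seconds: float) -> bool:
    """Return True if a clip of `video_seconds` fits the plan's free minutes.

    Assumes usage is measured by source duration (an assumption,
    not something the announcement confirms).
    """
    return video_seconds <= LAUNCH_FREE_MINUTES[plan] * 60
```

Under that assumption, a 12-minute YouTube video would fit the Starter allowance but not the Free one.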
Taken together, the launch frames Dubbing v2 as a production system rather than a novelty translation feature. Its differentiating claim is that the source performance remains the conditioning signal, while sync-aware translation handles timing and phrasing that would otherwise create manual editing work.
