ElevenLabs Music v2 Adds Section Editing and Mid-Track Genre Shifts
ElevenLabs’ launch walkthrough for Music v2 presents the model as a more controllable generative music system, not only a higher-quality one. Alec Wilcock says the new version improves vocals, instrumentation, arrangement, multilingual output and dense vocal delivery, while adding section-by-section composition, targeted inpainting and the ability for one song to move between genres without losing coherence. The company also says the model is trained on licensed data and that generated tracks are cleared for commercial use.

Music v2 is pitched as a more controllable song model, not just a better-sounding one
Alec Wilcock presents ElevenLabs Music v2 as the second generation of the company’s generative music model, with control over song structure as one of the central changes. Users can build a song section by section, regenerate part of an existing track without changing the rest, and ask a single composition to move between sharply different genres while preserving continuity.
“You can now build a full song section by section, regenerate any part of a track without touching the rest, and a single song can move from opera to heavy metal and back without falling apart.”
The quality claim is broad: Wilcock says Music v2 is stronger across vocals, instrumentation, and arrangement, and that those gains apply across genres. He specifically calls out fast rap, dense lyrics, and complex vocal performances as areas where v1 “couldn’t really hold together” and v2 is meant to perform better.
ElevenLabs also frames the model around commercial use. Wilcock says Music v2 is trained only on licensed data and that generated tracks are cleared for commercial use, so users can use them without sync fees or clearance delays. That is ElevenLabs’ stated position in the launch walkthrough; the source offers no separate legal analysis.
The launch framing splits Music v2 across three product surfaces shown on screen: ElevenMusic, ElevenCreative, and ElevenAPI.
| Surface | Role shown in the source |
|---|---|
| ElevenMusic | Listen, remix, and create tracks; described as a studio from start to finish. |
| ElevenCreative | Generate licensed music for ads, branded content, and video at scale. |
| ElevenAPI | Shown as able to generate, inpaint, reference-match, and embed music anywhere; Wilcock separately says ElevenAPI is coming soon. |
Among the control claims, the most ambitious is genre movement inside a single track. A prompt shown for ElevenMusic asks for “a beautiful Italian opera song that transitions quickly into a heavy metal version,” illustrating the kind of abrupt mid-track shift the model is meant to handle.
The new controls target structure, edits, sound effects, and language
Music v2’s additions are presented as practical composition controls rather than isolated output improvements. Mid-track genre transition lets a user ask for a single song that changes style partway through, instead of generating separate clips and stitching them together. Non-musical sound effects can likewise be embedded in the track itself rather than layered on afterward.
Section-by-section composition is the central structural control. Music v2 can build an intro, verse, chorus, bridge, and outro as separate parts, each with its own style, lyrics, and duration, while maintaining a continuous song. In the editor shown later, that structure appears as editable sections with global styles, section descriptions, and lyrics visible alongside the track.
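To make the section model concrete, here is a minimal Python sketch of how such a plan could be represented. The `Section` and `SongPlan` names, their fields, and every duration below are hypothetical illustrations, not ElevenLabs’ schema or API; the source only says that each section carries its own style, lyrics, and duration within one continuous song.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """One part of a song. Hypothetical shape, not ElevenLabs' schema."""
    name: str
    styles: list[str]
    duration_s: float
    lyrics: str = ""  # empty string stands for an instrumental section


@dataclass
class SongPlan:
    """Global styles plus an ordered list of sections (illustrative only)."""
    global_styles: list[str]
    sections: list[Section] = field(default_factory=list)

    def total_duration_s(self) -> float:
        return sum(s.duration_s for s in self.sections)

    def timeline(self) -> list[tuple[str, float, float]]:
        """Return (name, start, end) per section, laid end to end."""
        spans, start = [], 0.0
        for s in self.sections:
            spans.append((s.name, start, start + s.duration_s))
            start += s.duration_s
        return spans


# Durations and lyrics are placeholders invented for this example.
plan = SongPlan(
    global_styles=["punk rock", "rap", "energetic"],
    sections=[
        Section("intro", ["distorted guitar"], 10.0),
        Section("verse", ["punk rock"], 30.0, lyrics="(placeholder lyrics)"),
        Section("chorus", ["rap"], 20.0, lyrics="(placeholder lyrics)"),
    ],
)
```

Laying sections end to end, as `timeline()` does, is one simple way to express "separate parts, one continuous song"; how the model itself handles continuity across section boundaries is not described in the source.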
Improved inpainting is tied to that structure. Users can select a section of an existing track and regenerate only that portion. In the demonstrated editor, that means asking for a musical change in natural language, adjusting style tags, and regenerating the chosen material without starting from a blank prompt.
Multilingual generation is the other major quality claim. Wilcock says lyrics, vocals, and arrangements now perform reliably in the language the user writes in, with improvements across English, Spanish, German, Japanese, and a growing list of other languages.
Before generation, the controls are prompt, model, lyrics, duration, and video context
In ElevenCreative, Music v2 is reached from the main dashboard and opens into a music-generation interface. The basic act of generation remains prompt-driven: describe the desired track, choose settings, and generate. The visible controls include model selection, number of generations, duration, lyric mode, and a finetune field.
The model toggle is the key pre-generation change. Users can choose Music v1 or Music v2. Wilcock says Music v2 has the same features as v1 except that it currently does not support finetunes, though finetune support is “coming very soon.” The interface still shows the finetune control, but the source does not present finetunes as currently available for Music v2.
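The constraint described here can be written down as a small settings check. This is a sketch only: `GenerationSettings`, `validate`, the model identifiers `"music_v1"` and `"music_v2"`, and the meaning of `None` for duration are all assumptions made for illustration, since the source shows interface controls rather than a programmatic interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerationSettings:
    """Hypothetical mirror of the visible controls, not an ElevenLabs type."""
    prompt: str
    model: str = "music_v2"             # assumed identifiers: "music_v1" / "music_v2"
    num_generations: int = 1
    duration_s: Optional[float] = None  # assumption: None means "no fixed length"
    lyrics_enabled: bool = True
    finetune_id: Optional[str] = None


def validate(settings: GenerationSettings) -> list[str]:
    """Collect human-readable problems instead of raising on the first one."""
    problems = []
    if settings.model not in ("music_v1", "music_v2"):
        problems.append(f"unknown model: {settings.model}")
    # Per the walkthrough, v2 does not support finetunes at launch.
    if settings.model == "music_v2" and settings.finetune_id is not None:
        problems.append("finetunes are not yet supported on music_v2")
    if settings.num_generations < 1:
        problems.append("num_generations must be at least 1")
    return problems
```

Under these assumptions, a v2 request that names a finetune is flagged, while the same request against v1 passes.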
The interface also branches beyond ordinary song prompting. A user can upload a video and have ElevenLabs use Music v2 to generate fitting background music for it. For a standard music generation, Wilcock’s example prompt asks for “a punk rock song about public transport with a transition halfway through to rap.”
After generation, the output is treated as a project rather than just an audio file. The generated tracks appear with prompt history, model information, and options to reuse the prompt, find alternatives in the ElevenMusic library, or open the result for editing. Hovering over the waveform reveals song structure, which Wilcock uses to find the transition point in the generated punk-to-rap track. He describes the beat switch as continuous.
After generation, inpainting can redirect a section without erasing the whole track
The ElevenMusic editor turns the generated song into editable structure. In the “Transit Rage” example, the visible global styles include punk rock, rap, hip hop, energetic, up-tempo, and male vocals. Section-level material is also visible: an intro with distorted guitar and shouted energy, followed by a first punk verse with lyrics about being stuck in public transport.
The inpainting example is a targeted revision to the first 40 seconds: “Let’s rework the whole first 40 seconds of the music so that, instead of punk rock, the music is upbeat techno that transitions into rap.” Wilcock also changes the global styles manually, removing punk rock and adding techno, then regenerates.
His assessment of the result is qualified. The revised section has techno elements and a continuous transition “from techno rock to rap,” but he notes that if he wanted it to sound more like techno, he should have started with a techno prompt. The edited version is still “keeping the rock roots and trying to adapt it to techno.” In this demonstration, inpainting can redirect a section, but the original generation’s musical context still matters.
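The edit in this demonstration combines two pieces of bookkeeping: a time span to regenerate and a change to the global style tags. A rough Python sketch of that bookkeeping follows. The helper names are hypothetical and the section timings are invented, because the source does not give the section boundaries of “Transit Rage”; nothing here describes how ElevenLabs implements inpainting.

```python
def sections_in_span(timeline, start_s, end_s):
    """Names of sections that overlap the half-open span [start_s, end_s)."""
    return [name for name, s, e in timeline if s < end_s and e > start_s]


def retag(styles, remove=(), add=()):
    """Return a new style list with some tags removed and others appended."""
    kept = [t for t in styles if t not in remove]
    return kept + [t for t in add if t not in kept]


# Section timings are invented for illustration; only the tags come from the demo.
timeline = [("intro", 0.0, 10.0), ("verse 1", 10.0, 40.0), ("rap verse", 40.0, 70.0)]
global_styles = ["punk rock", "rap", "hip hop", "energetic", "up-tempo", "male vocals"]

affected = sections_in_span(timeline, 0.0, 40.0)  # the "first 40 seconds"
new_styles = retag(global_styles, remove=["punk rock"], add=["techno"])
```

With these made-up timings, the span covers the intro and first verse and leaves the rap verse untouched, which matches the idea of regenerating a selection while keeping the rest of the track.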
The same editor supports creating a new track, customizing lyrics, turning lyrics off for an instrumental, changing duration, and generating songs up to 10 minutes long.
That places Music v2 between a one-shot prompt generator and a structured composition tool: the user can start from a broad description, then refine sections, style tags, lyrics, duration, and selected spans after the first output exists.
Publishing makes generated tracks available for reuse
A finished track can be published to the ElevenMusic marketplace. The publish modal shown for “Transit Rage” includes a generated cover option, title, description, and genre labels such as punk rock, rap, hip hop, and alternative. The description summarizes the track as an energetic punk rock anthem about frustrations with public transport that switches into a laid-back ’90s rap verse with a lo-fi hip hop beat.
Wilcock says users who are happy with a track can publish it to the ElevenMusic library. If other people download and reuse those tracks in their projects, he says the creator can generate “a little bit of income.” The marketplace mechanics are not detailed further; publishing is presented as the path from generation and editing into shared reuse.


