# MiniMax Music 2.5 Structure Tags Explained: All 14 Tags With Examples

What's up, fellow music builders — if you've ever generated a track in MiniMax and wondered why the chorus hit harder than the verse but you weren't sure you could repeat it, this is the piece I wish someone had written for me three weeks ago.
I'm Hanks. I test AI tools in real production environments. Not demos — actual deliverable tasks. I spent the last few weeks running MiniMax Music 2.5 through 30+ generations specifically focused on tag behavior: what each one does, how it interacts with adjacent tags, and where things break down.
The question I kept testing: which tags do what, exactly, and how do you combine them to get consistent, repeatable outputs?
Here's what I found.
## What Structure Tags Do and Why They Matter

Before Music 2.5, you were essentially handing the model a prompt and hoping the arrangement made sense. The structural logic was baked in and opaque. You could describe "a verse that builds into an explosive chorus" — but whether the model interpreted that correctly was a coin flip.
Music 2.5 changes this with full-section tag control across 14 structure tags. You can act like a professional arranger, designing the emotional curve, climax, and instrumentation of the entire song from the start, rather than generating a track and rolling the dice.
Tags work by wrapping lyrics (or empty blocks) and signaling to the model what role that section plays in the song's architecture. The model then applies genre-appropriate instrumentation, dynamic behavior, and vocal treatment to match that role.
### Tags define musical sections
Think of each tag as a brief to the model's internal arranger. [Intro] says "establish the sonic world before vocals enter." [Build Up] says "increase tension and density — a drop is coming." [Break] says "pull back, let the arrangement breathe."
Without tags, you're writing lyrics and hoping the model assigns them correctly. With tags, you're directing.

### Place the tag at the start of each lyrics block
The tag goes immediately before its section of lyrics, on its own line, with no blank line between the tag and the first line of that section's lyrics.
Format that works:
[Verse]
Your lyrics here
Second line here
[Chorus]
Hook line one
Hook line two
Lowercase single-word tags such as [verse] also worked in my testing, but two-word tags were less forgiving about spelling and case, so I stick to the capitalized forms used in the official API documentation examples.
You can add inline parenthetical cues within any tagged section to trigger specific micro-behaviors:
[Verse]
(whispered, intimate)
Your lyrics here
[Inst] (guitar solo, bluesy, building)
These parentheticals don't always work reliably for highly specific instrument directions, but they meaningfully influence dynamic character and vocal approach. Worth using.
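If you assemble lyrics programmatically, a small helper can keep that layout consistent from one generation to the next. The sketch below is an illustration only, not part of any MiniMax SDK: `Section` and `build_lyrics` are hypothetical names, and putting the parenthetical cue on its own line under the tag is an assumption that mirrors the `[Verse]` example above.

```python
from typing import NamedTuple, Optional


class Section(NamedTuple):
    """One tagged block: a structure tag, lyric lines, and an optional cue."""
    tag: str                     # tag name without brackets, e.g. "Verse"
    lines: tuple = ()            # lyric lines; leave empty for instrumental blocks
    cue: Optional[str] = None    # parenthetical cue text, without parentheses


def build_lyrics(sections):
    """Render sections as tag-per-line text for the lyrics field."""
    out = []
    for section in sections:
        out.append(f"[{section.tag}]")        # tag on its own line
        if section.cue:
            out.append(f"({section.cue})")    # cue directly under the tag (assumed layout)
        out.extend(section.lines)             # lyrics follow with no blank line
    return "\n".join(out)


song = [
    Section("Verse", ("Your lyrics here", "Second line here"), cue="whispered, intimate"),
    Section("Chorus", ("Hook line one", "Hook line two")),
]
print(build_lyrics(song))
```

Keeping sections as data rather than hand-edited text makes it easier to swap one cue or reorder one block between runs and know exactly what changed.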
***
## All 14 Tags Explained
The complete set of supported structure tags in Music 2.5 is: \[Intro], \[Verse], \[Pre Chorus], \[Chorus], \[Interlude], \[Bridge], \[Outro], \[Post Chorus], \[Transition], \[Break], \[Hook], \[Build Up], \[Inst], \[Solo].
Here's what each one actually does in practice, based on testing.
### Core Structure Tags
These are the architectural backbone — the tags you'll use in almost every song.
| Tag | What It Does | When to Use It |
| -------------- | ------------------------------------------------- | --------------------------------------------------------------------------- |
| \[Intro] | Establishes sonic identity, no or minimal vocals | Opening of any track; sets genre and atmosphere before the narrative starts |
| \[Verse] | Narrative content, lower energy, storytelling | First and second verse; builds toward pre-chorus or chorus |
| \[Pre Chorus] | Builds emotional tension between verse and chorus | Connecting verse to chorus when you want a ramp-up moment |
| \[Chorus] | Peak energy, main hook, melodic anchor | Primary emotional payoff; repeat 2–3 times across the track |
| \[Post Chorus] | Momentum cooldown immediately after the chorus | When you want the chorus energy to bleed into the next section gracefully |
| \[Bridge] | Lyrical/melodic pivot, contrasting section | Single use, typically 2/3 into the song; breaks the verse-chorus loop |
| \[Outro] | Closing resolution, fade or hard stop | End of track; signals the model to wind down arrangement and energy |
**On `[Pre Chorus]`:** this is the most underused tag I see in generated prompts. It creates a reliable tension ramp — the model increases rhythmic density and harmonic tension — and makes the chorus land harder. If you're getting a chorus that feels flat, try adding a pre-chorus before it.
**On `[Bridge]`:** use it once. Multiple bridge tags in a row confuse the model into generating repetitive "contrast" sections that cancel each other out. One bridge, placed after your second chorus, is almost always the right call.
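Those two habits are easy to check before you spend a generation. Here is a rough pre-flight sketch, assuming you already have the tag names as a plain list in song order; `review_core_structure` is a hypothetical helper, and the rules it encodes are just the rules of thumb from this section, not anything the model enforces.

```python
def review_core_structure(tags):
    """Return advisory notes for a list of tag names in song order."""
    notes = []

    # Rule of thumb: one bridge, placed after the second chorus.
    bridges = [i for i, tag in enumerate(tags) if tag == "Bridge"]
    if len(bridges) > 1:
        notes.append(f"{len(bridges)} [Bridge] tags found; one is usually enough")
    if bridges and tags[:bridges[0]].count("Chorus") < 2:
        notes.append("[Bridge] appears before the second [Chorus]")

    # Rule of thumb: a pre-chorus helps the chorus land.
    for i, tag in enumerate(tags):
        if tag == "Chorus" and i > 0 and tags[i - 1] == "Verse":
            notes.append(f"[Chorus] at position {i} follows [Verse] directly; consider [Pre Chorus]")
    return notes


pop = ["Intro", "Verse", "Chorus", "Verse", "Pre Chorus", "Chorus",
       "Bridge", "Bridge", "Chorus", "Outro"]
for note in review_core_structure(pop):
    print(note)
```

For the sample list, this flags the doubled bridge and the first chorus, which follows a verse with no pre-chorus. Treat the notes as prompts to reconsider, not hard errors.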
### Highlight & Instrumental Tags
These give you control over specific moments — transitions, instrumental passages, and structural punctuation.
| Tag | What It Does | When to Use It |
| ------------- | ---------------------------------------------------------------- | ------------------------------------------------------------------------------- |
| \[Interlude] | Extended break between major sections, can be melodic or ambient | Between choruses; when you want a longer breathing space than \[Break] provides |
| \[Transition] | Short connective tissue between any two sections | When sections feel like they're jumping too abruptly; 2–4 bar connector |
| \[Break] | Full or partial instrument dropout, strip-back moment | Classic hip-hop/EDM device; just before a final chorus drop |
| \[Hook] | Short repeated melodic or lyrical phrase, usually 2–4 bars | When you want a standalone earworm that isn't a full chorus |
| \[Build Up] | Increasing intensity, rising to a climax or drop | Just before \[Chorus] or \[Break] in EDM and cinematic contexts |
| \[Inst] | Pure instrumental passage, no vocals | Guitar solos, piano interludes, full arrangement showcase |
| \[Solo] | Featured single instrument moment | When you want a specific instrument highlighted, not just background |
**On `[Hook]` vs `[Chorus]`:** the practical difference is scale and weight. A hook is short, punchy, 2–4 bars. A chorus is a full section with melodic resolution and structural finality. For jingles and brand music, `[Hook]` often does what you want better than `[Chorus]` because it's tight and doesn't try to be a complete musical statement.
**On `[Inst]` vs `[Solo]`:** I use `[Inst]` for multi-instrument passages (the band plays without vocals) and `[Solo]` when I want one instrument to carry the moment. In testing, `[Solo]` paired with a parenthetical — `(piano solo)` or `(saxophone solo)` — produces noticeably more focused single-instrument behavior than `[Inst]` alone.
**On `[Transition]`:** this is the tag most people skip and then wonder why their sections feel choppy. A two-bar `[Transition]` between `[Verse]` and `[Chorus]` creates a natural gear change. Think of it as musical punctuation.
***
## 3 Ready-to-Use Tag Layouts

These are copy-paste structures I've run through multiple generations and verified produce consistent, logical song shapes.
### Standard Pop
Best for: pop ballads, singer-songwriter tracks, commercial pop, emotional storytelling.
[Intro]
(instrumental, soft, 4–8 bars)
[Verse]
Your first verse lyrics here
2–4 lines, conversational energy
[Pre Chorus]
Tension build, 2 lines
Emotional ramp toward the hook
[Chorus]
Main hook: your strongest lyrical moment
2–4 lines, melodic peak
[Verse]
Second verse lyrics
Build on the story from verse one
[Pre Chorus]
(same or slightly varied from first pre-chorus)
[Chorus]
(repeat, add harmony layer)
[Bridge]
Lyrical or emotional pivot, 4 lines
Contrast with the verse-chorus pattern
[Chorus]
(final chorus, full arrangement)
[Outro]
(fade or resolution, 4–8 bars)
**What I've noticed:** without `[Pre Chorus]`, the verse-to-chorus transition often feels like a jump cut. The pre-chorus is doing real structural work here, even if its lyrics are simple.
***
### Hip-Hop
Best for: trap, boom bap, melodic rap, R\&B-adjacent hip-hop.
[Intro]
(instrumental, 4–8 bars, beat-only)
[Verse]
16 bars of rap verses
Keep stanzas tight: 4-line groups work well
[Hook]
4–8 bar hook
Catchy, rhythmic, memorable phrase
[Verse]
Second verse, 16 bars
Development or response to verse one
[Hook]
(repeat)
[Break]
(minimal beat, stripped back, 4 bars)
[Verse]
Optional third verse or extended second verse
[Hook]
(final hook, with ad-lib layer if desired)
[Outro]
(beat fade or hard stop)
**What I've noticed:** `[Break]` before the final hook is the single highest-impact structural move in hip-hop generation. The contrast between a stripped `[Break]` and a full `[Hook]` creates the "drop" feeling that's central to the genre. Without it, the hook repetitions start blending together.
***
### Cinematic Instrumental
Best for: film scoring, game music, trailers, ambient/emotional underscore. No lyrics required — tags and parentheticals only.
[Intro]
(strings only, quiet, sparse, establishing)
[Verse]
(themes introduced, low tension, exploratory)
[Build Up]
(brass enters, rhythmic intensification, rising)
[Chorus]
(full orchestra, emotional peak, triumphant)
[Interlude]
(solo piano, emotional reset, intimate)
[Build Up]
(second crescendo, choir enters, climactic)
[Chorus]
(final peak, full ensemble, sustained)
[Outro]
(slow resolution, strings fade, quiet close)
**Key note for instrumentals:** in Music 2.5 the `lyrics` field is required (1–3500 characters), while `prompt` is optional. That means you can run an instrumental workflow entirely from the `lyrics` field: fill it with structure tags, parenthetical cues, and `[Inst]` sections, with no actual lyric text, and skip the separate prompt when you don't need one.
The model treats tagged sections that contain no lyric text as instrumental by default. Keep at least a parenthetical cue under each tag, because a bare tag with nothing under it is sometimes skipped or cut very short.
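As a rough sketch of that workflow, the helper below builds a request body whose `lyrics` value contains only tags and cues, and checks it against the 1–3500 character limit quoted above. `build_instrumental_request` is a hypothetical name, and the `"model"` key with the value `"music-2.5"` is an assumption; confirm the exact field names and model identifier in the current API reference before sending anything.

```python
import json

MAX_LYRICS_CHARS = 3500  # upper bound quoted above for the lyrics field


def build_instrumental_request(sections, prompt=None, model="music-2.5"):
    """Build a request body whose lyrics field holds only tags and cues.

    `sections` is a list of (tag, cue) pairs. The "model" key and its
    default value are assumptions, not confirmed API details.
    """
    lines = []
    for tag, cue in sections:
        if not cue:
            # A bare tag risks being skipped, so require a cue for each one.
            raise ValueError(f"[{tag}] needs a parenthetical cue")
        lines.append(f"[{tag}]")
        lines.append(f"({cue})")
    lyrics = "\n".join(lines)
    if not 1 <= len(lyrics) <= MAX_LYRICS_CHARS:
        raise ValueError(f"lyrics must be 1-{MAX_LYRICS_CHARS} characters, got {len(lyrics)}")
    body = {"model": model, "lyrics": lyrics}
    if prompt:  # prompt is optional, so include it only when provided
        body["prompt"] = prompt
    return body


score = [
    ("Intro", "strings only, quiet, sparse, establishing"),
    ("Build Up", "brass enters, rhythmic intensification, rising"),
    ("Chorus", "full orchestra, emotional peak, triumphant"),
    ("Outro", "slow resolution, strings fade, quiet close"),
]
print(json.dumps(build_instrumental_request(score), indent=2))
```

The function stops at building the body on purpose: authentication, the endpoint URL, and audio settings vary by account and API version, so they are left out instead of guessed.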
***
## Troubleshooting Tag Issues

These are the three failure patterns I hit most often, and the fixes that worked.
**Check spelling — including capitalization.** `[Prechorus]` and `[pre chorus]` don't reliably trigger the same behavior as `[Pre Chorus]`. The official API documentation uses exact capitalization: `[Pre Chorus]`, `[Post Chorus]`, `[Build Up]`. Stick to that format exactly. I've had generations where a single missing space in `[Build Up]` caused the model to generate a flat dynamic section with no crescendo behavior.
**Avoid empty sections with no content or cue.** If you place a tag but give it nothing — no lyrics, no parenthetical, no indication of what should happen — the model sometimes skips the section entirely or generates a very short, minimal passage. If you want an instrumental `[Intro]`, add at minimum a parenthetical: `(instrumental, 8 bars, establishing)`. Give the model something to work from.
**Don't stack conflicting energy levels back-to-back.** A sequence like `[Chorus] → [Build Up] → [Chorus]` creates an energy logic problem — you're asking for a peak, then a build *toward* a peak, then another peak. The model will try to honor this and often produces a muddy middle section. Follow music's natural energy logic: build before peak, cooldown after peak. `[Pre Chorus] → [Chorus] → [Post Chorus]` is the clean version of that flow.
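All three checks can be automated before a draft goes to the model. The linter below is a minimal sketch under a few assumptions: `lint_lyrics` is a hypothetical helper, it matches tags strictly against the 14 documented spellings (so it will also flag lowercase variants), and it only looks for the one conflicting sequence named above.

```python
import re

SUPPORTED_TAGS = [
    "Intro", "Verse", "Pre Chorus", "Chorus", "Interlude", "Bridge", "Outro",
    "Post Chorus", "Transition", "Break", "Hook", "Build Up", "Inst", "Solo",
]
# Map a squashed key ("prechorus") back to the documented spelling.
_CANONICAL = {re.sub(r"[^a-z]", "", t.lower()): t for t in SUPPORTED_TAGS}
_TAG_LINE = re.compile(r"^\[([^\]]+)\]\s*(.*)$")


def lint_lyrics(text):
    """Return warnings for misspelled tags, empty sections, and stacked peaks."""
    warnings = []
    sections = []  # [tag, has_content] pairs in order of appearance
    for line in text.splitlines():
        line = line.strip()
        match = _TAG_LINE.match(line)
        if match:
            tag, rest = match.group(1), match.group(2)
            if tag not in SUPPORTED_TAGS:
                guess = _CANONICAL.get(re.sub(r"[^a-z]", "", tag.lower()))
                hint = f"; did you mean [{guess}]?" if guess else ""
                warnings.append(f"unrecognized tag [{tag}]{hint}")
                tag = guess or tag
            sections.append([tag, bool(rest)])  # a same-line cue counts as content
        elif line and sections:
            sections[-1][1] = True  # lyric or parenthetical under the current tag

    for tag, has_content in sections:
        if not has_content:
            warnings.append(f"[{tag}] has no lyrics or parenthetical cue")

    order = [tag for tag, _ in sections]
    for i in range(len(order) - 2):
        if order[i:i + 3] == ["Chorus", "Build Up", "Chorus"]:
            warnings.append("[Chorus] -> [Build Up] -> [Chorus] stacks a build between two peaks")
    return warnings


draft = """[Intro]
[Verse]
Your lyrics here
[Prechorus]
Tension line
[Chorus]
Hook line one"""
for warning in lint_lyrics(draft):
    print(warning)
```

On the sample draft it reports the `[Prechorus]` spelling and the empty `[Intro]`. It says nothing about whether the music will sound good; it only catches the mechanical mistakes that waste a generation.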
***
At Macaron, we see the same pattern play out constantly: the friction isn't in creating — it's in keeping track of what's working. Which tag combination produced that good chorus? What parenthetical did you use in that `[Solo]` that made it land? When you're iterating across 10 or 20 generations, the context for each decision gets lost fast.
We built Macaron to hold that context — so your iteration loop has a memory, and your decisions don't have to live in a sticky note. If you're doing serious music prompt work and want a structured space to track your experiments without switching between five apps, try it free at [macaron.im](https://macaron.im/). Run a real workflow, see if it fits your process, and judge the output yourself.










