# HeartMuLa Music Generation Skill
## Purpose
Generate songs with lyrics using HeartMuLa, a music generation model that produces vocals and instrumentation together. Suited to creating songs with custom lyrics in a range of styles and genres.
## Target Workflow
- Slug: [music-heartmula] (HeartMuLa Music Generator)
- Type: txt2audio
- Required Fields: prompt_positive (lyrics)
- Optional Fields: slot1 (style tags), guidance (cfg_scale), length (max audio length), seed, strength (topk)
## When to Use This Skill
- Creating original songs with custom lyrics
- Generating music in specific genres or styles
- Producing backing tracks for content creation
- Making parody songs or humorous tracks
- Creating multilingual music (supports English, Japanese, and more)
- Experimenting with different musical styles and moods
## Prompt Structure
### For Music Generation (txt2audio)
The user provides lyrics and optionally a music style, and wants a song generated.
**Command format:**
```
/wf /run:music-heartmula /guidance:1.5 /length:60 /strength:50 /slot1:"style,tags,here" [lyrics or song description]
```
**Prompt template:**
```
[Verse/Chorus structure] [Lyrics content]. [Optional: Emotional tone or narrative context].
```
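The command format above can be assembled programmatically. The following is a minimal sketch, not part of the workflow itself: `build_command` is a hypothetical helper, and it assumes the command must stay on a single line, so line breaks in the lyrics are collapsed to spaces.

```python
def build_command(lyrics, style_tags=None, guidance=1.5, length=60, strength=50):
    """Assemble a one-line /wf command in the format shown above."""
    # Lyrics are the only required field; collapse line breaks so the
    # command stays on a single line (an assumption of this sketch).
    text = " ".join(lyrics.split())
    if not text:
        raise ValueError("lyrics (prompt_positive) are required")
    parts = [
        "/wf",
        "/run:music-heartmula",
        f"/guidance:{guidance}",
        f"/length:{length}",
        f"/strength:{strength}",
    ]
    if style_tags:
        # slot1 takes comma-separated tags inside double quotes.
        parts.append('/slot1:"' + ",".join(style_tags) + '"')
    parts.append(text)
    return " ".join(parts)


print(build_command("[Verse] My dog loves pizza more than bones.", ["guitar", "indie", "pop"]))
```

Leaving `style_tags` empty omits `/slot1` entirely, so the workflow falls back to its own default style.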
## HeartMuLa Prompting Guidelines
### Key Components
1. **Lyrics (Positive Prompt)**
   - The main content of your song
   - Can include verse/chorus markers like [Verse], [Chorus], [Bridge]
   - Supports multiple languages (English, Japanese, etc.)
   - Be creative with rhymes, rhythm, and storytelling
2. **Style Tags (Slot1)**
   - Comma-separated keywords describing the music style
   - Examples: "guitar,indie,pop", "Hip Hop", "90s indie pop", "electronic,dance"
   - Tag documentation is limited, so experiment with genre descriptors
   - Can combine multiple elements: instruments, eras, moods, genres
### Prompt Format Best Practices
- Write lyrics as flowing text, not bullet points
- Use markers like [Chorus], [Verse] for song structure
- Include emotional or narrative context if helpful
- Keep lyrics concise for better audio quality (shorter songs generally sound better)
- For multilingual songs, mix languages naturally within the lyrics
### Key Parameters
| Parameter | Default | Range | Usage |
|-----------|---------|-------|-------|
| guidance (cfg_scale) | 1.5 | 0.1-10 | Controls how strictly the model follows the lyrics. Higher = more literal, lower = more creative |
| length | 60 | varies | Maximum audio length in seconds. Longer songs need more VRAM and time |
| strength (topk) | 50 | 1-500 | Top-k sampling. Lower values = more focused/deterministic, higher = more diverse |
| seed | random | integer | For reproducibility. Same seed + prompt = similar output |
| slot1 | "90s indie pop" | text | Style tags/keywords for the music genre (comma-separated) |
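The defaults and ranges in the table can be checked before a command is sent. This is a sketch under stated assumptions: `HeartmulaParams` is a hypothetical container, not an API of the workflow, and because the table lists the length range only as "varies", the sketch checks just that length is positive.

```python
from dataclasses import dataclass


@dataclass
class HeartmulaParams:
    # Defaults copied from the parameter table above.
    guidance: float = 1.5
    length: int = 60
    strength: int = 50
    slot1: str = "90s indie pop"

    def validate(self):
        """Return a list of human-readable problems (empty if none)."""
        problems = []
        if not 0.1 <= self.guidance <= 10:
            problems.append(f"guidance {self.guidance} is outside 0.1-10")
        if not 1 <= self.strength <= 500:
            problems.append(f"strength {self.strength} is outside 1-500")
        # The table gives no fixed range for length, so only reject
        # non-positive values (an assumption of this sketch).
        if self.length <= 0:
            problems.append(f"length {self.length} must be positive")
        return problems


print(HeartmulaParams().validate())
print(HeartmulaParams(guidance=12, strength=0).validate())
```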
## Examples
### Example 1: Simple Indie Pop Song
**User:** "a song about a dog eating pizza with guitar"
→ Command to generate:
```
/wf /run:music-heartmula /guidance:1.5 /length:60 /slot1:"guitar,indie,pop" My dog loves pizza more than bones, He steals a slice when I'm not home. Cheesy goodness in his mouth, Running round the house heading south. Best friend with the worst manners, Eating pizza in his pajamas!
```
### Example 2: Children's Song
**User:** "twinkle twinkle little star indie style"
→ Command to generate:
```
/wf /run:music-heartmula /guidance:1.5 /length:60 /slot1:"indie,pop,acoustic" Twinkle, twinkle, little star, How I wonder what you are! Up above the world so high, Like a diamond in the sky. Twinkle, twinkle, little star, How I wonder what you are!
```
### Example 3: Hip Hop with Japanese Lyrics
**User:** "a hip hop song with Japanese and English mixed lyrics about friends"
→ Command to generate:
```
/wf /run:music-heartmula /guidance:1.5 /length:60 /slot1:"Hip Hop,rap" [Chorus] Anne-chan had the flu. But now she is genki genki genki! Japanese takoyaki beer kakkoii. Watermelon orange daisuki! Anne-chan, Rina-chan, so kawaii! Urusai, konna mo!
```
### Example 4: Emotional Ballad
**User:** "a sad acoustic song about missing someone"
→ Command to generate:
```
/wf /run:music-heartmula /guidance:1.8 /length:45 /slot1:"acoustic,ballad,sad,piano" The room is empty without you here, Echoes of laughter disappear. I reach for you in the middle of the night, But you're gone like morning light. Missing you is all I do, Counting days till I see you.
```
### Example 5: Upbeat Electronic Dance
**User:** "an energetic EDM track about partying all night"
→ Command to generate:
```
/wf /run:music-heartmula /guidance:1.2 /length:60 /slot1:"electronic,EDM,dance,upbeat" [Drop] Turn it up, turn it up, don't stop now! Feel the beat, feel the heat, take a bow! All night long we dance together, Party people we're birds of a feather! Bass is pumping, lights are flashing, Hearts are racing, night is crashing!
```
## Style Categories and Tags
### Genres
- pop, indie pop, rock, indie rock
- hip hop, rap, trap
- electronic, EDM, house, techno
- acoustic, folk, singer-songwriter
- jazz, blues, soul, R&B
- country, bluegrass
- classical, orchestral, cinematic
### Instruments
- guitar, acoustic guitar, electric guitar
- piano, synth, keyboard
- drums, percussion
- bass, upright bass
- strings, violin, cello
- brass, horns
### Moods/Atmosphere
- upbeat, energetic, happy
- sad, melancholic, emotional
- chill, relaxed, lo-fi
- epic, cinematic, dramatic
- dark, moody, atmospheric
- romantic, love, tender
### Era/Production Style
- 80s, 90s, 2000s, retro, vintage
- modern, contemporary, futuristic
- lo-fi, bedroom pop
- polished, produced, studio quality
- raw, demo, live
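One way to combine picks from the categories above into a single slot1 value is sketched below. `style_tags` is a hypothetical helper; removing duplicates case-insensitively is an assumption of this sketch, not documented behavior of the model.

```python
def style_tags(*groups):
    """Join tags from several categories into one comma-separated string."""
    chosen = []
    seen = set()
    for group in groups:
        for tag in group:
            tag = tag.strip()
            key = tag.lower()
            # Skip empty entries and case-insensitive duplicates.
            if tag and key not in seen:
                seen.add(key)
                chosen.append(tag)
    return ",".join(chosen)


genres = ["indie pop"]
instruments = ["acoustic guitar", "piano"]
moods = ["chill"]
print(style_tags(genres, instruments, moods))
```

The result can be placed inside the double quotes of `/slot1:"..."`.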
## Song Structure Tips
### Common Structures
- **Verse-Chorus**: Simple and effective for pop songs
- **Verse-Chorus-Bridge-Chorus**: Classic full structure
- **AABA**: Jazz standard format
- **Through-composed**: Continuous narrative without repeating sections
### Section Markers
Use brackets to indicate song structure:
- `[Verse]` / `[Verse 1]`, `[Verse 2]`
- `[Chorus]` / `[Pre-Chorus]`
- `[Bridge]`
- `[Intro]` / `[Outro]`
- `[Drop]` (for EDM)
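A quick check can flag bracketed markers that are not in the list above. This is a sketch only: `unknown_markers` is a hypothetical helper, and the list above may not be exhaustive, so an "unknown" marker is a prompt to double-check rather than an error.

```python
import re

# Marker names from the list above, lowercased for comparison.
KNOWN_MARKERS = {"verse", "chorus", "pre-chorus", "bridge", "intro", "outro", "drop"}


def unknown_markers(lyrics):
    """Return bracketed markers that are not in KNOWN_MARKERS."""
    unknown = []
    for marker in re.findall(r"\[([^\[\]]+)\]", lyrics):
        # Treat numbered variants such as "Verse 2" as their base name.
        base = re.sub(r"\s+\d+$", "", marker.strip()).lower()
        if base not in KNOWN_MARKERS:
            unknown.append(marker)
    return unknown


print(unknown_markers("[Verse 1] Morning light [Chorus] Sing along [Solo] ..."))
```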
## Language Support
HeartMuLa supports multiple languages:
- **English**: Primary language, works best
- **Japanese**: Fully supported, can mix with English
- **Other languages**: Experiment to see what works
### Multilingual Tips
- Mix languages naturally within lyrics
- Japanese phrases work well in English songs
- Keep phrases short for clarity
## Tips for Best Results
1. **Lyric Length**
   - Shorter songs (30-45 seconds) generally have better quality
   - Longer songs may have repetition or degradation
   - Match the length setting to the amount of lyrics
2. **Guidance Scale (cfg_scale)**
   - 1.0-1.5: More creative, may deviate from lyrics
   - 1.5-2.0: Balanced creativity and accuracy
   - 2.0+: Strict adherence to lyrics, less musical flow
3. **Style Tag Combinations**
   - Combine 2-4 related tags for best results
   - "guitar,indie,pop" works better than just "guitar"
   - Don't be afraid to experiment - the model interprets tags creatively
4. **Top-k Sampling (strength)**
   - Lower (10-30): More predictable, less variety
   - 30-50: Balanced variety and coherence
   - Higher (100-200): More experimental outputs
5. **Seed for Iteration**
   - Use the same seed to refine a specific output
   - Change seed for completely new variations
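The guidance and top-k bands from the tips above can be expressed as a small lookup, which is handy when reviewing a batch of settings. This is a sketch with hypothetical function names; how the boundary values (1.5, 2.0, 30) are assigned is an assumption of the sketch, and values the tips do not mention are reported as such rather than guessed.

```python
def guidance_band(guidance):
    """Describe a guidance value using the bands from the tips."""
    if guidance < 1.5:
        return "more creative, may deviate from lyrics"
    if guidance <= 2.0:
        return "balanced creativity and accuracy"
    return "strict adherence to lyrics, less musical flow"


def topk_band(strength):
    """Describe a top-k (strength) value using the bands from the tips."""
    if 10 <= strength < 30:
        return "more predictable, less variety"
    if 30 <= strength <= 50:
        return "balanced variety and coherence"
    if 100 <= strength <= 200:
        return "more experimental outputs"
    # The tips say nothing about other values.
    return "not covered by the tips"


print(guidance_band(1.5), "|", topk_band(50))
```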
## Common Use Cases
| Use Case | Length | Guidance | Style Tags | Key Elements |
|----------|--------|----------|------------|--------------|
| Social media background music | 30-45s | 1.5 | upbeat,pop | Catchy, short hooks |
| Children's songs | 45-60s | 1.8 | acoustic,cheerful | Simple lyrics, clear vocals |
| Gaming content | 30-60s | 1.2 | electronic,energetic | Driving beat, sparse lyrics |
| Emotional storytelling | 45-60s | 1.8 | piano,ballad | Narrative lyrics, mood |
| Parody/comedy songs | 30-45s | 1.5 | varied | Humorous lyrics, recognizable style |
| Multilingual content | 45-60s | 1.5 | varied | Mixed language lyrics |
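The table can double as a set of presets. The sketch below is illustrative: `PRESETS` and `preset_command` are hypothetical names, rows whose tags are "varied" are stored as `None` so the caller supplies them, and using the upper end of each length range as the maximum length is an assumption.

```python
# Values copied from the use-case table; "varied" tags are stored as None.
PRESETS = {
    "social media background music": {"length": (30, 45), "guidance": 1.5, "tags": "upbeat,pop"},
    "children's songs": {"length": (45, 60), "guidance": 1.8, "tags": "acoustic,cheerful"},
    "gaming content": {"length": (30, 60), "guidance": 1.2, "tags": "electronic,energetic"},
    "emotional storytelling": {"length": (45, 60), "guidance": 1.8, "tags": "piano,ballad"},
    "parody/comedy songs": {"length": (30, 45), "guidance": 1.5, "tags": None},
    "multilingual content": {"length": (45, 60), "guidance": 1.5, "tags": None},
}


def preset_command(use_case, lyrics, tags=None):
    """Build a command from a table row; caller-supplied tags take priority."""
    preset = PRESETS[use_case.lower()]
    chosen_tags = tags or preset["tags"]
    parts = [
        "/wf",
        "/run:music-heartmula",
        f"/guidance:{preset['guidance']}",
        # Assumption: use the top of the range as the maximum length.
        f"/length:{preset['length'][1]}",
    ]
    if chosen_tags:
        parts.append(f'/slot1:"{chosen_tags}"')
    parts.append(lyrics)
    return " ".join(parts)


print(preset_command("Emotional storytelling", "The room is empty without you here."))
```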
## Workflow Notes
- **Runtime**: ~88 seconds average
- **VRAM**: ~11GB
- **Output**: Single audio file (typically MP3 or WAV)
- **Model Features**: Generates vocals and instrumentation together
- **Editable**: You can modify parameters and regenerate
## Technical Requirements
Required custom nodes:
- HeartMuLa_ComfyUI-y (goofyrodent)
Models used:
- HeartMuLa (base model)
- HeartMuLa-oss-3B
- HeartTranscriptor-oss
- HeartCodec-oss
## Troubleshooting
- **Audio quality issues**: Try shorter length or different seed
- **Lyrics not matching**: Increase guidance (cfg_scale) value
- **Style not coming through**: Experiment with different slot1 tag combinations
- **Repetitive output**: Increase topk (strength) value for more variety
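The numeric remedies above can be applied as small, bounded adjustments between regenerations. This is a sketch only: `adjust` is a hypothetical helper, and the step sizes (+0.3 guidance, doubling top-k, 15 seconds shorter with a 30-second floor) are assumptions rather than documented values. Style problems are left out because the remedy there is trying different tags, not changing a number.

```python
def adjust(params, symptom):
    """Return a copy of params nudged according to the troubleshooting list."""
    new = dict(params)
    if symptom == "lyrics not matching":
        # Raise guidance, capped at the documented maximum of 10.
        new["guidance"] = min(10, round(new["guidance"] + 0.3, 2))
    elif symptom == "repetitive output":
        # Raise top-k for more variety, capped at the documented maximum of 500.
        new["strength"] = min(500, new["strength"] * 2)
    elif symptom == "audio quality issues":
        # Shorten the track; the 30-second floor is an assumption.
        new["length"] = max(30, new["length"] - 15)
    else:
        raise ValueError(f"no numeric remedy for: {symptom}")
    return new


settings = {"guidance": 1.5, "length": 60, "strength": 50}
print(adjust(settings, "repetitive output"))
```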