# LTX-2 Video Generation Skill

## Purpose

Generate high-quality videos from text prompts using LTX-2, a state-of-the-art text-to-video model with built-in audio generation. Perfect for creating cinematic clips, animated scenes, and dynamic video content.

## Target Workflow

- Slug: [video-ltx2] (LTX-2 text to video with built-in audio)
- Type: txt2vid
- Required Fields: prompt_positive
- Optional Fields: fps, length, seed, size (width/height), slot1 (lora strength)

## When to Use This Skill

- Creating cinematic video clips from scene descriptions
- Generating animated content with character dialogue and action
- Producing video content with synchronized audio/sound
- Making short-form video content for social media
- Visualizing dynamic scenes with camera movement

## Prompt Structure

### For Text-to-Video (txt2vid)

The user provides a text description and wants a video generated.

**Command format:**
```
/wf /run:video-ltx2 /fps:24 /length:121 /size:640x448 [detailed scene description]
```

**Prompt template:**
```
[Shot type], [scene description]. [Lighting and atmosphere]. [Character description and action]. [Camera movement]. [Audio description including dialogue in quotes].
```

## LTX-2 Prompting Guidelines

### Key Aspects to Include (in order)

1. **Establish the shot**
   - Use cinematography terms (close-up, wide shot, tracking shot, etc.)
   - Specify scale and genre characteristics
   - Examples: "Cinematic action shot", "Wide establishing shot", "Intimate close-up"

2. **Set the scene**
   - Describe lighting conditions, color palette, textures
   - Establish the atmosphere and mood
   - Examples: "warm golden hour lighting", "neon-drenched cyberpunk alley", "soft morning light through fog"

3. **Describe the action**
   - Write as a natural sequence from beginning to end
   - Use present tense verbs
   - Keep the action flowing continuously

4. **Define characters**
   - Include age, appearance, clothing, distinguishing features
   - Express emotions through physical cues and expressions

5. **Camera movement**
   - Specify when and how the camera moves
   - Describe the relationship between camera and subject
   - Examples: "camera pans left to follow", "handheld tracking shot", "slow push-in"

6. **Describe audio**
   - Use natural language for ambient sounds, music, speech
   - For dialogue, use quotation marks
   - Can specify language/accent for dialogue if needed

### Prompt Format Best Practices

- Keep as a single flowing paragraph (not bullet points)
- Use present tense for all actions
- Match detail to shot scale (close-ups need more detail than wide shots)
- Write 4-8 descriptive sentences to cover all aspects
- Don't use special tags or formatting for audio

### Key Parameters

| Parameter | Default | Range | Usage |
|-----------|---------|-------|-------|
| fps | 24 | 1-60 | Frame rate. 24 for cinematic, 30 for standard video |
| length | 121 | varies | Number of frames. 121 frames ≈ 5 sec at 24fps |
| size | 640x448 | various | Video resolution. 640x448 is default, 1280x896 for upscaled |
| seed | random | integer | For reproducibility. Same seed + prompt = same output |
| slot1 | 0.8 | 0.0-1.0 | Distill LoRA strength. 1.0 is official, lower often gives better quality |

## Examples

### Example 1: Action Scene

**User:** "monster truck driving fast and drifting"

→ Command to generate:
/wf /run:video-ltx2 /fps:24 /length:121 /size:640x448 "An action packed, cinematic shot of a monster truck driving fast towards the camera, the truck passes the camera as it pans left to follow the truck's reckless drive. Dust and motion blur surrounds the truck, handheld feel to the camera as it tries to track its ride into the distance. The truck then drifts and turns around, then drives back towards the camera until seen in extreme close up. Engine roaring, tires screeching, heavy metal music intensifies."

### Example 2: Comedy Scene with Dialogue

**User:** "funny family scene in the backyard with the grandpa acting weird"

→ Command to generate:
/wf /run:video-ltx2 /fps:24 /length:121 /size:640x448 "A warm sunny backyard. The camera starts in a tight cinematic close-up of a woman and a man in their 30s, facing each other with serious expressions. The woman, emotional and dramatic, says softly, 'That's it... Dad's lost it. And we've lost Dad.' The man exhales, slightly annoyed: 'Stop being so dramatic, Jess.' A beat. He glances aside, then mutters defensively: 'He's just having fun.' The camera slowly pans right, revealing the grandfather in the garden wearing enormous butterfly wings, waving his arms in the air like he's trying to take off. He shouts 'Wheeeew!' as he flaps his wings with full commitment. The woman covers her face, on the verge of tears. Birds chirping in the background, warm afternoon ambience."

### Example 3: Sci-Fi Scene

**User:** "cyberpunk hacker in a futuristic lab"

→ Command to generate:
/wf /run:video-ltx2 /fps:24 /length:121 /size:640x448 "A young African American woman wearing a futuristic transparent visor and a bodysuit with a tube attached to her neck. She is soldering a robotic arm. She stops and looks to her right as she hears a suspicious strong hit sound from a distance. She gets up slowly from her chair and says with an angry African American accent: 'Rick I told you to close that goddamn door after you!' Then, a futuristic blue alien explorer with dreadlocks wearing a rugged outfit walks into the scene excitedly holding a futuristic device and says with a low robotic voice: 'Fuck the door look what I found!' The alien hands the woman the device, she looks down at it excitedly as the camera zooms in on her intrigued illuminated face. She then says: 'Is this what I think it is?' She smiles excitedly. Sci-fi style cinematic scene with neon lighting and holographic displays flickering in the background."

### Example 4: Animation Style

**User:** "pinocchio being interrogated like a crime movie"

→ Command to generate:
/wf /run:video-ltx2 /fps:24 /length:121 /size:640x448 "Pinocchio is sitting in an interrogation room, looking nervous, and slightly sweating. He's saying very quietly to himself 'I didn't do it... I didn't do it... I'm not a murderer'. Pinocchio's nose is quickly getting longer and longer. The camera is zooming in on the double sided mirror in the back of the room, the mirror is turning black as the camera approaches it, and exposes a blurry silhouette of two FBI detectives who stand in the dark lit room on the other side. One of them is saying 'I'm telling you, I have a feeling something is off with this kiddo.' Dramatic noir lighting with shadows cast across the room, tension building suspenseful music."

### Example 5: Nature/Atmospheric

**User:** "peaceful yoga studio with frogs"

→ Command to generate:
/wf /run:video-ltx2 /fps:24 /length:121 /size:640x448 "The camera opens in a calm, sunlit frog yoga studio. Warm morning light washes over the wooden floor as incense smoke drifts lazily in the air. The senior frog instructor sits cross-legged at the center, eyes closed, voice deep and calm: 'We are one with the pond.' All the frogs answer softly: 'Ommm...' 'We are one with the mud.' 'Ommm...' He smiles serenely as the camera slowly pulls back to reveal the class of frogs in perfect meditation poses. Gentle ambient music with nature sounds, soft bell chimes in the background."
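
### Sketch: Building the Command Programmatically

The example commands above all share one shape, and a clip's duration follows directly from its frame count and frame rate. The Python sketch below is only an illustration of that arithmetic and string assembly, not part of the workflow: the helper names `clip_seconds`, `frames_for_seconds`, and `build_command` are hypothetical. The frame counts used in this document (121, 161, 241) all have the form 8n + 1; `frames_for_seconds` assumes that pattern should be preserved, which the document itself does not state.

```python
# Hypothetical helpers for illustration; none of these names exist in the workflow.

def clip_seconds(length, fps=24):
    # Duration in seconds of a clip with `length` frames played at `fps`.
    return length / fps


def frames_for_seconds(seconds, fps=24):
    # Nearest frame count of the form 8n + 1 for the requested duration.
    # The 8n + 1 pattern is an assumption inferred from 121, 161 and 241.
    n = max(1, round(seconds * fps / 8))
    return 8 * n + 1


def build_command(prompt, fps=24, length=121, size="640x448"):
    # Assemble a command in the same shape as the examples:
    # workflow flags first, then the prompt in double quotes.
    # Dialogue inside the prompt should use single quotes, as in the examples.
    flags = ["/wf", "/run:video-ltx2", f"/fps:{fps}", f"/length:{length}", f"/size:{size}"]
    return " ".join(flags) + f' "{prompt}"'


if __name__ == "__main__":
    print(round(clip_seconds(121), 2))       # 121 frames at 24 fps
    print(round(clip_seconds(121, 30), 2))   # same length at 30 fps is shorter
    print(frames_for_seconds(10))            # frame count for roughly 10 seconds
    print(build_command("Wide establishing shot of a foggy harbor at dawn."))
```

By this arithmetic, 121 frames last about 5 seconds at 24 fps but only about 4 seconds at 30 fps, so a higher frame rate may call for a larger length if the same duration is wanted.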

## Style Categories and Terms

### Animation Styles

- stop-motion
- 2D/3D animation
- claymation
- hand-drawn
- pixar style

### Stylized Looks

- comic book
- cyberpunk
- 8-bit pixel
- surreal
- minimalist
- painterly
- illustrated

### Cinematic Genres

- period drama
- film noir
- fantasy
- epic space opera
- thriller
- modern romance
- experimental film
- arthouse
- documentary
- action packed

### Visual Details

- **Lighting**: flickering candles, neon glow, natural sunlight, dramatic shadows, golden hour
- **Textures**: rough stone, smooth metal, worn fabric, glossy surfaces
- **Color palette**: vibrant, muted, monochromatic, high contrast, warm tones
- **Atmosphere**: fog, rain, dust, particles, smoke, steam

## Camera Movement Terms

- Static/locked off
- Pan left/right
- Tilt up/down
- Tracking shot
- Dolly in/out
- Handheld/shaky cam
- Crane up/down
- Zoom in/out (quick or slow)
- Push in
- Pull back
- Following shot

## Audio Description Tips

- Describe ambient sounds naturally
- Use quotation marks for dialogue
- Can specify voice characteristics or accents
- Include music style/mood
- Sound effects can be described contextually

## Tips for Best Results

1. **Single Paragraph**: Keep prompts as one flowing paragraph, not bullet points
2. **Present Tense**: Always use present tense verbs for actions
3. **Detail Level**:
   - Close-ups: Include precise facial features, expressions, micro-details
   - Wide shots: Focus on composition, atmosphere, overall action
4. **Camera Relations**: Focus on camera's relationship to subject, not just technical movements
5. **Length**: 4-8 sentences typically covers all key aspects well
6. **Iteration**: LTX-2 is designed for fast experimentation - refine prompts based on results
7. **Frame Count**:
   - 121 frames ≈ 5 seconds at 24fps
   - 241 frames ≈ 10 seconds at 24fps
   - Adjust based on how much action needs to unfold
8. **Resolution**:
   - 640x448 for faster generation and iteration
   - Use the upscaler option for higher quality final outputs

## Common Use Cases

| Use Case | FPS | Length | Size | Key Prompt Elements |
|----------|-----|--------|------|---------------------|
| Social media clip | 24 | 121 | 640x448 | Dynamic action, eye-catching opening |
| Cinematic scene | 24 | 161 | 640x448 | Detailed camera movement, atmospheric lighting |
| Dialogue scene | 24 | 121-161 | 640x448 | Character expressions, emotional beats |
| Action sequence | 30 | 121 | 640x448 | Fast motion, motion blur, intense audio |
| Atmospheric/mood | 24 | 121 | 640x448 | Environmental details, ambient sound |

## Workflow Notes

- **Runtime**: ~79 seconds average
- **VRAM**: ~21.96GB
- **Model Features**: Built-in audio generation (no separate audio prompt needed)
- **Dynamic Concepts**: Supports Ltx2 LoRAs via EasyLoraStack

## Technical Requirements

Required custom nodes:
- ComfyUI-GGUF (city96)
- ComfyUI-LTXVideo (Lightricks)
"""

# Write the content to a separate file for easy reading
File.write!("examples/skill_ltx2_video_content.txt", skill_content)
IO.puts("Skill content saved to examples/skill_ltx2_video_content.txt")
IO.puts("")
IO.puts("To create this skill, run in IEx:")