How should you structure a Seedance 2.0 prompt?

The most reliable structure is subject and action, camera movement, environment, sound, and then style only if needed. For multi-shot clips, label each shot explicitly.

Should you use Shot 1 and Shot 2 labels in Seedance 2.0?

Yes, for edited multi-shot sequences. Explicit shot labels give Seedance clearer cut points and usually produce more controlled transitions.

Do audio cues matter in Seedance 2.0 prompts?

Yes. Seedance supports synchronized audio generation on fal, so sound cues like footsteps, rain, crowd noise, or glass clinks help anchor timing and realism.

How should you use image, video, and audio references in Seedance 2.0?

Give each reference one clear job: images for identity or styling, videos for motion and pacing, and audio for rhythm or sound texture.

Seedance 2.0 Prompt Guide: Better Prompts That Work [2026]

April 21, 2026 | By Bilal Azhar

Learn how to write better Seedance 2.0 prompts with shot labels, camera direction, audio cues, reference workflows, settings, and copy-paste examples.

Seedance 2.0 prompts work best when they describe a moving scene: subject, action, camera, environment, sound, and cuts. Use Seedance 2.0 on Morphed for final 1080p renders, Seedance 2.0 Fast for drafts, and this guide as the hub for prompt examples, image-to-video, and text-to-video workflows.

Most bad Seedance outputs are not caused by the model. They are caused by prompts that describe a static image instead of a moving scene.

The biggest shift is simple: write for action, camera, sound, and cuts, not vague cinematic adjectives.

If you have used image models for a while, this takes a minute to unlearn. Image prompting rewards moodboards: lighting words, texture words, style references, beautiful nouns. Video prompting is less forgiving. The model has to decide what happens on frame 1, frame 40, frame 120, and how the scene gets from one to the next. That means your prompt has to behave more like a director's note than a poster description.

Key Takeaways

Seedance 2.0 responds best to shot-based prompts
Multi-shot clips work better with explicit Shot 1: / Shot 2: labels
Camera direction matters more than generic style words
Audio cues improve timing and realism
References work best when each one has a specific role

If you want better results, stop prompting Seedance like an image model.

The Director Mindset

A good Seedance prompt does not need to be poetic. It needs to be executable.

Imagine standing on set with a small crew. If you say "make it cinematic," nobody knows where to put the camera. If you say "start wide, track beside her at shoulder height, then cut to a close-up when she turns toward the neon sign," the crew can do something with that. Seedance is similar. It responds better when your prompt gives it a job, a camera position, and a sequence.

The most useful mental model is:

Role	What Your Prompt Should Provide
Director	The scene goal and emotional turn
Cinematographer	Shot size, camera movement, lens feel
Actor or product	One clear action per beat
Sound designer	Concrete ambience, Foley, or beat timing
Editor	Shot labels, duration, and cut points

This does not mean every prompt has to be huge. A 4-second product clip can be one sentence. A 15-second transformation needs a shot list. The structure should match the ambition of the clip.

Seedance 2.0 at a Glance

Seedance 2.0 is a ByteDance video model built for text-, image-, audio-, and video-guided generation. The practical advantage is control: you can draft from text, animate an image, or use references to bind identity, motion, and audio timing to the final clip.

Setting	Practical Value
Model family	ByteDance Seedance 2.0
Main workflows	Text-to-video, image-to-video, reference-to-video
Duration on Morphed	4-15 seconds
Standard resolutions	480p, 720p, 1080p
Fast resolutions	480p, 720p
Aspect ratios	auto, 21:9, 16:9, 4:3, 1:1, 3:4, 9:16
Audio	Supported; include concrete sound cues
Reference inputs	Images, video clips, and audio clips depending on workflow
Best first step	Draft in Fast, finish in Standard when quality matters
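
If you script your renders, it can help to check settings against the table above before spending credits. The sketch below is only an illustration: the function name, the tier keys ("standard", "fast"), and the argument names are made up for this example and are not part of any Seedance, Morphed, or fal API. The allowed values are copied from the table.

```python
# Allowed values copied from the settings table; all names here are hypothetical.
DURATION_RANGE = (4, 15)  # seconds ("Duration on Morphed")
RESOLUTIONS = {
    "standard": {"480p", "720p", "1080p"},
    "fast": {"480p", "720p"},
}
ASPECT_RATIOS = {"auto", "21:9", "16:9", "4:3", "1:1", "3:4", "9:16"}


def check_settings(tier, duration_s, resolution, aspect_ratio):
    """Return a list of problems; an empty list means the settings fit the table."""
    problems = []
    if tier not in RESOLUTIONS:
        problems.append(f"unknown tier: {tier!r}")
    elif resolution not in RESOLUTIONS[tier]:
        problems.append(f"{resolution} is not listed for {tier}")
    low, high = DURATION_RANGE
    if not low <= duration_s <= high:
        problems.append(f"duration {duration_s}s is outside {low}-{high}s")
    if aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"aspect ratio {aspect_ratio!r} is not listed")
    return problems


print(check_settings("fast", 6, "720p", "9:16"))    # []
print(check_settings("fast", 20, "1080p", "9:16"))  # two problems reported
```

The second call fails twice: Fast does not list 1080p, and 20 seconds is outside the 4-15 second range.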

What Makes Seedance 2.0 Different?

Seedance behaves more like a directing tool than a moodboard tool.

It is especially good at:

shot transitions
camera movement
synchronized sound
reference-guided motion

That means a prompt like this is weak:

cinematic girl in tokyo at night, beautiful lighting, masterpiece

And a prompt like this is stronger:

Shot 1: A woman in a red coat steps into a rain-soaked Tokyo side street, neon reflections breaking across the pavement. The camera tracks beside her at shoulder height. Wet footsteps and distant traffic fill the background.

The second prompt gives the model something to stage.

The important part is not that the second prompt is longer. It is that every phrase has a job. "Rain-soaked side street" gives the environment texture. "Neon reflections" gives surfaces for light to move across. "Camera tracks beside her" gives motion. "Wet footsteps and distant traffic" gives timing and atmosphere.

The Best Prompt Structure

For single-shot clips, use this order:

subject + action
camera movement
environment or lighting
sound
style only if needed
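
If you generate prompts from a script or a spreadsheet, the single-shot order above can be sketched as a small helper. This is a minimal illustration, not an official template: the function and parameter names are hypothetical, and the only thing it enforces is the order of the parts.

```python
def build_single_shot_prompt(subject_action, camera, environment, sound, style=None):
    """Join the parts in the order listed above, skipping any that are empty."""
    sentences = []
    for part in (subject_action, camera, environment, sound, style):
        text = (part or "").strip().rstrip(".")
        if text:
            sentences.append(text + ".")
    return " ".join(sentences)


prompt = build_single_shot_prompt(
    subject_action="A woman in a red coat steps into a rain-soaked Tokyo side street",
    camera="The camera tracks beside her at shoulder height",
    environment="Neon reflections break across the pavement",
    sound="Wet footsteps and distant traffic fill the background",
)
print(prompt)
```

Style is the last argument and defaults to nothing, which matches the "only if needed" rule.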

For multi-shot clips, use:

overall format or continuity note
labeled shots
one action per shot
one camera move per shot
sound cue if it matters

Prompt Part	What to write	Why it matters
Subject + action	One strong verb like runs, turns, drops, looks back	Defines the core motion of the shot
Camera movement	Push-in, dolly, orbit, handheld follow, fixed frame	Controls pacing and visual energy
Environment	Rain, fog, reflections, crowd density, wind, room tone	Makes the motion feel grounded
Sound	Footsteps, water hiss, fabric rustle, crowd noise	Improves timing and scene realism
Shot labels	Shot 1, Shot 2, Shot 3	Improves cut clarity in multi-shot sequences
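
For multi-shot clips, the same idea can be sketched as a helper that numbers the shots for you. Everything here is an assumption made for illustration: the function name, the dictionary keys, and the example scene are invented, and the only convention taken from this guide is the Shot 1: / Shot 2: label format with one action and one camera move per shot.

```python
def build_multi_shot_prompt(continuity_note, shots):
    """Build a labeled shot list.

    shots: list of dicts with 'action' and 'camera', plus an optional 'sound'.
    """
    lines = []
    if continuity_note and continuity_note.strip():
        lines.append(continuity_note.strip())
    for number, shot in enumerate(shots, start=1):
        pieces = [shot["action"], shot["camera"], shot.get("sound")]
        body = " ".join(
            p.strip().rstrip(".") + "." for p in pieces if p and p.strip()
        )
        lines.append(f"Shot {number}: {body}")
    return "\n".join(lines)


shots = [
    {
        "action": "A courier sprints across a wet rooftop",
        "camera": "Handheld follow from behind",
        "sound": "Boots slapping on metal",
    },
    {
        "action": "She looks back over her shoulder",
        "camera": "Fast push-in to a close-up",
    },
]
multi_prompt = build_multi_shot_prompt(
    "Continuity: same courier, same night rain in every shot.", shots
)
print(multi_prompt)
```

Because each shot is a dictionary with a single action and a single camera field, the structure itself discourages stacking several moves into one shot.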

Standard vs Fast Prompting

Seedance 2.0 Fast is for cheaper iteration. Standard Seedance 2.0 is for final output, especially when you need 1080p or more polished detail. The prompt structure is similar, but Fast works better with fewer shots and simpler movement.

Use Case	Better Model	Prompting Advice
4-6 second draft	Seedance 2.0 Fast	One action, one camera move
TikTok or Reels variant	Seedance 2.0 Fast	Vertical 9:16, centered subject
Product final render	Seedance 2.0 Standard	Protect shape and label details
Multi-shot 10-15 second clip	Seedance 2.0 Standard	Use `Shot 1`, `Shot 2`, `Shot 3` labels
Audio-beat ad	Seedance 2.0 Standard	Assign sound or beat timing clearly

For a deeper cost and quality breakdown, see Seedance 2.0 Fast vs Standard.

The draft-to-final workflow matters because Seedance prompts rarely arrive perfect on the first try. Start by proving the movement. Once the movement works, increase quality. If you start at final quality, you spend more credits discovering basic prompt mistakes.
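
One way to make the draft-to-final habit automatic is to derive the draft settings from the final ones, so the prompt and framing stay identical and only the tier and resolution change. The sketch below assumes a plain settings dictionary with made-up keys ("tier", "resolution", and so on); it does not call any real API, and the Fast resolution list is taken from the settings table earlier in this guide.

```python
FAST_RESOLUTIONS = ["480p", "720p"]  # Fast tier resolutions from the settings table


def draft_settings(final_settings):
    """Return a cheaper draft copy of the final settings; the input is not modified."""
    draft = dict(final_settings)
    draft["tier"] = "fast"
    if draft["resolution"] not in FAST_RESOLUTIONS:
        draft["resolution"] = FAST_RESOLUTIONS[-1]  # cap at 720p
    return draft


final = {
    "tier": "standard",
    "resolution": "1080p",
    "duration_s": 8,
    "aspect_ratio": "9:16",
}
draft = draft_settings(final)
print(draft["tier"], draft["resolution"])  # fast 720p
```

Run the draft until the movement works, then render once with the untouched final settings.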

How To Use References Properly

Seedance 2.0 on fal supports image, video, and audio references. The mistake most people make is giving those references overlapping jobs.

Use this instead:

image reference = identity, wardrobe, styling, composition anchor
video reference = camera movement, pacing, motion behavior
audio reference = beat, rhythm, sound texture, sync cue

Example:

@Image1 is the character identity and outfit reference.
@Video1 defines the pacing and handheld camera rhythm.
@Audio1 defines the beat and impact timing.

Shot 1: Medium close-up of the character from @Image1 entering frame, handheld push-in following the movement language of @Video1, low room tone under the opening beats of @Audio1.
Shot 2: The subject turns toward the light source, the camera orbit matching @Video1, percussion rising with @Audio1.
Shot 3: Wide reveal, final movement accent hitting on the beat from @Audio1.

Notice the hierarchy: the image owns identity, the video owns pacing, the audio owns timing. That is much clearer than saying "use these references to make a cool video." Seedance needs to know which file to trust for which part of the clip.
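
The "one job per reference" rule is easy to check mechanically. The sketch below is a hypothetical helper, not part of any Seedance tooling: it takes a mapping from reference tags (written in the @Image1 style used in the example above) to the jobs each one owns, refuses overlapping jobs, and writes the header lines for you.

```python
def reference_header(roles):
    """Build header lines from {tag: [jobs]}; raise if two references share a job."""
    owners = {}
    for tag, jobs in roles.items():
        for job in jobs:
            if job in owners:
                raise ValueError(f"{job!r} is claimed by both {owners[job]} and {tag}")
            owners[job] = tag
    return "\n".join(
        f"{tag} defines {' and '.join(jobs)}." for tag, jobs in roles.items()
    )


header = reference_header({
    "@Image1": ["character identity", "outfit"],
    "@Video1": ["pacing", "handheld camera rhythm"],
    "@Audio1": ["beat", "impact timing"],
})
print(header)
```

If you list "pacing" under both @Video1 and @Audio1, the helper raises an error instead of letting two files compete for the same part of the clip.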

Why Audio Cues Matter More Than People Think

Seedance is stronger when it knows not just what the viewer should see, but what they should hear.

These cues help more than generic atmosphere words:

“wet footsteps on pavement”
“glass clinks on the table”
“crowd noise swelling behind her”
“fabric rustle and sharp inhale”
“distant train rumble”

That turns the prompt into a timed scene instead of a visual description.

The best sound cues are small and physical. Footsteps, fabric, glass, rain, paper, tires, doors, breath, room tone. These cues make a clip feel less like a silent render and more like a shot that happened somewhere.

Text-To-Video vs Image-To-Video

Use text-to-video when composition is flexible and you are exploring a scene from scratch. Use image-to-video when identity, product shape, outfit, room design, or art direction must stay fixed.

Workflow	Use It When	Prompt Focus
Text-to-video	You have no source asset and want to explore	Subject, action, camera, sound
Image-to-video	You have a still image to animate	Motion, camera, identity preservation
Reference-to-video	You have identity, motion, or audio references	Give each reference one job

Prompt Examples You Can Actually Use

1. Beat-Synced Style Shift

What it does: a 15-second vertical lifestyle transformation. The structure is intentionally over-labeled because the prompt depends on rhythm. Seedance needs to know which beats are small inserts and which beats are emotional turns.

FORMAT: 15s / 135 BPM / 12 SHOTS
SUBJECT: @Image1
WARDROBE: casual streetwear that becomes more polished
ENVIRONMENT: apartment interior into bright city street
MOOD: flat to energized to confident
STYLE: beat-driven lifestyle film

Shot 1: She sits on the edge of a bed staring at her phone, bored, room tone low and dry.
Shot 2: The phone buzzes on the beat. Quick insert shot.
Shot 3: She reads the message and her expression changes instantly.
Shot 4: She stands up sharply on the next beat.
Shot 5: She adjusts her jacket and smooths her hair in a tight mirror-side close-up.
Shot 6: Warm light starts to fill the room.
Shot 7: She checks herself in the mirror with a subtle smile.
Shot 8: The door opens and she steps into the street as the music lifts.
Shot 9: Confident walking shot, camera tracking backward in rhythm.
Shot 10: Slow-motion hair movement on a beat accent.
Shot 11: The street feels brighter and more alive around her.
Shot 12: She finishes in a clean final pose, fully synced to the drop.

Why it works: every shot is short and single-purpose. The prompt does not ask the character to do five things at once. It also gives the clip a clear emotional arc: flat, activated, confident.
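
It is worth doing the arithmetic behind the FORMAT line before writing the shots. The sketch below only computes timing from the three numbers in that line (15 seconds, 135 BPM, 12 shots); the function name is made up, and nothing here is sent to the model.

```python
def beat_grid(duration_s, bpm, shot_count):
    """Work out beat length and average shot length for a beat-driven clip."""
    beat_s = 60.0 / bpm              # seconds per beat
    shot_s = duration_s / shot_count  # average seconds per shot
    return {
        "beat_s": beat_s,
        "total_beats": duration_s / beat_s,
        "shot_s": shot_s,
        "beats_per_shot": shot_s / beat_s,
    }


grid = beat_grid(15, 135, 12)
print(f"{grid['beat_s']:.3f} s per beat")          # 0.444 s per beat
print(f"{grid['total_beats']:.2f} beats in total")  # 33.75 beats in total
print(f"{grid['shot_s']:.2f} s per shot")           # 1.25 s per shot
print(f"{grid['beats_per_shot']:.1f} beats per shot on average")
```

At 135 BPM, twelve equal shots average a little under three beats each, so not every cut can land exactly on a beat. That is one reason the prompt marks specific shots as "on the beat" rather than expecting all twelve to sync.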

2. Dreamfall Through Tokyo

What it does: a surreal multi-environment fall sequence. This is the type of prompt where Seedance benefits from explicit cut-by-cut escalation because each world is visually different.

Shot 1: Twilight rooftop above Shibuya Crossing. A young woman in Japanese academy-style clothing is shoved backward from the edge. The camera stays close as she begins to fall, body spinning violently in the wind.
Shot 2: She tears into a dreamscape made of giant glowing ukiyo-e waves. Ink-like water structures crash past her as she tumbles through them.
Shot 3: She breaks into a corridor of endless folding vermilion torii gates. White fox-like light trails streak across frame while she tries and fails to grab a crossbeam.
Shot 4: She falls into a storm of cherry blossoms and fragmented Mount Fuji forms, petals and gold-screen textures exploding around her.
Shot 5: She enters a kaleidoscope of noh masks and lacquer-red patterns as the camera pushes tighter.
Shot 6: Final rupture back into reality. She lands hard at street level in Shibuya, breathless and shaken, while pedestrians stop and stare.

Why it works: the fall is the continuity device. Even though the environments change, the body motion keeps the sequence connected.

3. Disco Character Study

What it does: a character-consistency stress test. The reference image owns the face, outfit, glasses, shirt, and necklaces while the prompt controls camera energy and social awkwardness.

[CINEMATIC SETUP]
Film stock: gritty 1970s Italian cinema look, 35mm grain, high contrast
Camera: handheld documentary-style movement with sudden zooms
Color grade: warm vintage saturation with disco highlights
Atmosphere: smoke, lens flares, crowded dance floor

[REFERENCE]
@Image1 is the main character identity and outfit reference. Keep the face, glasses, shirt, and necklaces consistent.

Shot 1: Medium shot, @Image1 headbanging alone in the center of a packed dance floor, handheld camera shaking with the beat.
Shot 2: Tracking shot as he pushes toward a woman dancing nearby, still headbanging aggressively while she barely reacts.
Shot 3: Sudden zoom onto his face as he tries again with another group, sweat and hair movement flying, the crowd ignoring him.
Shot 4: Wide shot, he is centered in the chaos, still performing at full intensity while the room moves around him as if he is invisible.

Why it works: the scene has a simple joke. One person is intense; everyone else is indifferent. That is easier for Seedance to stage than a vague instruction like "make a funny disco scene."

Common Mistakes That Waste Credits

Too many actions in one shot

If one shot asks the subject to run, jump, turn, scream, and fall, the output usually gets muddy.

Fix it by choosing the action that matters most. If the jump is the visual hook, cut the scream. If the fall is the payoff, make the run shorter.

Conflicting camera language

“steady locked frame” and “chaotic handheld rush” should not live in the same shot unless you are doing something very deliberate.

Too many people in a complex scene

Dense interactions break faster than simple compositions. Reduce subject count when you need cleaner motion.

Too much abstract style language

Words like “cinematic,” “epic,” and “beautiful” are weak unless the physical scene is already clear.

Style words are seasoning. They cannot replace the meal. Use them after the action, camera, and environment are already understandable.

No duration discipline

If you want multiple clear cuts, give the clip enough time. Short duration plus too many beats is one of the fastest ways to get weak results.
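
Some of these mistakes can be caught with a rough check before you render. The sketch below is a heuristic, not a validator: the word lists are taken from examples in this guide, the one-second minimum per shot is an assumption rather than a documented limit, and it makes no attempt to count actions per shot.

```python
import re

WEAK_STYLE_WORDS = {"cinematic", "epic", "beautiful", "masterpiece"}
STEADY_TERMS = ("locked frame", "fixed frame", "steady")
SHAKY_TERMS = ("handheld", "chaotic")
MIN_SECONDS_PER_SHOT = 1.0  # assumption for illustration, not a documented limit


def split_shots(prompt):
    """Split on 'Shot N:' labels; an unlabeled prompt counts as one shot."""
    parts = re.split(r"Shot \d+:", prompt)
    shots = [p.strip() for p in parts[1:] if p.strip()]
    return shots or [prompt.strip()]


def lint_prompt(prompt, duration_s):
    """Return a list of warnings; an empty list means nothing obvious was found."""
    warnings = []
    shots = split_shots(prompt)
    if duration_s / len(shots) < MIN_SECONDS_PER_SHOT:
        warnings.append(f"{len(shots)} shots in {duration_s}s leaves under 1s per shot")
    for number, shot in enumerate(shots, start=1):
        text = shot.lower()
        if any(t in text for t in STEADY_TERMS) and any(t in text for t in SHAKY_TERMS):
            warnings.append(f"Shot {number}: conflicting camera language")
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    weak = sorted(words & WEAK_STYLE_WORDS)
    if weak:
        warnings.append("style words that need a concrete scene behind them: " + ", ".join(weak))
    return warnings


bad = ("Shot 1: steady locked frame, chaotic handheld rush, epic. "
       "Shot 2: cut. Shot 3: cut. Shot 4: cut. Shot 5: cut.")
for warning in lint_prompt(bad, duration_s=4):
    print(warning)
```

A style word is only a hint, not an error: the warning is a reminder to check that the action, camera, and environment are already clear.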

Not For You

Seedance is not the right fit for every job.

If you want one-line prompts with no iteration, this workflow will feel too directed.
If you need highly precise talking-head dialogue, test lip sync carefully before using it in production.
If you want dense multi-person choreography in one shot, expect more retries.

Final Rule

The best Seedance prompts are not more descriptive. They are more operational.

Write:

what happens
how the camera behaves
what the viewer hears
where the cuts happen
what each reference is responsible for

That is the difference between a pretty prompt and a usable one.