Seedance 2.0 Image Input Guide
Learn when to use image-to-video, how to prepare first and last frames, and how to protect product or scene fidelity.
Image-to-video is the best Seedance 2.0 mode when the first frame already exists and your real job is adding motion without losing the original composition.
That makes it especially useful for:
- product hero shots
- poster-to-motion tests
- still-image campaign assets
- packshots and tabletop scenes
- environment shots with a strong starting frame
When image-to-video is the right choice
Choose image-to-video when at least one of these is true:
- the frame composition is already approved
- the product silhouette must stay recognizable
- you need motion, but not a brand-new scene design
- the first frame carries the most value
If identity lock across the full clip is more important than animating the first frame, use the Seedance 2.0 Reference Input Guide instead.
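If you triage many shots at once, the checklist above can be written down as a small helper. This is only a sketch of the decision rule in this section: the `ShotBrief` fields, the `suggest_mode` function, and the returned labels are hypothetical names for illustration and are not part of any Seedance API.

```python
from dataclasses import dataclass


@dataclass
class ShotBrief:
    """Hypothetical checklist for one shot; field names are illustrative only."""
    composition_approved: bool = False        # the frame composition is already approved
    silhouette_must_hold: bool = False        # the product silhouette must stay recognizable
    motion_without_redesign: bool = False     # motion is needed, not a new scene design
    first_frame_carries_value: bool = False   # the first frame carries the most value
    identity_lock_matters_most: bool = False  # identity across the clip outweighs the first frame


def suggest_mode(brief: ShotBrief) -> str:
    """Return a suggested input mode label based on the checklist in this guide."""
    # Identity lock across the full clip takes priority over animating the first frame.
    if brief.identity_lock_matters_most:
        return "reference-input"
    image_criteria = (
        brief.composition_approved,
        brief.silhouette_must_hold,
        brief.motion_without_redesign,
        brief.first_frame_carries_value,
    )
    # One true criterion is enough to choose image-to-video.
    if any(image_criteria):
        return "image-to-video"
    # The guide does not prescribe a mode when nothing applies.
    return "no-image-criterion-met"


print(suggest_mode(ShotBrief(composition_approved=True)))
```

The helper only encodes the rule of thumb. A real brief usually needs a human call on which requirement matters most.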
Use the first frame as the anchor
Your uploaded image is not just inspiration. It is the structural anchor of the shot.
The strongest first-frame images usually have:
- one obvious subject
- readable silhouette
- clean separation from the background
- stable lighting direction
- minimal clutter around the hero object
For product work, keep the object large enough in frame that labels, materials, and edges are actually visible.
When to add a last frame
In the current workflow, image-to-video can use a first frame and optionally a last frame. That is useful when you already know how the shot should resolve.
Use a last frame when:
- the ending composition is critical
- you want a before/after or open/closed state
- the shot should move from one approved layout to another
Do not add a last frame just because it is available. If the first and last frames are visually too far apart, the in-between motion often breaks.
A reliable prompt pattern for image input
Start with the uploaded image, then describe the motion:
@Image1 [subject], [single motion layer], [camera move], [lighting/style], [constraints]
Example:
@Image1 perfume bottle on dark marble, droplets slide down the glass, slow macro dolly-in, luxury studio contrast, no label blur no cap drift no extra objects
This works because the frame is already defined. Your prompt should focus on how the still image comes alive, not on redesigning the whole scene.
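If you generate prompts in bulk, the pattern can be assembled from its slots with a few lines of string handling. The sketch below is a hypothetical helper and not an official tool: the function name `build_image_prompt` and its parameters are assumptions, and the only thing taken from this guide is the slot order and the `@Image1` tag.

```python
from typing import Sequence


def build_image_prompt(
    subject: str,
    motion: str,
    camera: str,
    style: str,
    constraints: Sequence[str],
    image_tag: str = "@Image1",
) -> str:
    """Assemble a prompt in the order: image tag + subject, motion, camera, style, constraints."""
    # Constraints share the final slot, so they are joined with spaces, not commas.
    constraint_slot = " ".join(c.strip() for c in constraints if c.strip())
    slots = [f"{image_tag} {subject.strip()}", motion, camera, style, constraint_slot]
    # Skip empty slots so an optional part does not leave a stray comma.
    return ", ".join(s.strip() for s in slots if s and s.strip())


prompt = build_image_prompt(
    subject="perfume bottle on dark marble",
    motion="droplets slide down the glass",
    camera="slow macro dolly-in",
    style="luxury studio contrast",
    constraints=["no label blur", "no cap drift", "no extra objects"],
)
print(prompt)
```

The call reproduces the perfume bottle example. Keeping `motion` as a single string is deliberate: it nudges you toward one motion layer per shot.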
How to prepare stronger source images
For products
- keep one hero product per image
- use sharp source files
- avoid heavy reflections that already hide the label
- simplify props unless they are essential to the shot
For people
- use a clean face angle
- avoid cropped hands if hands will be part of the motion
- prefer one lighting setup, not mixed light directions
For environments
- keep horizon lines and architectural edges clean
- avoid busy frames with many competing moving elements
What kinds of motion work best
Image-to-video usually performs better with:
- push-ins
- slow pull-backs
- restrained orbits
- controlled tracking
- subtle environmental motion
It usually performs worse when you ask it to invent:
- complex choreography
- large pose changes
- major perspective jumps
- multiple subject interactions
Typical failure modes and first fixes
| Problem | Usual cause | First fix |
|---|---|---|
| Product shape warps | the requested move is too aggressive | slow the move and keep one hero object |
| Label becomes unreadable | too many reflections or particles | simplify the scene and reinforce label constraints |
| Motion feels flat | prompt only describes the object, not the shot | add one camera move and one motion cue |
| Frame-to-frame weirdness | first and last frames conflict too much | remove the last frame or narrow the transition |
| Background starts melting | the scene has too many secondary elements | simplify props and keep the focus tight |
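For review notes or QA tooling, the same table can be kept as a small lookup. This is a minimal sketch: the dictionary copies the rows above, while the names `FIRST_FIXES` and `first_fix` and the fallback message are hypothetical.

```python
# Rows copied from the failure-mode table: problem -> (usual cause, first fix).
FIRST_FIXES = {
    "product shape warps": (
        "the requested move is too aggressive",
        "slow the move and keep one hero object",
    ),
    "label becomes unreadable": (
        "too many reflections or particles",
        "simplify the scene and reinforce label constraints",
    ),
    "motion feels flat": (
        "prompt only describes the object, not the shot",
        "add one camera move and one motion cue",
    ),
    "frame-to-frame weirdness": (
        "first and last frames conflict too much",
        "remove the last frame or narrow the transition",
    ),
    "background starts melting": (
        "the scene has too many secondary elements",
        "simplify props and keep the focus tight",
    ),
}


def first_fix(problem: str) -> str:
    """Return the usual cause and first fix for a known problem label."""
    entry = FIRST_FIXES.get(problem.strip().lower())
    if entry is None:
        # Fallback wording is an assumption; the table only covers five problems.
        return "Not in the table: start by simplifying the motion."
    cause, fix = entry
    return f"Usual cause: {cause}. First fix: {fix}."


print(first_fix("Motion feels flat"))
```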
When image-to-video beats text-only generation
Image-to-video is usually the better choice when:
- the client already approved a packshot
- the ad needs to match a still campaign
- product geometry is more important than scene invention
- you are working from catalog, PDP, or lookbook assets
That is why many ecommerce tests should start from image-to-video, not text-to-video.
Practical iteration rules
When a clip fails, fix in this order:
1. simplify the motion
2. simplify the frame
3. strengthen the negative prompt
4. only then change the source image
Most teams change the image too early. In practice, the bigger problem is usually that the shot request is trying to do too much.
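One way to hold a team to this order is to track which fixes a shot has already received and always suggest the earliest one still missing. The sketch below assumes you log attempted fixes as plain strings. `FIX_ORDER` and `next_fix` are hypothetical names.

```python
from typing import Iterable, Optional

# The fix order from this section, earliest first.
FIX_ORDER = [
    "simplify the motion",
    "simplify the frame",
    "strengthen the negative prompt",
    "change the source image",
]


def next_fix(already_tried: Iterable[str]) -> Optional[str]:
    """Return the earliest fix in FIX_ORDER that has not been tried yet."""
    tried = {step.strip().lower() for step in already_tried}
    for step in FIX_ORDER:
        if step not in tried:
            return step
    # Every step has been tried; the guide gives no further step.
    return None


print(next_fix(["simplify the motion"]))
```

Because the function scans from the top, skipping ahead to a new source image never hides an earlier, cheaper fix that is still untried.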
Related guides
Seedance 2.0 Prompt Writing Guide
Write clearer prompts for Seedance 2.0 with a practical framework for subject, motion, camera, style, and constraints.
Seedance 2.0 Reference Input Guide
Use 1 to 3 reference images more effectively in Seedance 2.0 to lock identity, product geometry, and scene consistency.