
How LipDub AI Works
Advanced Workflow

Upload
Start by uploading the video clips you want to lip sync.
If you're using LipDub to lip-sync AI characters, upload a video that shows a diverse range of mouth shapes so the model can learn how the character’s mouth moves.
For the best results, upload at least 30-45 seconds of video. If you have less than that, you may need to upload additional training data, such as outtakes (see the “Train” section below).
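As a rough pre-flight check before uploading, the length guidance above can be written down in a few lines of Python. This is only a sketch. The function name and the reading of "30-45 seconds" as a 30-second minimum with a 45-second comfortable target are assumptions for illustration, not values or tools published by LipDub AI.

```python
def footage_advice(speaking_seconds: float,
                   minimum: float = 30.0,
                   comfortable: float = 45.0) -> str:
    """Classify footage length against the 30-45 second guidance.

    The two thresholds are an assumed reading of that guidance,
    not limits enforced by LipDub AI.
    """
    if speaking_seconds < 0:
        raise ValueError("speaking_seconds must be non-negative")
    if speaking_seconds < minimum:
        return "add extra training data"
    if speaking_seconds < comfortable:
        return "usable, extra training data may help"
    return "enough footage"


if __name__ == "__main__":
    for seconds in (12, 38, 60):
        print(seconds, "->", footage_advice(seconds))
```

Measure speaking time rather than total clip length where you can, since silent stretches give the model little to learn from.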
Once your footage is uploaded, LipDub AI will begin pre-processing, analyzing your video footage frame-by-frame.
Label
Once pre-processing is complete, you’ll see all detected faces for review. Now’s your chance to label each speaker you’d like to lip sync by name.
Below the speaker labels, you'll see all the frames LipDub AI has detected for each speaker. If a frame is assigned to the wrong speaker, fix it by reassigning the frame to the right identity or deleting it altogether. This ensures LipDub AI dubs the speaker correctly across the whole video.
Train
Now you can start LipDub AI’s comprehensive training process, which takes roughly 1-4 hours depending on the length and complexity of the footage. Our team is actively working to make training faster, but this is where LipDub AI builds the detailed model needed to deliver the realistic results we’re known for.
Again, if your video has less than 30-45 seconds of speaking time, this is where you upload additional training data.
The good news: this is a one-time step. Once LipDub AI is trained on your source footage within a project, you can generate new videos as often as you’d like.
Add audio and generate
You’ll receive an email notification once training is complete. Then it’s time to upload an audio file for each speaker in either MP3 or WAV format.
For the best lip sync, upload one audio file per speaker, make sure it has no background noise, and ensure it matches the clip’s duration.
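The format and duration requirements above can be checked locally before you upload. The sketch below uses only Python's standard library, so it can measure the duration of WAV files but not MP3 files. The function names and the 0.5-second tolerance are assumptions for illustration and are not part of LipDub AI.

```python
import wave
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp3", ".wav"}  # the two formats named in this guide


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_audio(path: str, clip_seconds: float, tolerance: float = 0.5) -> list:
    """Return a list of problems; an empty list means none were found.

    `tolerance` (in seconds) is an assumed value, not a LipDub AI requirement.
    Duration is only checked for WAV files, because the standard library
    cannot decode MP3.
    """
    problems = []
    suffix = Path(path).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {suffix or 'none'}")
    elif suffix == ".wav":
        duration = wav_duration_seconds(path)
        if abs(duration - clip_seconds) > tolerance:
            problems.append(
                f"audio is {duration:.2f}s but the clip is {clip_seconds:.2f}s"
            )
    return problems
```

Run `check_audio` once per speaker file and fix anything it reports before uploading. Background noise is not something this sketch can detect, so listen to each file as well.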
Now, you’re free to generate your video. This step is much faster than training, so you can quickly create as many iterations as you need.