Kling Lip Sync
Adds lip-sync to any video using an audio file or text. Enables changing the speech of a person in a video to match the provided audio or text input.
Model Overview
Lip-sync video generation tool that alters a person's speech in a video to match the provided audio file or text. Converts your text or audio into lip movements for the video's subject.
Best At
Perfect for creating talking-head videos with synced speech from new audio tracks or text-to-speech. Works best with clear facial shots of people speaking, 2-10 second clips, and high-quality audio.
Limitations / Not Good At
Input videos must be 2 to 10 seconds long and 720p to 1080p in resolution. Audio files must be under 5MB and in a supported format. video_url and video_id cannot be used in the same request. Text input requires a voice_id.
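The request rules above can be sketched as a small pre-flight check. This is only an illustration, not part of the node itself: the function name check_lip_sync_request is hypothetical, and the requirements that a video source and an audio or text source be present are assumptions drawn from the description rather than stated rules.

```python
from typing import Optional


def check_lip_sync_request(
    video_url: Optional[str] = None,
    video_id: Optional[str] = None,
    audio_file: Optional[str] = None,
    text: Optional[str] = None,
    voice_id: Optional[str] = None,
) -> list[str]:
    """Return a list of rule violations; an empty list means the request looks valid.

    Hypothetical helper: parameter names follow the fields named on this page.
    """
    problems = []
    # Stated rule: video_url and video_id are mutually exclusive.
    if video_url and video_id:
        problems.append("use either video_url or video_id, not both")
    # Assumption: some video source must be given.
    if not video_url and not video_id:
        problems.append("a video source is required")
    # Stated rule: text input needs a voice_id.
    if text and not voice_id:
        problems.append("text input requires voice_id")
    # Assumption: lip-sync needs either audio or text to work from.
    if not audio_file and not text:
        problems.append("provide audio_file or text")
    return problems
```

Calling it with a video URL and text but no voice_id returns a single message about the missing voice_id.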
Ideal Use Cases
Blog intros with custom voiceovers, animated character lip-syncing, product demos, multilingual video translations.
Input & Output Format
Input:
video_url: A URL to a video file (mp4 or mov), 2-10 seconds long, at 720p-1080p.
audio_file: An audio file (mp3, wav, m4a, aac) under 5MB.
text: Free text for lip-sync (requires voice_id).
Output:
A video file (mp4) with lip-synced content, provided as a URI.
Performance Notes
Designed for short videos (2-10 seconds). Processing speed depends on Replicate's infrastructure. Output video resolution matches the input.
Video Url
String
The URL of the video to generate the lip sync for. Supports .mp4/.mov, ≤100MB, 2–10s, 720p/1080p only, width/height 720–1920px.
Audio Url
String
The URL of the audio to generate the lip sync for. Minimum duration is 2s and maximum duration is 60s. Maximum file size is 5MB.
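The limits listed for the Video Url and Audio Url parameters can be checked locally before submitting. The sketch below assumes the media metadata (size, duration, dimensions) has already been obtained by some other tool; VideoInfo, AudioInfo, video_problems, and audio_problems are hypothetical names, and treating 1MB as 1024 * 1024 bytes is an assumption, since the page does not define the unit.

```python
from dataclasses import dataclass

MB = 1024 * 1024  # assumption: the page does not say whether MB means 10^6 or 2^20 bytes


@dataclass
class VideoInfo:
    extension: str   # e.g. ".mp4"
    size_bytes: int
    duration_s: float
    width: int
    height: int


@dataclass
class AudioInfo:
    size_bytes: int
    duration_s: float


def video_problems(v: VideoInfo) -> list[str]:
    """Check a video against the limits listed for the Video Url parameter."""
    problems = []
    if v.extension.lower() not in (".mp4", ".mov"):
        problems.append("video must be .mp4 or .mov")
    if v.size_bytes > 100 * MB:
        problems.append("video must be 100MB or smaller")
    if not 2 <= v.duration_s <= 10:
        problems.append("video must be 2-10 seconds long")
    # Only the 720-1920px bounds are checked; the 720p/1080p class is not verified here.
    if not (720 <= v.width <= 1920 and 720 <= v.height <= 1920):
        problems.append("video width and height must be 720-1920px")
    return problems


def audio_problems(a: AudioInfo) -> list[str]:
    """Check an audio file against the limits listed for the Audio Url parameter."""
    problems = []
    if a.size_bytes > 5 * MB:
        problems.append("audio must be 5MB or smaller")
    if not 2 <= a.duration_s <= 60:
        problems.append("audio must be 2-60 seconds long")
    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every violated limit at once.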
Output
Inferred
Output
Type
Node
Status
Official
Package
Nodespell AI
Category
AI / Video / Kuaishou