Veo 3.1

Official

Improved video generation model with higher fidelity, context-aware audio, and support for image references and frame interpolation.

Nodespell AI

Model Overview

A text-to-video generator that creates high-fidelity videos with context-aware audio. It builds on Veo 3 with improved quality.

Best At

Generating high-quality videos from text descriptions.
Maintaining subject consistency when using reference images.
Producing smooth transitions between a starting image and a last frame via interpolation.
Creating natural audio that matches the generated video content.

Limitations / Not Good At

Reference images are limited to 16:9 aspect ratio and 8-second duration.
The last frame input is ignored when reference images are provided.
Output is always a video file; no separate image or audio output.
Input images work best at specific sizes: 1280x720 for 16:9 and 720x1280 for 9:16.

Ideal Use Cases

Marketing and product demos with consistent branding.
Social media video creation (short vertical or horizontal videos).
Smooth transitions from images to video content.
Videos with contextually relevant background vocals or sounds.

Input & Output Format

Input: Text prompt (required) combined with optional parameters: aspect ratio, duration, starting image, last frame, reference images, negative prompt, resolution, and audio generation flag.
Output: URI pointing to the generated MP4 video file.
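To make the input shape concrete, here is a minimal sketch of a request object with one required prompt and the optional parameters listed above. The class name `VideoRequest`, the method `to_payload`, and the snake_case field names are hypothetical; they mirror the labels on this page and are not a confirmed Nodespell or Veo API.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class VideoRequest:
    """Hypothetical container for the inputs described above."""
    prompt: str                                   # the only required input
    aspect_ratio: Optional[str] = None
    duration: Optional[int] = None                # seconds
    image: Optional[str] = None                   # starting image
    last_frame: Optional[str] = None
    reference_images: Optional[List[str]] = None
    negative_prompt: Optional[str] = None
    resolution: Optional[str] = None
    generate_audio: Optional[bool] = None

    def to_payload(self) -> dict:
        """Return only the fields that were explicitly set."""
        if not self.prompt.strip():
            raise ValueError("prompt is required")
        # False is a meaningful value for generate_audio, so filter on None only.
        return {k: v for k, v in asdict(self).items() if v is not None}


request = VideoRequest(
    prompt="A paper boat drifting down a rainy street",
    aspect_ratio="9:16",
    generate_audio=False,
)
print(request.to_payload())
```

Leaving unset options out of the payload lets the node fall back to its own defaults instead of receiving explicit empty values.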

Performance Notes

High quality video generation requires substantial compute resources.
Generation may take longer than for text or image models because video synthesis is more compute-intensive.

Inputs (4)

Prompt

String

Text prompt for video generation

Multi Input, Min: 0, Max: 100

Image

String

Input image to start generating from. Ideal images are 16:9 or 9:16 and 1280x720 or 720x1280, depending on the aspect ratio you choose.

Min: 0, Max: 100
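The pairing of aspect ratio and ideal image size can be written as a small lookup. This is only a sketch of the guidance above; the names `IDEAL_SIZES`, `ideal_size`, and `matches_ideal` are illustrative, and the page describes these sizes as ideal rather than strictly required.

```python
from typing import Tuple

# Ideal starting-image sizes (width, height) per aspect ratio, as stated above.
IDEAL_SIZES = {
    "16:9": (1280, 720),
    "9:16": (720, 1280),
}


def ideal_size(aspect_ratio: str) -> Tuple[int, int]:
    """Return the ideal (width, height) for a supported aspect ratio."""
    try:
        return IDEAL_SIZES[aspect_ratio]
    except KeyError:
        raise ValueError(f"no ideal size listed for aspect ratio {aspect_ratio!r}")


def matches_ideal(width: int, height: int, aspect_ratio: str) -> bool:
    """Check whether an image already has the ideal size for the chosen ratio."""
    return (width, height) == ideal_size(aspect_ratio)


print(ideal_size("9:16"))
print(matches_ideal(1280, 720, "16:9"))
```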

Last Frame

String

Ending image for interpolation. When provided with an input image, creates a transition between the two images.

Min: 0, Max: 100

Reference Images

String

1 to 3 reference images for subject-consistent generation (reference-to-video, or R2V). Reference images only work with 16:9 aspect ratio and 8-second duration. Last frame is ignored if reference images are provided.

Multi Input, Min: 0, Max: 100
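The way the four inputs interact can be summarized in a short pre-flight check. The sketch below is an assumption-laden illustration, not the node's actual logic: the function name `resolve_mode` and the mode labels are invented here, and it assumes reference images take precedence over a starting image, which this page does not state. The case of a last frame without a starting image is also not described, so the sketch simply falls through to text-to-video.

```python
from typing import Optional, Sequence


def resolve_mode(
    image: Optional[str] = None,
    last_frame: Optional[str] = None,
    reference_images: Optional[Sequence[str]] = None,
    aspect_ratio: str = "16:9",
    duration: int = 8,
) -> str:
    """Return a label for the generation mode implied by the inputs."""
    refs = list(reference_images or [])
    if refs:
        # Documented constraints for reference-to-video (R2V).
        if len(refs) > 3:
            raise ValueError("at most 3 reference images are supported")
        if aspect_ratio != "16:9" or duration != 8:
            raise ValueError("reference images require 16:9 and an 8-second duration")
        # Any last_frame value is ignored in this mode, per the description above.
        return "reference-to-video"
    if image and last_frame:
        return "interpolation"       # transition from image to last frame
    if image:
        return "image-to-video"
    return "text-to-video"


print(resolve_mode(image="start.png", last_frame="end.png"))
print(resolve_mode(reference_images=["logo.png"], last_frame="end.png"))
```

Running a check like this before submitting a job surfaces the 16:9 and 8-second constraints early instead of after a failed or unexpected generation.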

Parameters (7)

Seed

Number

Random seed. Omit it to get a different result on each generation.

Default: -1

Prompt

String

Text prompt for video generation

Default:

Duration

Number

Video duration in seconds

Default: 8

Resolution

String

Resolution of the generated video

Default: 720p

Aspect Ratio

String

Video aspect ratio

Default: 16:9

Generate Audio

Boolean

Generate audio with the video

Default: true

Negative Prompt

String

Description of what to exclude from the generated video

Default:
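The defaults listed above can be collected in one place and merged with per-run overrides. This is a sketch under stated assumptions: the key names and the helper `with_defaults` are hypothetical, and treating the default seed of -1 as "no fixed seed" is an inference from the seed description, not something this page confirms.

```python
from typing import Optional

# Defaults copied from the parameter list above; key names are illustrative.
DEFAULTS = {
    "seed": -1,
    "prompt": "",
    "duration": 8,           # seconds
    "resolution": "720p",
    "aspect_ratio": "16:9",
    "generate_audio": True,
    "negative_prompt": "",
}


def with_defaults(overrides: Optional[dict] = None) -> dict:
    """Merge overrides onto the documented defaults."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    params = {**DEFAULTS, **overrides}
    # Assumption: a missing or negative seed means "pick a random seed",
    # so the key is dropped rather than sent.
    if params["seed"] is None or params["seed"] < 0:
        del params["seed"]
    return params


print(with_defaults({"aspect_ratio": "9:16"}))
print(with_defaults({"seed": 42})["seed"])
```

Rejecting unknown keys catches typos such as `aspectratio` before they silently fall back to a default.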

Outputs (1)

Output

Inferred

Output

Type

Node

Status

Official

Package

Nodespell AI

Keywords

Video Generation, Aspect Control, Resolution Control, Length Control
