Kling V3 Pro 有声 API Tutorial

UniAll AI SEO/GEO · Kling V3 Pro 有声 · 2026-05-30

What is Kling V3 Pro 有声?

Kling V3 Pro 有声 (有声 means "with audio") is a Pro-tier AI video generation model available through UniAll AI. Its public model id is `kling-v3-pro-audio`. It generates video with an audio track through an asynchronous API, which makes it useful for product reveals, short-form ads, social content, cinematic shots, and storyboards that need motion plus sound.

Supported generation modes:

- **Text to video**: create a video from a prompt.
- **Image to video**: animate a reference image with a prompt.
- **First/last frame video**: guide the clip using a starting frame and an ending frame.

Key parameters include `prompt`, `generation_mode`, `duration`, `aspect_ratio`, and `resolution`. Duration supports **3–15 seconds**. Aspect ratios include **16:9**, **9:16**, and **1:1**. Resolution options include `standard`, `pro`, and `4k`, with audio and silent variants available in the Kling V3 family.
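The parameter limits above can be checked on the client before a job is submitted. The following is a minimal sketch, not part of the UniAll AI API: the function name and messages are illustrative, and it assumes the listed aspect ratios and resolutions are the complete set, which the text above does not guarantee.

```python
# Allowed values as stated in this tutorial. Treating them as exhaustive is
# an assumption; the function name and messages are illustrative only.
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
ALLOWED_RESOLUTIONS = {"standard", "pro", "4k"}
MIN_DURATION, MAX_DURATION = 3, 15


def validate_params(params: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    duration = params.get("duration")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(duration, int) or isinstance(duration, bool):
        problems.append("duration must be an integer")
    elif not MIN_DURATION <= duration <= MAX_DURATION:
        problems.append("duration must be between 3 and 15 seconds")
    # aspect_ratio and resolution are only checked when they are present.
    if "aspect_ratio" in params and params["aspect_ratio"] not in ALLOWED_ASPECT_RATIOS:
        problems.append("aspect_ratio must be one of 16:9, 9:16, 1:1")
    if "resolution" in params and params["resolution"] not in ALLOWED_RESOLUTIONS:
        problems.append("resolution must be one of standard, pro, 4k")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every invalid field in a single pass.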

Who should use it?

Kling V3 Pro 有声 is suitable for developers, creators, agencies, and product teams that need programmatic video generation. Common use cases include:

- Short video material for TikTok, Reels, Shorts, and ad creatives
- E-commerce product videos and cinematic product reveals
- Brand mood films and concept previews
- Storyboard-to-video workflows using first and last frames
- Automated content pipelines that require async API jobs

API endpoint

Use UniAll AI's video generation endpoint:

```http
POST /v1/videos/generations
```

The API is asynchronous, so your application should submit a job, store the returned task information, and poll or handle the resulting status according to your integration flow.
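One way to structure the polling side of this flow is sketched below. This tutorial does not document the status endpoint or the response schema, so the sketch takes a caller-supplied `fetch_status` function instead of calling a specific URL. The `status` key and the values `succeeded` and `failed` are hypothetical placeholders that should be replaced with whatever the API actually returns.

```python
import time
from typing import Callable

# Hypothetical status strings: this tutorial does not list the real values,
# so check the UniAll AI API reference before relying on them.
TERMINAL_STATUSES = {"succeeded", "failed"}


def wait_for_task(
    fetch_status: Callable[[], dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until it reports a terminal status or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_status()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval_s)
```

Passing `fetch_status` and `sleep` as arguments keeps the loop independent of any HTTP client and makes it easy to exercise without network access.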

Example: image to video with audio

```bash
curl -X POST "https://api.uniall.ai/v1/videos/generations" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kling-v3-pro-audio",
    "generation_mode": "image_to_video",
    "prompt": "A cinematic product reveal, soft studio lighting, smooth camera movement, premium commercial style.",
    "image_url": "https://example.com/reference.png",
    "duration": 5,
    "aspect_ratio": "16:9",
    "resolution": "pro",
    "video_count": 1
  }'
```
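For applications that call the API from code rather than the shell, the same request can be assembled with Python's standard library. This is a sketch under stated assumptions: `build_generation_request` is an illustrative helper name, and the block only builds the request object. The send step is left commented out because it needs a valid key, and the response schema is not documented in this tutorial.

```python
import json
import urllib.request

API_URL = "https://api.uniall.ai/v1/videos/generations"


def build_generation_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "model": "kling-v3-pro-audio",
    "generation_mode": "image_to_video",
    "prompt": "A cinematic product reveal, soft studio lighting, "
              "smooth camera movement, premium commercial style.",
    "image_url": "https://example.com/reference.png",
    "duration": 5,
    "aspect_ratio": "16:9",
    "resolution": "pro",
    "video_count": 1,
}
request = build_generation_request("YOUR_API_KEY", payload)

# To actually submit the job (needs a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     task = json.load(response)  # response fields are not documented here
```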

Example: text to video

```json
{
  "model": "kling-v3-pro-audio",
  "generation_mode": "text_to_video",
  "prompt": "A futuristic city street at night, reflective pavement, neon signs, slow cinematic dolly shot, natural ambient audio.",
  "duration": 6,
  "aspect_ratio": "9:16",
  "resolution": "pro",
  "video_count": 1
}
```

Example: first and last frame video

```json
{
  "model": "kling-v3-pro-audio",
  "generation_mode": "first_last_frame",
  "prompt": "Create a smooth transition from the first frame to the final frame with realistic camera motion and coherent lighting.",
  "first_image_url": "https://example.com/first.png",
  "last_image_url": "https://example.com/last.png",
  "duration": 5,
  "aspect_ratio": "16:9",
  "resolution": "pro",
  "video_count": 1
}
```

Required fields by mode

| Mode | Required fields |
|---|---|
| `text_to_video` | `prompt`, `duration` |
| `image_to_video` | `prompt`, `image_url`, `duration` |
| `first_last_frame` | `prompt`, `first_image_url`, `last_image_url`, `duration` |

Although `duration` has a default value in the UI, it is required in API requests. Use an integer from 3 to 15 (seconds).
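The per-mode requirements in the table can be encoded as a lookup and checked before sending a request. This is a minimal sketch: `REQUIRED_FIELDS` mirrors the table, while `missing_fields` is an illustrative helper name rather than anything provided by UniAll AI.

```python
# Required fields per generation mode, copied from the table above.
REQUIRED_FIELDS = {
    "text_to_video": ("prompt", "duration"),
    "image_to_video": ("prompt", "image_url", "duration"),
    "first_last_frame": ("prompt", "first_image_url", "last_image_url", "duration"),
}


def missing_fields(payload: dict) -> list[str]:
    """Return the required fields that are absent or empty for the payload's mode."""
    mode = payload.get("generation_mode")
    if mode not in REQUIRED_FIELDS:
        raise ValueError(f"unknown generation_mode: {mode!r}")
    return [
        name
        for name in REQUIRED_FIELDS[mode]
        if payload.get(name) in (None, "")
    ]
```

For example, an `image_to_video` payload that contains only a prompt would be reported as missing `image_url` and `duration`.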

Pricing and billing

Kling V3 Pro 有声 is billed per second. The listed user price for the Pro Audio variant is **$0.11424 per second**, displayed as approximately **¥0.82 per second**. Actual cost depends on the selected duration and on the resolution/audio variant.

Estimated example for `pro-audio`:

- 5-second video: 5 × $0.11424 = **$0.5712**
- 10-second video: 10 × $0.11424 = **$1.1424**
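The same arithmetic can be sketched as a small helper for budgeting. It covers only the Pro Audio rate quoted above and uses `Decimal` to avoid floating-point rounding. The function name is illustrative, and how `video_count` or other variants affect billing is not stated here, so neither is modeled.

```python
from decimal import Decimal

# Listed user price for the Pro Audio variant, per second of generated video.
PRO_AUDIO_USD_PER_SECOND = Decimal("0.11424")


def estimate_cost_usd(duration_s: int) -> Decimal:
    """Estimate the cost of one Pro Audio clip of the given duration."""
    if not 3 <= duration_s <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    return PRO_AUDIO_USD_PER_SECOND * duration_s
```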

UniAll AI supports refund-on-failure behavior for this model. Failed generations are not retried automatically unless your own application implements a retry strategy.
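Because retries are left to the application, a bounded retry with exponential backoff is one possible strategy. The sketch below is an assumption-heavy illustration: `run_job` stands for whatever function submits a job and waits for its result, and the `status` key with the value `failed` is a hypothetical placeholder, since the response schema is not documented in this tutorial.

```python
import time
from typing import Callable


def generate_with_retry(
    run_job: Callable[[], dict],
    max_attempts: int = 3,
    base_delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Run run_job() up to max_attempts times, backing off between failures."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        task = run_job()
        # "failed" is a hypothetical status value; adapt it to the real API.
        if task.get("status") != "failed":
            return task
        if attempt < max_attempts:
            # Delays grow as base, 2*base, 4*base, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
    return task  # the last failed result, so the caller can inspect it
```

Each attempt submits a new generation job, so capping `max_attempts` keeps a persistently failing prompt from looping indefinitely.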

Prompting tips

For better results, describe the subject, motion, camera language, scene style, and audio atmosphere in one prompt. For example:

> A close-up luxury perfume bottle on black marble, golden rim light, slow rotating camera, shallow depth of field, elegant commercial mood, subtle cinematic audio.

When using image-to-video, keep the prompt aligned with the image content. When using first/last frames, avoid asking for changes that conflict with the provided ending frame.
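In an automated pipeline, the prompt elements listed above can be assembled from separate fields so that each part stays explicit. This helper is purely illustrative and not something the API requires. The field names are assumptions chosen to match the tips.

```python
def build_prompt(
    subject: str,
    motion: str = "",
    camera: str = "",
    style: str = "",
    audio: str = "",
) -> str:
    """Join the non-empty prompt elements into one comma-separated sentence."""
    parts = [p.strip() for p in (subject, motion, camera, style, audio) if p and p.strip()]
    if not parts:
        raise ValueError("at least one prompt element is required")
    return ", ".join(parts) + "."


prompt = build_prompt(
    subject="A close-up luxury perfume bottle on black marble",
    camera="slow rotating camera",
    style="elegant commercial mood",
    audio="subtle cinematic audio",
)
```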


FAQ

What is the public model id for Kling V3 Pro 有声?

The public model id is `kling-v3-pro-audio`. Use it in the `model` field when calling `/v1/videos/generations`.

What generation modes does Kling V3 Pro 有声 support?

It supports `text_to_video`, `image_to_video`, and `first_last_frame` generation. Each mode requires a prompt, and image-based modes require the corresponding image URL fields.

How is Kling V3 Pro 有声 priced?

It is billed per second. The Pro Audio listed user price is $0.11424 per second, shown as about ¥0.82 per second. Total cost depends on duration and selected resolution/audio variant.