Kling V3 标准有声 API | kling-v3-std-audio

UniAll AI SEO/GEO · Kling V3 标准有声 · 2026-05-30

What is Kling V3 标准有声?

**Kling V3 标准有声** (Kling V3 Standard with Audio) is a standard-tier AI video generation model that produces short videos with an audio track. On UniAll AI, the public model id is **`kling-v3-std-audio`**.

It supports three generation modes:

**Text to video**: generate a video from a prompt
**Image to video**: animate a reference image with a prompt
**First/last frame video**: generate motion between a starting frame and ending frame

The model is designed for short-form creative production, product clips, ad concepts, social media assets, and automated video workflows.

Key capabilities

Model id: **`kling-v3-std-audio`**
Type: video generation
Output: video with audio
API style: asynchronous task generation
Billing unit: per second
Duration: 3–15 seconds
Aspect ratios: `16:9`, `9:16`, `1:1`
Resolution options: `standard`, `pro`, `4k`
Output count: 1 video per generation
Supported image inputs: PNG, JPEG, WebP
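
The limits listed above can be checked on the client before a task is submitted. The following is a minimal sketch, not part of any UniAll AI SDK: the function name `check_limits` is hypothetical, and the field names are taken from the request example on this page. Whether the API accepts fractional durations is not stated here, so the check only tests the 3–15 second range.

```python
# Limits copied from the capability list on this page.
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
ALLOWED_RESOLUTIONS = {"standard", "pro", "4k"}
MIN_DURATION, MAX_DURATION = 3, 15


def check_limits(duration, aspect_ratio, resolution, video_count=1):
    """Return a list of problems; an empty list means the values fit the limits."""
    problems = []
    if not MIN_DURATION <= duration <= MAX_DURATION:
        problems.append(
            f"duration must be between {MIN_DURATION} and {MAX_DURATION} seconds, got {duration}"
        )
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append(f"unsupported aspect_ratio: {aspect_ratio!r}")
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution!r}")
    if video_count != 1:
        problems.append("video_count must be 1 (one video per generation)")
    return problems


if __name__ == "__main__":
    print(check_limits(5, "16:9", "standard"))   # no problems expected
    print(check_limits(20, "4:3", "standard"))   # duration and aspect ratio flagged
```

Returning a list rather than raising on the first problem lets a UI or log show every invalid value at once.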

Who should use it?

Kling V3 标准有声 is useful for teams that need video generation with sound without building a full media pipeline from scratch:

**Developers** building AI video apps or internal tools
**Marketing teams** creating ad previews and product videos
**E-commerce teams** turning product images into short clips
**Content teams** generating 9:16 social media videos
**Agencies and platforms** integrating video generation into customer-facing workflows

API usage

Use the UniAll AI video generation endpoint:

```http
POST /v1/videos/generations
```

The request is asynchronous. Submit a generation task, then poll or handle the returned task result according to your UniAll AI integration flow.
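
The submit-then-poll flow described above can be sketched with a small polling helper. This page does not document the task status endpoint or its response shape, so everything specific here is an assumption: `wait_for_task`, the `status` key, and the `succeeded` / `failed` values are hypothetical placeholders. The caller supplies `fetch_status`, a function that performs the actual status lookup for the integration in use.

```python
import time

# Placeholder terminal states; replace with the values your integration returns.
TERMINAL_STATUSES = {"succeeded", "failed"}


def wait_for_task(fetch_status, interval_s=5.0, timeout_s=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() repeatedly until the task reaches a terminal state.

    fetch_status: caller-supplied function returning a dict with a "status" key.
    Raises TimeoutError if no terminal state is seen within timeout_s seconds.
    """
    deadline = clock() + timeout_s
    while True:
        task = fetch_status()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if clock() >= deadline:
            raise TimeoutError("video task did not finish before the timeout")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated status responses standing in for real API calls.
    fake = iter([{"status": "processing"}, {"status": "succeeded"}])
    print(wait_for_task(lambda: next(fake), interval_s=0))
```

Passing `sleep` and `clock` as parameters keeps the helper testable without real waiting. If your integration delivers results by callback instead, polling is not needed.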

Example: image to video

```json
{
  "model": "kling-v3-std-audio",
  "generation_mode": "image_to_video",
  "prompt": "A cinematic product reveal, soft studio lighting, smooth camera movement.",
  "image_url": "https://example.com/reference.png",
  "duration": 5,
  "aspect_ratio": "16:9",
  "resolution": "standard",
  "video_count": 1
}
```

Mode-specific required fields

| Mode | Required fields |
|---|---|
| `text_to_video` | `prompt`, `duration` |
| `image_to_video` | `prompt`, `image_url`, `duration` |
| `first_last_frame` | `prompt`, `first_image_url`, `last_image_url`, `duration` |
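
The per-mode requirements can be encoded as a lookup table and checked before a request is sent. This is an illustrative sketch only: `missing_fields` is a hypothetical helper, and it assumes the mode is passed in the `generation_mode` field, as in the image-to-video example on this page.

```python
# Required fields per generation mode, as listed on this page.
REQUIRED_FIELDS = {
    "text_to_video": ("prompt", "duration"),
    "image_to_video": ("prompt", "image_url", "duration"),
    "first_last_frame": ("prompt", "first_image_url", "last_image_url", "duration"),
}


def missing_fields(payload):
    """Return the required fields that are absent or empty for the payload's mode."""
    mode = payload.get("generation_mode")
    if mode not in REQUIRED_FIELDS:
        raise ValueError(f"unknown generation_mode: {mode!r}")
    return [name for name in REQUIRED_FIELDS[mode]
            if payload.get(name) in (None, "")]


if __name__ == "__main__":
    payload = {
        "model": "kling-v3-std-audio",
        "generation_mode": "first_last_frame",
        "prompt": "Slow push-in on the product.",
        "first_image_url": "https://example.com/first.png",
        "duration": 5,
    }
    print(missing_fields(payload))  # last_image_url is not set in this payload
```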

Pricing

Kling V3 标准有声 is billed per generated second. The listed user price for **standard audio** is **$0.08568 per second**. In the current UniAll AI pricing profile, this is shown as approximately **¥0.62 per second**.

Other Kling V3 variants may differ by tier and audio support, including silent, pro, and 4K options. Pricing can change, so check the live UniAll AI pricing page or API billing response before production use.
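
As a rough budgeting aid, the per-second price can be turned into a per-clip estimate. The sketch below assumes the charge is simply duration multiplied by the listed rate, with no rounding, minimum charge, or other fees; the page does not confirm this, and the rate itself may change. `estimate_cost_usd` is a hypothetical helper name.

```python
from decimal import Decimal

# Listed standard-audio price on this page; check live pricing before relying on it.
PRICE_PER_SECOND_USD = Decimal("0.08568")


def estimate_cost_usd(duration_seconds, clips=1):
    """Estimate cost as rate x seconds x clips (assumes plain per-second billing)."""
    return PRICE_PER_SECOND_USD * duration_seconds * clips


if __name__ == "__main__":
    for seconds in (3, 5, 15):
        print(f"{seconds:>2} s clip: ${estimate_cost_usd(seconds)}")
```

Under that assumption, a 5-second clip comes to about $0.43 and a 15-second clip to about $1.29. `Decimal` is used so that the estimate does not pick up binary floating-point rounding noise.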

Best practices

Keep prompts specific: describe subject, camera motion, lighting, style, and scene changes.
Use `9:16` for short-video platforms and `16:9` for ads, landing pages, and presentations.
For product clips, start with image-to-video using a clean product reference image.
For controlled transitions, use first/last-frame mode with consistent composition.
Set duration intentionally; billing is per second, so shorter clips reduce cost.


FAQ

What can Kling V3 标准有声 do?

It generates short AI videos with audio from text prompts, reference images, or first and last frame images. It is suitable for short-form content, ads, product demos, and automated video workflows.

What is the API model id for Kling V3 标准有声?

The public model id is `kling-v3-std-audio`. Use it in the `model` field when calling `POST /v1/videos/generations` on UniAll AI.

How is Kling V3 标准有声 priced?

It is billed per generated second. The listed user price for standard audio is $0.08568 per second, shown as about ¥0.62 per second in the current UniAll AI pricing profile.