What is Kling V3 标准有声?
**Kling V3 标准有声** (Standard with Audio) is a standard-tier AI video generation model for creating short videos with audio output. On UniAll AI, the public model id is **`kling-v3-std-audio`**.
It supports three generation modes:
- **Text to video**: generate a video from a prompt
- **Image to video**: animate a reference image with a prompt
- **First/last frame video**: generate motion between a starting frame and ending frame
The model is designed for short-form creative production, product clips, ad concepts, social media assets, and automated video workflows.
Key capabilities
- Model id: **`kling-v3-std-audio`**
- Type: video generation
- Output: video with audio
- API style: asynchronous task generation
- Billing unit: per second
- Duration: 3–15 seconds
- Aspect ratios: `16:9`, `9:16`, `1:1`
- Resolution options: `standard`, `pro`, `4k`
- Output count: 1 video per generation
- Supported image inputs: PNG, JPEG, WebP
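The limits listed above can be checked on the client before a task is submitted. The following is a minimal sketch, not part of any UniAll AI SDK: the function name `check_generation_params` is hypothetical, and whether the API accepts fractional durations is not stated here, so the check only tests the 3–15 second range.

```python
# Limits copied from the capability list above.
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
ALLOWED_RESOLUTIONS = {"standard", "pro", "4k"}
MIN_DURATION_S, MAX_DURATION_S = 3, 15


def check_generation_params(duration, aspect_ratio, resolution):
    """Return a list of problems; an empty list means the values fit the listed limits.

    Hypothetical helper: it mirrors the documented limits only and does not
    call the API.
    """
    problems = []
    if not (MIN_DURATION_S <= duration <= MAX_DURATION_S):
        problems.append(f"duration {duration} is outside {MIN_DURATION_S}-{MAX_DURATION_S} seconds")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append(f"aspect_ratio {aspect_ratio!r} is not one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"resolution {resolution!r} is not one of {sorted(ALLOWED_RESOLUTIONS)}")
    return problems
```

Returning a list instead of raising on the first problem lets a caller report every issue in one pass.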
Who should use it?
Kling V3 标准有声 is useful for teams that need video generation with sound without building a full media pipeline from scratch:
- **Developers** building AI video apps or internal tools
- **Marketing teams** creating ad previews and product videos
- **E-commerce teams** turning product images into short clips
- **Content teams** generating 9:16 social media videos
- **Agencies and platforms** integrating video generation into customer-facing workflows
API usage
Use the UniAll AI video generation endpoint:
```http
POST /v1/videos/generations
```
The request is asynchronous. Submit a generation task, then poll or handle the returned task result according to your UniAll AI integration flow.
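A polling loop for an asynchronous task might look like the sketch below. The status lookup is passed in as a callable because the task-status endpoint and response shape are not described here; the `status` key and the values `"succeeded"` and `"failed"` are assumptions for illustration, so adjust them to what your integration actually returns.

```python
import time


def wait_for_task(fetch_status, task_id, interval_s=5.0, timeout_s=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until the task reaches a terminal state or the timeout expires.

    fetch_status: caller-supplied function taking a task id and returning a
    dict. The "status" key and its terminal values are assumed, not taken
    from the UniAll AI documentation.
    """
    terminal_states = ("succeeded", "failed")  # hypothetical status values
    deadline = clock() + timeout_s
    while True:
        task = fetch_status(task_id)
        if task.get("status") in terminal_states:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s} seconds")
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop easy to exercise in tests without real waiting.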
Example: image to video
```json
{
  "model": "kling-v3-std-audio",
  "generation_mode": "image_to_video",
  "prompt": "A cinematic product reveal, soft studio lighting, smooth camera movement.",
  "image_url": "https://example.com/reference.png",
  "duration": 5,
  "aspect_ratio": "16:9",
  "resolution": "standard",
  "video_count": 1
}
```
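To send a payload like this from Python, the request could be assembled as in the sketch below, using only the standard library. The base URL `https://api.example.com` and the `Authorization: Bearer` header are placeholders and assumptions, not confirmed UniAll AI details; only the path `/v1/videos/generations` and the JSON fields come from this page. The function builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder, replace with your real API host


def build_generation_request(payload, api_key, base_url=BASE_URL):
    """Build (but do not send) a POST request for the video generation endpoint.

    The bearer-token header is an assumed authentication scheme.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/v1/videos/generations",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumption
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would submit the task; the response format is not covered here.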
Mode-specific required fields
| Mode | Required fields |
|---|---|
| `text_to_video` | `prompt`, `duration` |
| `image_to_video` | `prompt`, `image_url`, `duration` |
| `first_last_frame` | `prompt`, `first_image_url`, `last_image_url`, `duration` |
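The required fields per mode can be encoded as a lookup and checked before submission. This is a minimal sketch: `missing_fields` is a hypothetical helper, and treating empty strings as missing is an assumption about what the API would reject.

```python
# Required fields per generation mode, as listed in the table above.
REQUIRED_FIELDS = {
    "text_to_video": ("prompt", "duration"),
    "image_to_video": ("prompt", "image_url", "duration"),
    "first_last_frame": ("prompt", "first_image_url", "last_image_url", "duration"),
}


def missing_fields(payload):
    """Return the required fields that are absent or empty for the payload's mode."""
    mode = payload.get("generation_mode")
    if mode not in REQUIRED_FIELDS:
        raise ValueError(f"unknown generation_mode: {mode!r}")
    return [name for name in REQUIRED_FIELDS[mode] if payload.get(name) in (None, "")]
```

For example, an `image_to_video` payload that has a prompt and a duration but no image would yield `["image_url"]`.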
Pricing angle
Kling V3 标准有声 is billed per generated second. The listed user price for **standard audio** is **$0.08568 per second**. In the current UniAll AI pricing profile, this is shown as approximately **¥0.62 per second**.
Other Kling V3 variants may differ by tier and audio support, including silent, pro, and 4K options. Pricing can change, so check the live UniAll AI pricing page or API billing response before production use.
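Because billing is per generated second, a rough cost estimate is the duration multiplied by the per-second rate. The sketch below hard-codes the listed standard-audio price of $0.08568; that figure may be out of date, and the sketch assumes no rounding rules, minimum charges, or discounts, none of which are described here.

```python
from decimal import Decimal

# Listed standard-audio price at the time of writing; verify against live pricing.
PRICE_PER_SECOND_USD = Decimal("0.08568")


def estimate_cost_usd(duration_seconds, clips=1):
    """Estimate cost as price-per-second x duration x number of clips.

    Assumes straight per-second billing with no rounding or minimum charge.
    """
    return PRICE_PER_SECOND_USD * duration_seconds * clips
```

Under these assumptions, a 5-second clip comes to about $0.43 and a 15-second clip to about $1.29.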
Best practices
- Keep prompts specific: describe subject, camera motion, lighting, style, and scene changes.
- Use `9:16` for short-video platforms and `16:9` for ads, landing pages, and presentations.
- For product clips, start with image-to-video using a clean product reference image.
- For controlled transitions, use first/last-frame mode with consistent composition.
- Set duration intentionally; billing is per second, so shorter clips reduce cost.
FAQ
What can Kling V3 标准有声 do?
It generates short AI videos with audio from text prompts, reference images, or first and last frame images. It is suitable for short-form content, ads, product demos, and automated video workflows.
What is the API model id for Kling V3 标准有声?
The public model id is `kling-v3-std-audio`. Use it in the `model` field when calling `POST /v1/videos/generations` on UniAll AI.
How is Kling V3 标准有声 priced?
It is billed per generated second. The listed user price for standard audio is $0.08568 per second, shown as about ¥0.62 per second in the current UniAll AI pricing profile.