After Sora2, a New Cinematic-Grade AI Video Model Emerges: GAGA

Honghao Wang

10 Oct 2025 — 4 min read

Original Digital Life

Kha'Zix — 2025-10-10 09:31 Beijing

---

GAGA-1: Character Performance & Dialogue That Feel Almost Too Good

After the explosive success of Sora2, the AI video space saw a flood of new products.

One standout comes from my friend Mr. Cao Yue — creator of Sand.ai.

In the early hours today, they officially released their new audio-visual synchronized video model: GAGA-1.

From my tests, its character performance is already top-tier.

The name "GAGA-1" oddly reminds me of an old variety show called The Serious Gaga Crew…

Enough talk — here are my test results.

---

A Look Back

Going through my chat history, I realized my first interaction with Mr. Cao Yue was back in April.

That was when he sent me a demo of his early video model — not yet GAGA-1 — with blurry, shaky output.

Fast forward six months — the official model is here.

Honestly, the pace was slower than we expected. One night past 1 a.m., we agreed it might only take two more months…

But six months flew by.

The challenges along the way? Only Cao Yue truly knows.

---

Getting Started with GAGA-1

Launch status:

No invitation codes
No queues
Free to use — for now

Website: http://gaga.art

Main functions:

Gaga Actor — latest model, audio-visual synchronized
Gaga Avatar — older model
Library

For now, ignore Gaga Avatar and focus on Gaga Actor.

---

Key Features of GAGA-1

Biggest strength: Film-like character performance with dialogue
Aspect ratio: Fixed 16:9
Lengths: 5s or 10s videos
Inputs required: Image + text prompt (the image can be made with the built-in image generation tool)

Usage tips:

Dialogue ≤ 5 Chinese characters → use 5s
Dialogue > 5 characters → use 10s
Avoid > 20 characters — delivery may feel awkward
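
The usage tips above can be sketched as a small helper. This is a minimal sketch and not part of GAGA-1 or any official tool: the function names `count_spoken_chars` and `pick_duration` are hypothetical, and counting only letters and digits (skipping punctuation and spaces) is an assumption about what "character" means in these tips.

```python
from typing import Tuple

SHORT_LIMIT = 5    # tip: dialogue of 5 characters or fewer fits a 5s clip
AWKWARD_LIMIT = 20  # tip: beyond 20 characters, delivery may feel awkward


def count_spoken_chars(dialogue: str) -> int:
    """Count letters and digits only (assumption: punctuation is not spoken)."""
    return sum(1 for ch in dialogue if ch.isalnum())


def pick_duration(dialogue: str) -> Tuple[int, bool]:
    """Return (clip length in seconds, whether the line is likely too long)."""
    n = count_spoken_chars(dialogue)
    duration = 5 if n <= SHORT_LIMIT else 10
    too_long = n > AWKWARD_LIMIT
    return duration, too_long


if __name__ == "__main__":
    print(pick_duration("你不爱我了？"))  # (5, False)
    print(pick_duration("我决定了，从今天起，我们互不相欠。"))  # (10, False)
```

If the second value comes back `True`, the safer move under these tips is to shorten the line rather than hope a 10s clip will carry it.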

---

Image Generation

The built-in image generator is connected to Banana.

Banana can edit images, but I prefer Seedream 4.0's aesthetics, especially for Asian portraits.

Example: My AI model Ran Xia.

---

First Prompt Test

> "The girl smiles and says: 'You don’t actually think I’m the kind of girl who cries when she loses, do you?'"

After 3–4 minutes, a 10s video was ready.

Result:

Wind in the hair, natural facial expressions, even well-rendered teeth: among the best I’ve seen.
If not for Sora2, GAGA-1’s realism might be my #1 right now.

---

Second Prompt Test

> "The girl sighed, lowered her head and bit her lip, then raised her head a moment later and said in a firm voice: 'I've decided — starting today, neither of us owes the other anything.'"

Two renditions worth noting:

Resigned — disappointment and sadness, acceptance of fate.
Frustrated — slight anger, “I’m disappointed in you” vibe.

Observation: The acting has more depth than the line delivery; the voice tone felt somewhat flat.

---

Third Prompt: Emotion Switch & Pauses

> "The girl cried as she said: 'You… don’t love me anymore?' After pausing for a moment, she shouted hysterically with intense emotion: 'I understand! I will never come to you again!'"

Hit max concurrency limit: 5 generations at once
Success rate: ~40%
Two decent outputs: different levels of hysteria; one had background music added automatically
Issue: Long prompts can cause truncated dialogue
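
To put the ~40% success rate and the limit of 5 concurrent generations in perspective, here is a rough back-of-the-envelope calculation. It assumes each generation succeeds independently with the same probability, which is a simplification; the 40% figure comes from this one test and is not an official number, and the function names are hypothetical.

```python
def expected_attempts(success_rate: float) -> float:
    """Average number of generations needed to get one usable clip."""
    return 1.0 / success_rate


def expected_successes(success_rate: float, batch_size: int) -> float:
    """Average number of usable clips in one batch."""
    return success_rate * batch_size


def prob_at_least_one(success_rate: float, batch_size: int) -> float:
    """Chance that a batch contains at least one usable clip."""
    return 1.0 - (1.0 - success_rate) ** batch_size


if __name__ == "__main__":
    rate, batch = 0.40, 5  # observed rate and concurrency limit from this test
    print(f"{expected_attempts(rate):.1f}")          # 2.5
    print(f"{expected_successes(rate, batch):.1f}")  # 2.0
    print(f"{prob_at_least_one(rate, batch):.3f}")   # 0.922
```

Under these assumptions, a full batch of five yields about two usable clips on average, which lines up with the two decent outputs above, and roughly 92% of full batches should contain at least one.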

---

Additional Experiments

Half-body shot with pride

> "The woman maintains her posture, with only her facial expression changing. She says proudly: 'Could it be… that this name is because…' She pauses intentionally, emphasizing: 'Me.'"

Recognized that the character is a foreigner
Delivered the line in broken, accented Mandarin
The performance matched the proud delivery

---

Two-person scene

> "The man looks at the woman helplessly: 'Won’t you let me speak?' The woman nods playfully: 'Go ahead.'"

Convincing expressions & line delivery on both actors.

---

Singing capability

Yes, GAGA-1 can sing — pitch may be “abstract.”

Example (Ke Jie, Five-in-a-Row theme):

> "The man looks at the chessboard and sings: 'Traditional five-in-a-row is just connecting five pieces into a line — so boring~ so dull~'"

---

Strengths & Limitations

Strengths:

High-quality facial acting for short scripts
Works well for Chinese & English prompts
Handles emotion shifts decently

Limitations:

Large, complex motions → visual distortion
Japanese dialogue feels awkward
No custom audio upload yet
No fixed voice ID per character (voice varies each generation)

---

Upcoming Features

A fixed custom voice per character is nearly ready; it is awaiting engineering completion.

---

Pricing

Current: Free for all users
Future: Will cost less than Sora2 and Veo3
No set timeline for ending free period

---

Monetization Platforms to Complement GAGA-1

Tools like AiToEarn (official site) can integrate AI video generation with:

Multi-platform publishing (Douyin, Kwai, Bilibili, Xiaohongshu, YouTube)
Analytics
Global AI model ranking

This helps creators get high-quality outputs to audiences, and to revenue streams, faster.

---

Final Thoughts

Whether for short dramas, game NPC dialogue, or visualizing novel characters, GAGA-1 offers a new, low-cost creative possibility.

It’s not perfect, but it lowers barriers for more people to join video content creation, and it’s a domestic (Chinese-developed) AI model.

Go explore — and I look forward to your creative, entertaining works.

