After Sora2, a New Cinematic-Grade AI Video Model Emerges: GAGA

Original, by Digital Life Kha'Zix, 2025-10-10 09:31 (Beijing)
---
GAGA-1: Character Performance & Dialogue That Feel Almost Too Good

After the explosive success of Sora2, the AI video space saw a flood of new products.
One standout comes from my friend Mr. Cao Yue — creator of Sand.ai.
In the early hours today, they officially released their new audio-visual synchronized video model: GAGA-1.
In my tests, its character performance is already top-tier.

The name "GAGA-1" oddly reminds me of an old variety show called The Serious Gaga Crew…
Enough talk — here are my test results.
---
A Look Back
Going through my chat history, I realized my first interaction with Mr. Cao Yue was back in April.
That was when he sent me a demo of his early video model — not yet GAGA-1 — with blurry, shaky output.

Fast forward six months — the official model is here.
Honestly, the pace was slower than we expected. One night past 1 a.m., we agreed it might only take two more months…
But six months flew by.
The challenges along the way? Only Cao Yue truly knows.
---
Getting Started with GAGA-1
Launch status:
- No invitation codes
- No queues
- Free to use — for now
Website: http://gaga.art

Main functions:
- Gaga Actor — latest model, audio-visual synchronized
- Gaga Avatar — older model
- Library
For now, ignore Gaga Avatar and focus on Gaga Actor.
---
Key Features of GAGA-1
- Biggest strength: Film-like character performance with dialogue
- Aspect ratio: Fixed 16:9
- Lengths: 5s or 10s videos
- Inputs required: Image + text prompt (the image can be made with the built-in image generation tool)

Usage tips:
- Dialogue ≤ 5 Chinese characters → use 5s
- Dialogue > 5 Chinese characters → use 10s
- Avoid dialogue > 20 Chinese characters, since the delivery may feel awkward
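
The rule of thumb above can be sketched as a small Python helper. This is an illustration only: the function and constant names are hypothetical, nothing in this article suggests GAGA-1 exposes such a helper, and counting only CJK ideographs (U+4E00 to U+9FFF) as "characters" is an assumption.

```python
import re

# Thresholds come from the usage tips above; all names here are hypothetical.
SHORT_CLIP_MAX_CHARS = 5   # up to this many characters fits a 5s clip
AWKWARD_ABOVE_CHARS = 20   # beyond this, delivery may feel awkward

# Assumption: "characters" means CJK Unified Ideographs; punctuation is ignored.
_CJK = re.compile(r"[\u4e00-\u9fff]")


def count_chinese_chars(dialogue: str) -> int:
    """Count the Chinese characters in a line of dialogue."""
    return len(_CJK.findall(dialogue))


def suggest_clip(dialogue: str) -> tuple[int, bool]:
    """Return (clip length in seconds, whether the line is likely too long)."""
    n = count_chinese_chars(dialogue)
    seconds = 5 if n <= SHORT_CLIP_MAX_CHARS else 10
    return seconds, n > AWKWARD_ABOVE_CHARS
```

For example, `suggest_clip("你好")` suggests a 5s clip, while a line of more than 20 characters gets a 10s suggestion together with a warning flag.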
---
Image Generation

The built-in image generation tool is connected to Banana.

Banana can edit images, but I prefer the aesthetics of Seedream 4.0, especially for Asian portraits.
Example: My AI model Ran Xia.

---
First Prompt Test
> "The girl smiles and says: 'You don’t actually think I’m the kind of girl who cries to lose, do you?'"
After 3–4 minutes, a 10s video was ready.
Result:
- Wind in the hair, natural facial expressions, even the teeth: among the best I’ve seen.
- If not for Sora2, GAGA-1’s realism might be my #1 right now.
---
Second Prompt Test
> "The girl sighed, lowered her head and bit her lip, then raised her head a moment later and said in a firm voice: 'I've decided — starting today, neither of us owes the other anything.'"
Two renditions worth noting:
- Resigned — disappointment and sadness, acceptance of fate.
- Frustrated — slight anger, “I’m disappointed in you” vibe.
Observation: The acting was stronger than the line delivery; the voice tone felt somewhat flat.
---
Third Prompt: Emotion Switch & Pauses
> "The girl cried as she said: 'You… don’t love me anymore?' After pausing for a moment, she shouted hysterically with intense emotion: 'I understand! I will never come to you again!'"
- I hit the maximum concurrency limit: 5 generations at once
- Success rate: ~40%
- Two decent outputs with different levels of hysteria; one had background music added automatically
- Issue: Long prompts can cause the dialogue to be truncated
---
Additional Experiments
Half-body shot with pride
> "The woman maintains her posture, with only her facial expression changing. She says proudly: 'Could it be… that this name is because…' She pauses intentionally, emphasizing: 'Me.'"
- The model recognized that the character is a foreigner
- It had her deliver the line in broken Mandarin
- The performance matched the proud delivery
---
Two-person scene
> "The man looks at the woman helplessly: 'Won’t you let me speak?' The woman nods playfully: 'Go ahead.'"
Convincing expressions & line delivery on both actors.
---
Singing capability
Yes, GAGA-1 can sing, though the pitch may be “abstract.”
Example (Ke Jie, Five-in-a-Row theme):
> "The man looks at the chessboard and sings: 'Traditional five-in-a-row is just connecting five pieces into a line — so boring~ so dull~'"
---
Strengths & Limitations
Strengths:
- High-quality facial acting for short scripts
- Works well for Chinese & English prompts
- Handles emotion shifts decently
Limitations:
- Large, complex motions → visual distortion
- Japanese dialogue feels awkward
- No custom audio upload yet
- No fixed voice ID per character (voice varies each generation)
---
Upcoming Features
A fixed custom voice per character is nearly ready and is awaiting engineering completion.
---
Pricing
- Current: Free for all users
- Future: Will cost less than Sora2 and Veo3
- No set timeline for ending free period
---
Monetization Platforms to Complement GAGA-1
Tools like AiToEarn (official site) can integrate AI video generation with:
- Multi-platform publishing (Douyin, Kwai, Bilibili, Xiaohongshu, YouTube)
- Analytics
- Global AI model ranking
This helps creators get high-quality outputs to audiences, and to revenue streams, faster.
---
Final Thoughts
Whether for short dramas, game NPC dialogue, or visualizing novel characters, GAGA-1 offers a new, low-cost creative possibility.
It’s not perfect, but it lowers the barrier for more people to join video content creation, and it’s a domestic (Chinese) AI model.
Go explore — and I look forward to your creative, entertaining works.
---
If you found this useful:
- Like 👍
- Click “read”
- Share
- Star ⭐ my profile for updates