After Sora2, a New Cinematic-Grade AI Video Model Emerges: GAGA

Original Digital Life

Kha'Zix — 2025-10-10 09:31 Beijing

---

GAGA-1: Character Performance & Dialogue That Feel Almost Too Good

After the explosive success of Sora2, the AI video space saw a flood of new products.

One standout comes from my friend Mr. Cao Yue — creator of Sand.ai.

In the early hours today, they officially released their new audio-visual synchronized video model: GAGA-1.

From my tests, its character performance is already top-tier.

The name "GAGA-1" oddly reminds me of an old variety show called The Serious Gaga Crew.

Enough talk — here are my test results.

---

A Look Back

Going through my chat history, I realized my first interaction with Mr. Cao Yue was back in April.

That was when he sent me a demo of his early video model — not yet GAGA-1 — with blurry, shaky output.

Fast forward six months — the official model is here.

Honestly, the pace was slower than we expected. One night past 1 a.m., we agreed it might only take two more months…

But six months flew by.

The challenges along the way? Only Cao Yue truly knows.

---

Getting Started with GAGA-1

Launch status:

  • No invitation codes
  • No queues
  • Free to use — for now

Website: http://gaga.art

Main functions:

  • Gaga Actor — latest model, audio-visual synchronized
  • Gaga Avatar — older model
  • Library

For now, ignore Gaga Avatar and focus on Gaga Actor.

---

Key Features of GAGA-1

  • Biggest strength: Film-like character performance with dialogue
  • Aspect ratio: Fixed 16:9
  • Lengths: 5s or 10s videos
  • Inputs required: Image + text prompt (the image can be created with the built-in image generation tool)

Usage tips:

  • Dialogue ≤ 5 Chinese characters → use 5s
  • Dialogue > 5 characters → use 10s
  • Avoid > 20 characters — delivery may feel awkward
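The rule of thumb above can be written down as a small helper. This is only a sketch of the author's tips, not part of GAGA-1 or any official API: the function names `count_cjk` and `suggest_duration` are hypothetical, and counting only the basic CJK block (U+4E00 to U+9FFF) as "Chinese characters" is an assumption.

```python
def count_cjk(text: str) -> int:
    """Count characters in the basic CJK Unified Ideographs block.

    Assumption: punctuation, Latin letters, and digits are not counted.
    """
    return sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")


def suggest_duration(dialogue: str) -> tuple[int, bool]:
    """Return (clip length in seconds, too_long flag) for a line of dialogue.

    Thresholds follow the tips in the article:
      - up to 5 characters: 5s clip
      - more than 5 characters: 10s clip
      - more than 20 characters: flagged, since delivery may feel awkward
    """
    n = count_cjk(dialogue)
    duration = 5 if n <= 5 else 10
    too_long = n > 20
    return duration, too_long


if __name__ == "__main__":
    for line in ["你好", "我决定了从今天起"]:
        seconds, too_long = suggest_duration(line)
        print(line, seconds, "too long" if too_long else "ok")
```

The thresholds are the author's observations from testing, so treat them as starting points and adjust them to your own results.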

---

Image Generation

The built-in image generation tool is connected to Banana.

Banana can edit images, but I prefer the aesthetics of Seedream 4.0, especially for Asian portraits.

Example: My AI model Ran Xia.

---

First Prompt Test

> "The girl smiles and says: 'You don’t actually think I’m the kind of girl who cries when she loses, do you?'"

After 3–4 minutes, a 10s video was ready.

Result:

  • Wind in hair, natural facial expressions, even teeth — among the best I’ve seen.
  • If not for Sora2, GAGA-1’s realism might be my #1 right now.

---

Second Prompt Test

> "The girl sighed, lowered her head and bit her lip, then raised her head a moment later and said in a firm voice: 'I've decided — starting today, neither of us owes the other anything.'"

Two renditions worth noting:

  • Resigned — disappointment and sadness, acceptance of fate.
  • Frustrated — slight anger, “I’m disappointed in you” vibe.

Observation: The acting is stronger than the line delivery; the voice tone felt somewhat flat.

---

Third Prompt: Emotion Switch & Pauses

> "The girl cried as she said: 'You… don’t love me anymore?' After pausing for a moment, she shouted hysterically with intense emotion: 'I understand! I will never come to you again!'"

  • Hit max concurrency limit: 5 generations at once
  • Success rate: ~40%
  • Two decent outputs: varying hysterical intensities; one auto-added BGM
  • Issue: Long prompts can cause truncated dialogue
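To put the numbers above in perspective, here is a rough back-of-the-envelope sketch. It assumes each generation succeeds independently with the same probability, which the article does not establish, and the function names are hypothetical.

```python
def expected_usable(batch_size: int, success_rate: float) -> float:
    """Expected number of usable clips from one batch."""
    return batch_size * success_rate


def chance_of_at_least_one(batch_size: int, success_rate: float) -> float:
    """Probability that at least one clip in the batch is usable.

    Assumption: generations succeed or fail independently.
    """
    return 1.0 - (1.0 - success_rate) ** batch_size


if __name__ == "__main__":
    batch, rate = 5, 0.4  # concurrency limit and rough success rate from the test
    print(expected_usable(batch, rate))         # about 2 usable clips per batch
    print(chance_of_at_least_one(batch, rate))  # about 0.92
```

Under these assumptions, a full batch of 5 yields about two usable clips on average, which matches the two decent outputs reported here.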

---

Additional Experiments

Half-body shot with pride

> "The woman maintains her posture, with only her facial expression changing. She says proudly: 'Could it be… that this name is because…' She pauses intentionally, emphasizing: 'Me.'"

  • Recognized that the character is a foreigner
  • Had her speak broken Mandarin
  • The performance matched the proud delivery

---

Two-person scene

> "The man looks at the woman helplessly: 'Won’t you let me speak?' The woman nods playfully: 'Go ahead.'"

Convincing expressions & line delivery on both actors.

---

Singing capability

Yes, GAGA-1 can sing — pitch may be “abstract.”

Example (Ke Jie, Five-in-a-Row theme):

> "The man looks at the chessboard and sings: 'Traditional five-in-a-row is just connecting five pieces into a line — so boring~ so dull~'"

---

Strengths & Limitations

Strengths:

  • High-quality facial acting for short scripts
  • Works well for Chinese & English prompts
  • Handles emotion shifts decently

Limitations:

  • Large, complex motions → visual distortion
  • Japanese dialogue feels awkward
  • No custom audio upload yet
  • No fixed voice ID per character (voice varies each generation)

---

Upcoming Features

A fixed custom voice per character is nearly ready; it is waiting on engineering work to be completed.

---

Pricing

  • Current: Free for all users
  • Future: Will cost less than Sora2 and Veo3
  • No set timeline for ending free period

---

Monetization Platforms to Complement GAGA-1

Tools like AiToEarn (official site) can integrate AI video generation with:

  • Multi-platform publishing (Douyin, Kwai, Bilibili, Xiaohongshu, YouTube)
  • Analytics
  • Global AI model ranking

This helps creators distribute high-quality outputs to audiences, and to revenue streams, faster.

---

Final Thoughts

Whether for short dramas, game NPC dialogue, or visualizing novel characters, GAGA-1 offers a new, low-cost creative possibility.

It’s not perfect, but it lowers the barrier for more people to start creating video content, and it’s a domestic AI model.

Go explore — and I look forward to your creative, entertaining works.
