Choreographers Out of Work! NUST + Tsinghua + Nanjing University Create: One Song Enables High-Quality Harmonious Group Dance

TCDiff++: Breaking Barriers in Multi-Person AI Dance Generation


---

Overview

Xin Zhiyuan Report

When the metaverse’s digital humans need advanced “group dance skills,” traditional music-driven generation technologies hit serious bottlenecks:

  • Dancer collisions
  • Stiff, unnatural movements
  • Failures in long-sequence choreography

Researchers from NUST, Tsinghua University, and Nanjing University have jointly developed TCDiff++, an end-to-end model that achieves high-quality, long-duration, multi-dancer automatic choreography without collisions or foot sliding.

TCDiff++ supports cross-modal choreography — one click generates harmonious group dances, offering a complete AIGC workflow for virtual concerts, theatre, and metaverse events.

TCDiff++ is an upgrade of the open-source TCDiff model (AAAI 2025) and has been accepted by IJCV in 2025.

---

Challenges in AI Group Dance Generation

1. High Similarity in Motion Data

  • In typical datasets, 80%+ of movements are nearly identical.
  • Each dancer’s motion data: 100+ dimensions, but position data is only 3D.
  • Result: AI confuses dancer identities → leads to collisions.

2. Foot Sliding Phenomenon

  • AI struggles to synchronize upper-body motion and foot placement.
  • Foot sliding breaks realism and immersion.

3. Long-Sequence Instability

  • Short clips (few seconds) are easy.
  • Long sequences (minutes): mutations, stutters, positional drift.
  • Real-world performances last much longer — current models fail here.

---

From TCDiff to TCDiff++

In TCDiff (AAAI 2025), researchers introduced the trajectory-controllable concept:

  • Two-stage framework: separate trajectory prediction from action generation.
  • Worked for collision prevention, but caused rigid transitions and jitter in long sequences.

TCDiff++ solves these with a fully end-to-end architecture.

Core Components:

  • Group Dance Decoder: generates collision-free, coordinated movements from the music.
  • Footwork Adaptor: refines foot trajectories, eliminating sliding and ensuring grounded footwork.

---

Links:

📄 Paper: arxiv.org/pdf/2506.18671

🌐 Project: da1yuqin.github.io/TCDiffpp.website

💻 Code: github.com/Da1yuqin/TCDiffpp

---

System Workflow

  • Step 1: Group Dance Decoder → initial movements, no collisions.
  • Step 2: Footwork Adaptor → optimizes feet, removes sliding.
  • Step 3: Integration → stable, realistic dance sequences (a minimal sketch of this pipeline follows).
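
To make the three steps concrete, here is a minimal PyTorch-style sketch of how the two modules could be chained. The class name, method signatures, and tensor shapes are illustrative assumptions, not the released TCDiff++ API.

```python
import torch
import torch.nn as nn

class GroupDancePipeline(nn.Module):
    """Illustrative wiring of a TCDiff++-style workflow:
    decoder -> footwork adaptor -> integrated sequence (names are hypothetical)."""

    def __init__(self, decoder: nn.Module, footwork_adaptor: nn.Module):
        super().__init__()
        self.decoder = decoder                    # Step 1: music -> collision-free motion
        self.footwork_adaptor = footwork_adaptor  # Step 2: foot/root refinement

    def forward(self, music_feats: torch.Tensor) -> torch.Tensor:
        # music_feats: (batch, frames, music_dim)
        motion = self.decoder(music_feats)          # (batch, dancers, frames, pose_dim)
        correction = self.footwork_adaptor(motion)  # residual foot-trajectory refinement
        # Step 3: integrate the refinement into the initial motion.
        return motion + correction
```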

---

Key Innovations

1. Collision Prevention

  • Dance Positioning Embedding — encodes dancer positions (left/right) in formation.
  • Fusion Projection Module — boosts feature dimensions, improves dancer distinction.
  • Global Distance Constraint — maintains safe spacing between dancers (see the sketch below).
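
One way to picture the global distance constraint is as a hinge penalty on pairwise root-joint distances at every frame. The loss below is a simplified sketch under that assumption; the margin value and exact formulation are not taken from the paper.

```python
import torch

def distance_constraint_loss(root_pos: torch.Tensor, min_dist: float = 0.5) -> torch.Tensor:
    """Penalize dancer pairs closer than `min_dist` (hypothetical margin).

    root_pos: (batch, dancers, frames, 3) root-joint positions.
    """
    # Pairwise distances between every pair of dancers at every frame.
    diff = root_pos.unsqueeze(1) - root_pos.unsqueeze(2)   # (B, N, N, T, 3)
    dist = diff.norm(dim=-1)                               # (B, N, N, T)

    # Ignore self-distances on the diagonal by treating them as exactly at the margin.
    n = root_pos.shape[1]
    eye = torch.eye(n, dtype=torch.bool, device=root_pos.device)
    dist = dist.masked_fill(eye.view(1, n, n, 1), min_dist)

    # Hinge penalty: non-zero only when two dancers come too close.
    return torch.clamp(min_dist - dist, min=0.0).mean()
```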

---

2. Precise Footwork

  • Swap Mode Conditioning — guides realistic foot movement from start.
  • Footwork Adaptor — uses heel/toe contact & root bone velocity to refine steps (see the sketch below).
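
A common trick that matches this description is to pin a foot's horizontal position whenever the predicted heel/toe contact probability is high. The helper below is a rough sketch of that idea; the function name, shapes, and threshold are assumptions, and TCDiff++ learns its refinement (also using root-bone velocity) rather than applying a hard-coded rule.

```python
import torch

def suppress_foot_sliding(foot_pos: torch.Tensor,
                          contact_prob: torch.Tensor,
                          threshold: float = 0.5) -> torch.Tensor:
    """Reduce horizontal drift of heel/toe joints during predicted ground contact.

    foot_pos: (frames, n_foot_joints, 3) joint positions (x, y, z), y is up.
    contact_prob: (frames, n_foot_joints) predicted contact probability per joint.
    """
    fixed = foot_pos.clone()
    for t in range(1, foot_pos.shape[0]):
        in_contact = contact_prob[t] > threshold            # boolean mask per joint
        # While a joint is in contact, keep its horizontal (x, z) position fixed
        # at the previous frame, so the supporting foot does not slide.
        fixed[t, in_contact, 0] = fixed[t - 1, in_contact, 0]
        fixed[t, in_contact, 2] = fixed[t - 1, in_contact, 2]
    return fixed
```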

---

3. Long-Sequence Optimization

  • Long Group Diffusion Sampling: segmented generation with half-segment overlap (stitching sketched below).
  • Preserves positional continuity and motion smoothness.
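
The half-segment overlap idea can be illustrated by stitching consecutively generated windows with a linear cross-fade over the shared frames. The helper below sketches only the blending step, with invented names; with half-segment overlap, `overlap` would be set to half the window length.

```python
import torch

def stitch_segments(segments: list[torch.Tensor], overlap: int) -> torch.Tensor:
    """Stitch generated segments whose last `overlap` frames correspond to the
    next segment's first `overlap` frames, cross-fading the shared region.

    Each segment: (frames, ...) motion tensor with the frame axis first.
    """
    out = segments[0]
    for seg in segments[1:]:
        # Linear ramp from 1 -> 0 over the overlapping frames.
        w = torch.linspace(1.0, 0.0, overlap).view(overlap, *([1] * (seg.dim() - 1)))
        blended = w * out[-overlap:] + (1.0 - w) * seg[:overlap]
        out = torch.cat([out[:-overlap], blended, seg[overlap:]], dim=0)
    return out
```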

---

Performance Comparisons

Model Benchmarks


Why TCDiff++ Excels:

  • Best multi-dancer coordination
  • High solo realism & diversity
  • Strong long-duration stability

---

Other Models’ Limitations:

  • EDGE: Can’t distinguish dancers → sliding & collisions.
  • GCD: Ignores coordinate modeling → severe sliding.
  • CoDancers: Improves ID clarity but loses formation harmony.
  • TCDiff: Better formation but mismatched actions & positions.

---

Long-Sequence Capability Test

  • EDGE / GCD: Abrupt position swaps.
  • CoDancers: Poor formations.
  • TCDiff: Cumulative position–motion errors.
  • TCDiff++: Maintains position–movement consistency for 720-frame sequences.

---

Ablation Study

  • All modules reduce collisions & sliding.
  • Maximum performance achieved when all modules combined.

---

User Preference Survey


Criteria:

  • Movement realism
  • Music–movement correlation
  • Formation aesthetics
  • Dancer harmony

Result: TCDiff++ most visually preferred.


---

Limitations & Future Directions

  • Input modality: currently music-only, with no support for text prompts, keyframes, or style parameters. Future plans include richer multimodal controls for interactive use.
  • Formation change learning (e.g., position swaps): swap motions and annotations are scarce in existing datasets; larger, richer datasets are needed for dynamic formation choreography.

---

Integration with AI Content Ecosystems

TCDiff++ can be integrated with open-source platforms like AiToEarn, enabling:

  • AI-generated choreography
  • Cross-platform publishing (Douyin, Bilibili, YouTube, Instagram, etc.)
  • Analytics & monetization

---

References:

📄 https://arxiv.org/pdf/2506.18671
