When Recommendation Systems Truly “Understand You”: Kuaishou Team Presents New TagCF Breakthrough at NeurIPS 2025

Honghao Wang

27 Nov 2025 — 4 min read

Understanding Users, Not Just Content — The TagCF Framework

Every day, recommendation systems capture our interests — from videos we scroll past to livestreams where we pause.

Most algorithms focus on understanding content (what topics you like) rather than understanding people (who you are).

The Kuaishou Consumer Strategy Algorithm Team identified this gap and, in collaboration with Kuaishou’s Foundation Model & Applications Department and Wuhan University, proposed TagCF — a framework that bridges content interests with social roles.

This approach moves recommendation systems from knowing what happens to knowing why it happens.

> Accepted at NeurIPS 2025 — All related code and experimental frameworks are open-sourced to help academia and industry explore recommendation systems built on deeper user understanding.

---

Resources

Paper Title: Who You Are Matters: Bridging Topics and Social Roles via LLM-Enhanced Logical Recommendation
Paper Link: http://arxiv.org/abs/2505.10940
Code Repository: https://github.com/Code2Q/TagCF

---

1. Background & Motivation

User Understanding: A Missing Dimension

Figure 2

Traditional systems statistically model associations between content items — but the true underlying factor is often the user role (e.g., symphony musician, new dad).

User roles act as latent confounders in these relationships, and explicitly modeling them:

Improves interpretability
Enhances statistical and logical modeling power
Helps break users out of “information cocoons”
Boosts recommendation accuracy and stability

> Terminology

> - User Tag ≈ User Role ≈ User Attribute

> - Item Tag ≈ Item Topic ≈ Point of Interest

---

2. Two New Tasks Introduced

TagCF extends traditional recommendation with topic modeling concepts, defining two new tasks:

2.1 User Role Identification

Modeling traits, personality, social identity, and needs (explicit attributes such as age, gender, etc.).

2.2 Behavioral Logic Modeling

Constructing logical association graphs between user roles and item topics:

I2U (Item-to-User): Identify which roles should see content with certain topics
U2I (User-to-Item): Predict what topics a specific role is likely to want
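One way to picture the two graphs is as directed tag-to-tag adjacency maps. The sketch below is illustrative only: the `LogicGraph` class and the topic names are assumptions made for this example, not the data structures in the TagCF repository. The role names come from the examples earlier in the article.

```python
from collections import defaultdict


class LogicGraph:
    """Directed tag-to-tag graph: edges run from a start tag to target tags.

    Hypothetical helper for illustration; not the TagCF implementation.
    """

    def __init__(self):
        self._edges = defaultdict(set)

    def add_edge(self, start_tag, target_tag):
        self._edges[start_tag].add(target_tag)

    def targets(self, start_tags):
        """Union of target tags reachable in one hop from any start tag."""
        reached = set()
        for tag in start_tags:
            reached |= self._edges.get(tag, set())
        return reached


# U2I: user role -> item topics that role is likely to want.
u2i = LogicGraph()
u2i.add_edge("new dad", "infant care")
u2i.add_edge("new dad", "baby products")
u2i.add_edge("symphony musician", "classical concerts")

# I2U: item topic -> user roles who should see that content.
i2u = LogicGraph()
i2u.add_edge("infant care", "new dad")
i2u.add_edge("classical concerts", "symphony musician")

print(sorted(u2i.targets(["new dad"])))
print(sorted(i2u.targets(["infant care", "classical concerts"])))
```

Keeping the two directions as separate graphs mirrors the task split above: U2I answers "what would this role want?", while I2U answers "who should see this topic?".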

Figure 3

---

3. TagCF Solution Architecture

Module 1 — Video Content Understanding (MLLM-Based)

Traverse newly uploaded videos once they reach an exposure threshold.
Use an MLLM (M3 [1]) to extract multimodal video embeddings, interpret their semantics via specially designed prompts (Figure 5), and generate Item Tags and User Tags.
Update the tag library in real time.
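The flow of this module can be sketched as a filter-then-tag loop. Everything named here is an assumption for illustration: the threshold value, the `fake_mllm_tagger` stand-in (which simply reads pre-filled tags instead of calling a model), and the use of frequency counters as the "tag library".

```python
from collections import Counter

EXPOSURE_THRESHOLD = 1000  # hypothetical value; the article gives no number


def fake_mllm_tagger(video):
    """Stand-in for the MLLM call; returns (item_tags, user_tags)."""
    return video["item_tags"], video["user_tags"]


def update_tag_library(videos, item_library, user_library, tagger=fake_mllm_tagger):
    """Tag videos that passed the exposure threshold and count tag frequencies."""
    for video in videos:
        if video["exposure"] < EXPOSURE_THRESHOLD:
            continue  # not enough exposure yet, so skip this video for now
        item_tags, user_tags = tagger(video)
        item_library.update(item_tags)
        user_library.update(user_tags)


videos = [
    {"id": "v1", "exposure": 5000, "item_tags": ["infant care"], "user_tags": ["new dad"]},
    {"id": "v2", "exposure": 200, "item_tags": ["classical concerts"], "user_tags": ["symphony musician"]},
    {"id": "v3", "exposure": 1500, "item_tags": ["infant care", "baby products"], "user_tags": ["new dad"]},
]

item_library, user_library = Counter(), Counter()
update_tag_library(videos, item_library, user_library)
print(item_library.most_common())
print(user_library.most_common())
```

Counting frequencies as tags arrive is one simple way to keep a library current, and it also supplies the statistics a high-frequency filter would need later.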

Figure 5

---

Module 2 — Behavior Logic Graph Construction (LLM-Based)

Build U2I and I2U logic graphs from tag sets.
Use Qwen2.5-7B [2] to infer the logical relationships between start and target tags (Figure 6).
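A rough sketch of how such a graph could be assembled from LLM answers follows. The prompt wording, the function names, and the `stub_llm` callable are all hypothetical; the actual prompts are the ones shown in Figure 6, and a real run would call the LLM rather than a stub. Restricting answers to a known candidate list is a design choice of this sketch, made so that free-form output cannot introduce tags outside the library.

```python
def build_logic_prompt(start_tag, candidate_tags, direction):
    """Build a hypothetical prompt; the real TagCF prompts may differ."""
    if direction == "U2I":
        question = f"Which of these topics would a user with the role '{start_tag}' likely want?"
    elif direction == "I2U":
        question = f"Which of these user roles should see content about '{start_tag}'?"
    else:
        raise ValueError(f"unknown direction: {direction}")
    options = "\n".join(f"- {tag}" for tag in candidate_tags)
    return f"{question}\nAnswer with one option per line.\n{options}"


def parse_targets(response, candidate_tags):
    """Keep only answers that match a known candidate tag, in first-seen order."""
    known = {tag.lower(): tag for tag in candidate_tags}
    targets = []
    for line in response.splitlines():
        key = line.strip().lstrip("-").strip().lower()
        if key in known and known[key] not in targets:
            targets.append(known[key])
    return targets


def build_graph(start_tags, candidate_tags, direction, llm):
    """Query the LLM once per start tag and collect the validated target tags."""
    graph = {}
    for start in start_tags:
        prompt = build_logic_prompt(start, candidate_tags, direction)
        graph[start] = parse_targets(llm(prompt), candidate_tags)
    return graph


def stub_llm(prompt):
    """Canned response standing in for a real model call."""
    return "- infant care\nBaby Products\nstock trading tips\n"


topics = ["infant care", "baby products", "classical concerts"]
u2i_graph = build_graph(["new dad"], topics, "U2I", stub_llm)
print(u2i_graph)
```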

Figure 6

---

Module 3 — Downstream Recommendation Empowerment

Enhance recommendation models with tag and logic data (LLM-for-rec paradigm).
Maintain modeling space consistency using:
TagCF-it: Item tag space focus
TagCF-ut: User tag space focus

Enhancement Methods:

Encoder model enhancement using tags
Training enhancement via tag-logic alignment
Prediction score adjustment using tag-logic
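As a minimal sketch of the third method, a backbone model's scores could be nudged by how well each item's tags match the tags the logic graph predicts for the user. The additive form, the `alpha` weight, and the overlap ratio are assumptions chosen for clarity; the paper's actual adjustment may be defined differently.

```python
def adjust_scores(base_scores, item_tags, target_tags, alpha=0.1):
    """Add a bonus proportional to the share of an item's tags found in target_tags.

    base_scores: item id -> backbone model score
    item_tags:   item id -> set of item tags
    target_tags: tags predicted for this user by the logic graph
    alpha:       hypothetical weight of the tag-logic bonus
    """
    adjusted = {}
    for item, score in base_scores.items():
        tags = item_tags.get(item, set())
        overlap = len(tags & target_tags) / len(tags) if tags else 0.0
        adjusted[item] = score + alpha * overlap
    return adjusted


base_scores = {"v1": 0.50, "v2": 0.52}
item_tags = {"v1": {"infant care", "baby products"}, "v2": {"stock trading"}}
target_tags = {"infant care"}

adjusted = adjust_scores(base_scores, item_tags, target_tags)
ranking = sorted(adjusted, key=adjusted.get, reverse=True)
print(adjusted, ranking)
```

In this toy case the backbone slightly prefers `v2`, but half of `v1`'s tags match the predicted topics, so the bonus moves `v1` to the top.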

---

4. Engineering Challenges & Solutions

Challenges:

Uncontrolled tag set expansion
Long-tail video coverage distribution
Conflict between free-form generation and strict scoring needs
Lack of evaluation methods for large-model outputs

Solutions:

High-Frequency Tag Cover Sets: weak-overlap, symmetric cover sets for user tags and item tags, roughly 7k–20k tags in size and stable after about 30 days.
Model Distillation: train small models on the large model's item2tag and tag2tag outputs, for fine-grained ranking and reuse.
Manual Comparative Evaluation: a “Good–Same–Bad” strategy across accuracy, completeness, rationality, and readability; output quality is close to GPT-4o.
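To make the cover-set idea concrete, here is a sketch that uses a standard greedy set-cover heuristic: keep adding the tag that covers the most still-uncovered videos until a target share of videos is covered. This is a swapped-in textbook technique, not the procedure the team describes, and the function name, the `target_coverage` parameter, and the toy data are assumptions.

```python
from collections import Counter


def greedy_cover_set(video_tags, target_coverage=0.9):
    """Greedily pick tags until target_coverage of videos has at least one picked tag.

    video_tags: video id -> list of tags
    Returns (selected_tags, achieved_coverage).
    """
    total = len(video_tags)
    covered = set()
    selected = []
    while total and len(covered) / total < target_coverage:
        # Count how many uncovered videos each tag would newly cover.
        gain = Counter()
        for vid, tags in video_tags.items():
            if vid not in covered:
                gain.update(set(tags))
        if not gain:
            break  # the remaining videos have no tags at all
        # Highest gain first; ties broken alphabetically for determinism.
        best_tag = min(gain, key=lambda tag: (-gain[tag], tag))
        selected.append(best_tag)
        covered |= {vid for vid, tags in video_tags.items() if best_tag in tags}
    return selected, (len(covered) / total if total else 0.0)


video_tags = {
    "v1": ["cooking", "home"],
    "v2": ["cooking"],
    "v3": ["travel"],
    "v4": ["travel", "cooking"],
    "v5": ["niche hobby"],
    "v6": ["travel"],
}

selected, achieved = greedy_cover_set(video_tags, target_coverage=0.8)
print(selected, round(achieved, 3))
```

Because each pick is scored only on videos not yet covered, tags that mostly duplicate earlier picks are skipped, which keeps overlap low, and rare long-tail tags are left out once the coverage target is met.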

---

5. Explicit Filter-Bubble Control

TagCF introduces controllable cocoon-breaking strategies:

TagCF-util: Use only in-cocoon tag set (T(0)) → focus on accuracy.
TagCF-expl: Use union of T(0) + out-of-cocoon set (T(1)) → improve diversity.
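The difference between the two strategies comes down to which tag set is handed to the recommender. The sketch below takes T(0) and T(1) as given inputs, since the article does not describe how they are computed; the function name and the example tags are illustrative assumptions.

```python
def select_tag_set(in_cocoon, out_of_cocoon, mode):
    """Return the tag set used for recommendation under the given strategy.

    in_cocoon:     T(0), tags already inside the user's cocoon
    out_of_cocoon: T(1), tags outside the user's cocoon
    mode:          "util" (accuracy-oriented) or "expl" (diversity-oriented)
    """
    if mode == "util":
        return set(in_cocoon)
    if mode == "expl":
        return set(in_cocoon) | set(out_of_cocoon)
    raise ValueError(f"unknown mode: {mode}")


t0 = {"infant care", "baby products"}       # illustrative T(0)
t1 = {"home fitness", "family travel"}      # illustrative T(1)

util_tags = select_tag_set(t0, t1, "util")
expl_tags = select_tag_set(t0, t1, "expl")
print(len(util_tags), len(expl_tags))
```

Exposing the choice as a single switch is what makes the cocoon control explicit: the same model can be steered toward accuracy or diversity without retraining.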

---

6. Experimental Results

6.1 Offline

Metrics:

Accuracy: NDCG, MRR
Diversity: Cover, Gini
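For readers unfamiliar with these metrics, the sketch below gives common textbook definitions with binary relevance. The paper's exact variants (cutoffs, and whether Cover and Gini are computed over items or tags) may differ, so treat these as assumptions.

```python
import math


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@k: rewards relevant items that appear near the top."""
    dcg = sum(1.0 / math.log2(i + 2) for i, item in enumerate(ranked[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(k, len(relevant))))
    return dcg / ideal if ideal else 0.0


def mrr(ranked, relevant):
    """Reciprocal rank of the first relevant item (0.0 if none is found)."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def coverage(recommendation_lists, catalog_size):
    """Share of the catalog that appears in at least one recommendation list."""
    shown = set().union(*recommendation_lists)
    return len(shown) / catalog_size


def gini(counts):
    """Gini coefficient of exposure counts: 0 is perfectly even, higher is more concentrated."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


print(ndcg_at_k(["a", "b", "c"], {"b"}, 3))
print(mrr(["a", "b", "c"], {"b"}))
print(coverage([["a", "b"], ["b", "c"]], catalog_size=6))
print(gini([1, 1, 1, 1]), gini([0, 0, 0, 4]))
```

Higher NDCG, MRR, and coverage are better, while a lower Gini value means exposure is spread more evenly.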

Findings:

TagCF improves backbone model performance.
Logic graph backend is transferable.
TagCF-util excels in accuracy.
TagCF-expl boosts diversity.

---

6.2 Online

Results:

TagCF-expl improved long-term retention (LT +0.037%).
Compared with item tags, user tag sets are smaller, converge faster, and yield stronger improvements.

---

7. Summary & Outlook

Contributions:

Balanced video understanding & user understanding in recommendation.
Developed TagCF with strong transferability — applicable to recall, e-commerce, search, and beyond.
Enabled explicit user cocoon modeling and control.

Significance:

Moves recommender systems from statistical guessing to a logical understanding of people.
Bridges symbolic & statistical paradigms in large-model-powered recommendations.

Future Work:

Refine tag-logic backend for efficiency.
Extend application scenarios.
Maintain balance between accuracy and diversity in recommendation.

---

References

[1] Mu Cai, Jianwei Yang, Jianfeng Gao, and Yong Jae Lee. Matryoshka Multimodal Models. In Proceedings of the 13th International Conference on Learning Representations (ICLR), 2025.

[2] An Yang et al. Qwen2.5 Technical Report. arXiv:2412.15115, 2024.

[3] Guorui Zhou et al. OneRec Technical Report. arXiv:2506.13695, 2025.

[4] Kaike Zhang et al. GoalRank: Group-Relative Optimization for a Large Ranking Model. arXiv:2509.22046, 2025.