Tencent AI
Tencent Releases Ultra-Low-Cost AI Training Method: $17 Beats $9,700 Fine-Tuning Approach
Training-Free GRPO: A Cost-Effective Breakthrough in LLM Optimization
Only 120 RMB, outperforming fine-tuning that costs 70,000 RMB! Tencent has introduced a new method for upgrading large-model agents: Training-Free Group Relative Policy Optimization (Training-Free GRPO). Key idea: no parameter adjustment is required; the method leverages brief experiential learning within prompts to