LLM optimization
Today’s Open Source (2025-11-03): Kuaishou and a Nanjing University Lab Co-Develop HiPO, Hybrid Policy Optimization for Dynamic Reasoning in LLMs; Dual-Mode Switching Balances Accuracy and Efficiency
🏆 Foundational Models

① Project: HiPO

HiPO-8B is a novel reinforcement learning framework based on Hybrid Policy Optimization, enabling dynamic reasoning capabilities in large language models (LLMs).

Key Highlights:
* Developed by the KwaiKAT team at Kuaishou in collaboration with the NJU-LINK Laboratory (Nanjing University) and the ARiSE Laboratory.
* Features “think-on” and “think-off” mode switching to