AI efficiency

Thinking Machines' New Study Goes Viral: Combining RL + Fine-Tuning for More Cost-Effective Small Model Training

On-policy distillation

Thinking Machines' Breakthrough: On-Policy Distillation for Efficient LLM Training

Thinking Machines' latest research is generating intense discussion in the AI community. After Mira Murati, the company's founder and former OpenAI CTO, personally reposted it, many prominent figures praised its research value. According to Murati's summary, the team has

By Honghao Wang