Today’s Open Source (2025-10-31): Kimi Linear Open-Sourced, KDA-Optimized Gated DeltaNet, 6× Faster Decoding at 1M-Token Context

Daily Discovery: Latest LLM Innovations
Date: 2025-10-31
Location: Hong Kong, China

📢 Overview
Today's discoveries feature groundbreaking projects across LLM architectures, text-to-speech generation, reinforcement learning, 3D scene creation, real-time inference, and autonomous agent development:
* Kimi-Linear — Hybrid linear attention architecture.
* kani-tts — Multi-language, high-quality text-to-speech engine.
* ROVER — Minimal and efficient…

By Honghao Wang
Kimi Open-Sources New Linear Attention Architecture, Surpasses Full Attention Models for the First Time with 6× Faster Inference

The Transformer Era Is Being Rewritten
Moonshot AI has unveiled its open-source Kimi Linear architecture, introducing a novel attention mechanism that, for the first time, outperforms traditional full attention under identical training conditions. In long-context tasks, Kimi Linear:
* Reduces KV cache usage by 75%
* Achieves up to 6× faster inference
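
To make the linear-attention claim above concrete, here is a minimal sketch of the gated delta rule recurrence underlying Gated DeltaNet, the family the headline says KDA optimizes. Everything in it is an illustrative assumption, not Moonshot AI's implementation: the function name, the shapes, and the per-channel `alpha` gate (a single scalar gate per step would recover plain Gated DeltaNet). What it demonstrates is that the recurrent state is a fixed-size matrix, so per-token decoding memory stays constant as the context grows, which is where a hybrid like Kimi Linear gets its KV-cache and decoding-speed advantages over full attention.

```python
import torch

def gated_delta_rule(q, k, v, beta, alpha):
    """Toy recurrent form of a gated delta rule (single head, no batching).

    q, k:  (T, d_k) queries and keys
    v:     (T, d_v) values
    beta:  (T,)     write strength in [0, 1]
    alpha: (T, d_k) per-channel decay gate in [0, 1] (illustrative assumption)
    """
    d_k, d_v = k.shape[1], v.shape[1]
    S = torch.zeros(d_k, d_v)  # fixed-size state: the entire "KV cache"
    outputs = []
    for t in range(len(k)):
        S = alpha[t].unsqueeze(-1) * S                 # gated forgetting, per channel
        S = S - beta[t] * torch.outer(k[t], k[t] @ S)  # delta rule: erase the old value at k_t
        S = S + beta[t] * torch.outer(k[t], v[t])      # delta rule: write the new value at k_t
        outputs.append(q[t] @ S)                       # read out with the query
    return torch.stack(outputs)

# Toy usage: the state stays (d_k, d_v) no matter how long T grows,
# unlike a full-attention KV cache, which stores all T keys and values.
T, d_k, d_v = 8, 16, 32
q, k = torch.randn(T, d_k), torch.randn(T, d_k)
v = torch.randn(T, d_v)
beta = torch.rand(T)        # in [0, 1)
alpha = torch.rand(T, d_k)  # in [0, 1)
out = gated_delta_rule(q, k, v, beta, alpha)
print(out.shape)            # torch.Size([8, 32])
```

Under these assumptions the recurrence also explains the reported KV-cache savings: every linear-attention layer carries only the constant-size matrix S, so in a hybrid stack the cache cost is driven by however many full-attention layers remain.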

By Honghao Wang