SenseTime’s DailyMind Open-Source Model Achieves Breakthrough in Spatial Intelligence, Outperforms GPT-5 in Multiple Benchmarks
SenseTime SenseNova-SI: Breakthrough in Spatial Intelligence
Outperforms GPT-5 and Gemini 2.5 Pro — Now Open Source
SenseTime has officially released and open-sourced its SenseNova-SI series, achieving a major breakthrough in spatial intelligence.
In multiple authoritative benchmarks for spatial understanding and reasoning tasks, SenseNova-SI:
- Significantly surpasses open-source multimodal models of similar scale.
- Outperforms top closed-source models such as GPT-5 and Gemini 2.5 Pro.
---
Understanding the Spatial Intelligence Gap
While industry-leading large models excel in knowledge, writing, reasoning, and programming, they share a serious weakness: they struggle to accurately understand and reason about spatial structures.
Spatial intelligence is essential for embodied AI agents interacting with the physical world.
Example:
- Left image: a complex spatial reasoning problem from the TV show The Brain. GPT-5 solves it easily, showing its non-spatial reasoning strength.
- Right image: a simple spatial problem that a child could solve. GPT-5 fails, picking “A”.

Insight:
Spatial understanding is crucial for AI to truly comprehend the 3D world. SenseTime’s innovation directly targets this gap.
---
SenseNova-SI — Benchmark Performance
Model sizes:
- 2B parameters
- 8B parameters
Benchmarks: VSI, MMSI, MindCube, ViewSpatial

Source: https://github.com/OpenSenseNova/SenseNova-SI
Highlights:
- SenseNova-SI-8B average score: 60.99
- Far ahead of open-source models:
  - Qwen3-VL-8B: 40.16
  - BAGEL-7B: 35.01
  - SpatialMLLM: 35.05
  - ViLaSR-7B: 36.41
- Surpasses closed-source leaders:
  - GPT-5: 49.68
  - Gemini-2.5-Pro: 48.81
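Result:
On average score, SenseNova-SI-8B leads GPT-5 by 11.31 points and Qwen3-VL-8B, the strongest open-source baseline listed, by 20.83 points. SenseTime presents this as a qualitative breakthrough rather than an incremental improvement.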
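The margins implied by the reported averages can be checked with a few lines of Python. This is a minimal sketch: the scores are copied from the list above, while the helper names `margins_over` and `leaderboard` are illustrative and not part of the SenseNova-SI repository. It makes no assumption about how the averages themselves were computed.

```python
# Average scores exactly as reported in the article.
REPORTED_AVERAGES = {
    "SenseNova-SI-8B": 60.99,
    "Qwen3-VL-8B": 40.16,
    "BAGEL-7B": 35.01,
    "SpatialMLLM": 35.05,
    "ViLaSR-7B": 36.41,
    "GPT-5": 49.68,
    "Gemini-2.5-Pro": 48.81,
}


def margins_over(scores, reference):
    """Return the reference model's lead (in points) over every other model."""
    ref_score = scores[reference]
    return {
        name: round(ref_score - score, 2)
        for name, score in scores.items()
        if name != reference
    }


def leaderboard(scores):
    """Return (model, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    gaps = margins_over(REPORTED_AVERAGES, "SenseNova-SI-8B")
    for name, score in leaderboard(REPORTED_AVERAGES):
        lead = gaps.get(name)
        suffix = "" if lead is None else f"  (SenseNova-SI-8B leads by {lead:.2f})"
        print(f"{name:<18}{score:>6.2f}{suffix}")
```

Sorting by score puts GPT-5 and Gemini-2.5-Pro directly behind SenseNova-SI-8B, ahead of all listed open-source baselines.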
---
Training Paradigm: Leveraging the "Scaling Effect"
The gains stem from SenseTime’s systematic approach to training data and methodology:
- Spatial capability classification framework.
- Diverse, high-quality prior datasets.
- Large-scale spatial understanding data.
According to SenseTime, this is the first demonstration in spatial AI that scaling high-quality data dramatically improves spatial reasoning ability across multiple domains.
Six Spatial Dimensions Enhanced:
- Spatial measurement
- Spatial reconstruction
- Spatial relationships
- Viewpoint transformation
- Spatial deformation
- Spatial reasoning
A technical report detailing the methodology will be released soon.
---
Comparative Examples — GPT-5 vs. SenseNova-SI-8B
Example 1 — Cube Composition
Correct Top-Down View Selection (SITE-Bench)
- GPT-5: D
- SenseNova-SI-8B: B ✅

---
Example 2 — Motorbike Position (SITE-Bench)
Photographer’s viewpoint — left or right?
- GPT-5: A (left)
- SenseNova-SI-8B: B ✅ (right)

---
Example 3 — Perspective Reasoning

---
Applied Examples from SITE-Bench & MindCube
Multi-lane Road Scenario — Predicting the Yellow Car’s Next Move
- GPT-5: C (stationary)
- SenseNova-SI-8B: D ✅ (turn right)

---
Outdoor Scene — Inferring Movement Direction
- GPT-5: C
- SenseNova-SI-8B: D ✅ (diagonally forward left)

---
Indoor Space — Movement by Object Position
- GPT-5: D
- SenseNova-SI-8B: A ✅ (diagonally forward left)

---
Object Recognition from Different Angles
- GPT-5: B
- SenseNova-SI-8B: C ✅ (door)
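Multiple-choice items like the ones above are typically scored by exact match on the chosen option letter. The sketch below applies that rule to the six answered examples in this article, taking the option marked ✅ as the correct answer. The names `SHOWCASE_ITEMS`, `is_correct`, and `accuracy` are hypothetical, and this is not the official SITE-Bench or MindCube scoring code. Because the items are hand-picked showcase cases, the tally illustrates the scoring rule only and says nothing about benchmark-level accuracy.

```python
# (item label, correct option, {model: chosen option}), transcribed from the article.
# The correct option is the one the article marks with a check mark.
SHOWCASE_ITEMS = [
    ("Cube composition", "B", {"GPT-5": "D", "SenseNova-SI-8B": "B"}),
    ("Motorbike position", "B", {"GPT-5": "A", "SenseNova-SI-8B": "B"}),
    ("Yellow car next move", "D", {"GPT-5": "C", "SenseNova-SI-8B": "D"}),
    ("Outdoor movement direction", "D", {"GPT-5": "C", "SenseNova-SI-8B": "D"}),
    ("Indoor movement by object position", "A", {"GPT-5": "D", "SenseNova-SI-8B": "A"}),
    ("Object recognition across angles", "C", {"GPT-5": "B", "SenseNova-SI-8B": "C"}),
]


def is_correct(predicted, gold):
    """Exact match on the option letter, ignoring case and surrounding whitespace."""
    return predicted.strip().upper() == gold.strip().upper()


def accuracy(items, model):
    """Fraction of items on which `model` chose the correct option."""
    if not items:
        raise ValueError("no items to score")
    hits = sum(is_correct(answers[model], gold) for _, gold, answers in items)
    return hits / len(items)


if __name__ == "__main__":
    for model in ("GPT-5", "SenseNova-SI-8B"):
        print(f"{model}: {accuracy(SHOWCASE_ITEMS, model):.0%} on {len(SHOWCASE_ITEMS)} showcase items")
```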
---
Impact: World Models & Embodied Intelligence
Spatial intelligence is pivotal for world models and embodied intelligence ecosystems.
Key Initiatives:
- “Wu Neng” Embodied Intelligence Platform — powered by “Kai Wu” World Model.
- SenseNova-SI complements Kai Wu — enabling multimodal AI to transition from digital environments to physical world tasks.
- Open-source EASI Spatial Intelligence Evaluation Platform and Leaderboard:
  - https://github.com/EvolvingLMMs-Lab/EASI
  - Unified evaluation standards.
  - Ongoing tracking of open and closed models.
---
Bottom Line:
The debut of SenseNova-SI marks a pivotal step toward AI systems that can truly comprehend and operate in the 3D physical world, accelerating development in robotics, autonomous driving, AR/VR, and beyond.