SenseTime’s SenseNova-SI Open-Source Model Achieves Breakthrough in Spatial Intelligence, Outperforms GPT-5 on Multiple Benchmarks

Honghao Wang

11 Nov 2025

SenseTime SenseNova-SI: Breakthrough in Spatial Intelligence

Outperforms GPT-5 and Gemini 2.5 Pro — Now Open Source

SenseTime has officially released and open-sourced its SenseNova-SI series, achieving a major breakthrough in spatial intelligence.

On multiple benchmarks for spatial understanding and reasoning, SenseNova-SI:

Significantly surpasses open-source multimodal models of similar scale.
Outperforms top closed-source models such as GPT-5 and Gemini 2.5 Pro.

---

Understanding the Spatial Intelligence Gap

While industry-leading large models excel at knowledge, writing, reasoning, and programming, they share a serious weakness: they struggle to understand and reason about spatial structures accurately.

Spatial intelligence is essential for embodied AI agents interacting with the physical world.

Example:

Left image: a complex reasoning problem from the TV show The Brain. GPT-5 solves it easily, which shows its strength in non-spatial reasoning.
Right image: a simple spatial problem that a child could solve. GPT-5 fails, choosing “A”.

Insight:

Spatial understanding is crucial for AI to truly comprehend the 3D world. SenseTime’s innovation directly targets this gap.

---

SenseNova-SI — Benchmark Performance

Model sizes:

2B parameters
8B parameters

Benchmarks: VSI-Bench, MMSI-Bench, MindCube, ViewSpatial-Bench

Source: https://github.com/OpenSenseNova/SenseNova-SI

Highlights:

SenseNova-SI-8B average score: 60.99
Far ahead of open-source models:
    Qwen3-VL-8B: 40.16
    BAGEL-7B: 35.01
    SpatialMLLM: 35.05
    ViLaSR-7B: 36.41
Surpasses closed-source leaders:
    GPT-5: 49.68
    Gemini-2.5-Pro: 48.81

Result:

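SenseNova-SI-8B leads GPT-5 by 11.31 points and Qwen3-VL-8B, the strongest open-source baseline listed, by 20.83 points. A gap of that size points to a qualitative breakthrough rather than an incremental improvement.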
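To make the size of the gap concrete, the short Python sketch below computes how far SenseNova-SI-8B sits ahead of each model in the list above. The scores are copied from this article. The `margins` helper and the variable names are illustrative only and are not part of any SenseTime tooling.

```python
# Average scores as quoted in this article (VSI, MMSI, MindCube, ViewSpatial).
scores = {
    "SenseNova-SI-8B": 60.99,
    "Qwen3-VL-8B": 40.16,
    "BAGEL-7B": 35.01,
    "SpatialMLLM": 35.05,
    "ViLaSR-7B": 36.41,
    "GPT-5": 49.68,
    "Gemini-2.5-Pro": 48.81,
}


def margins(all_scores, reference):
    """Return {model: reference score minus model score}, rounded to 2 decimals.

    Hypothetical helper written for this article, not a library function.
    """
    ref = all_scores[reference]
    return {
        name: round(ref - score, 2)
        for name, score in all_scores.items()
        if name != reference
    }


gaps = margins(scores, "SenseNova-SI-8B")

# Print the closest competitor first.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name:<16} {scores[name]:6.2f}  (SenseNova-SI-8B ahead by {gap:.2f})")
```

On these figures the closest competitor is GPT-5, and every open-source model listed trails by more than 20 points.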

---

Training Paradigm: Leveraging the "Scaling Effect"

The gains stem from SenseTime’s systematic approach to training data and methodology:

Spatial capability classification framework.
Diverse, high-quality prior datasets.
Large-scale spatial understanding data.

SenseTime reports that this is the first time in spatial intelligence research that scaling up high-quality data has been shown to substantially improve spatial reasoning across multiple domains.

Six Spatial Dimensions Enhanced:

Spatial measurement
Spatial reconstruction
Spatial relationships
Viewpoint transformation
Spatial deformation
Spatial reasoning

A technical report detailing the methodology will be released soon.

---

Comparative Examples — GPT-5 vs. SenseNova-SI-8B

Example 1 — Cube Composition

Correct Top-Down View Selection (SITE-Bench)

GPT-5: D
SenseNova-SI-8B: B ✅

---

Example 2 — Motorbike Position (SITE-Bench)

Photographer’s viewpoint — left or right?

GPT-5: A (left)
SenseNova-SI-8B: B ✅ (right)

---

Example 3 — Perspective Reasoning

---

Applied Examples from SITE-Bench & MindCube

Multi-lane Road Scenario — Predicting the Yellow Car’s Next Move

GPT-5: C (stationary)
SenseNova-SI-8B: D ✅ (turn right)

---

Outdoor Scene — Inferring Movement Direction

GPT-5: C
SenseNova-SI-8B: D ✅ (diagonally forward left)

---

Indoor Space — Movement by Object Position

GPT-5: D
SenseNova-SI-8B: A ✅ (diagonally forward left)

---

Object Recognition from Different Angles

GPT-5: B
SenseNova-SI-8B: C ✅ (door)
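The answers in the examples above can be tallied with the usual exact-match scoring for multiple-choice questions. The sketch below is a minimal illustration that assumes the option marked ✅ is the reference answer. Example 3 is left out because no answers are recorded for it. These are hand-picked showcase items, so the tally says nothing about overall benchmark accuracy. The `accuracy` function is a hypothetical helper, not the scoring code of SITE-Bench or MindCube.

```python
# (question, GPT-5 answer, SenseNova-SI-8B answer) as listed in this article.
# Assumption: the answer marked with a check mark is the reference answer.
examples = [
    ("Cube composition", "D", "B"),
    ("Motorbike position", "A", "B"),
    ("Yellow car's next move", "C", "D"),
    ("Outdoor movement direction", "C", "D"),
    ("Indoor movement by object position", "D", "A"),
    ("Object recognition from different angles", "B", "C"),
]


def accuracy(predictions, answers):
    """Fraction of predictions that exactly match the reference answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        return 0.0
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


reference = [sensenova for _, _, sensenova in examples]
gpt5 = [gpt for _, gpt, _ in examples]

print(f"GPT-5:           {accuracy(gpt5, reference):.0%}")
print(f"SenseNova-SI-8B: {accuracy(reference, reference):.0%}")
```

Because SenseNova-SI-8B's answers double as the reference here, its score is 100% by construction. The tally only confirms that GPT-5 misses all six showcased questions.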

---

Impact: World Models & Embodied Intelligence

Spatial intelligence is pivotal for world models and embodied intelligence ecosystems.

Key Initiatives:

“Wu Neng” Embodied Intelligence Platform — powered by “Kai Wu” World Model.
SenseNova-SI complements Kai Wu — enabling multimodal AI to transition from digital environments to physical world tasks.
Open-source EASI Spatial Intelligence Evaluation Platform and Leaderboard:
    https://github.com/EvolvingLMMs-Lab/EASI
    Unified evaluation standards.
    Ongoing tracking of open/closed models.

---

Beyond AI: Linking Spatial Intelligence to Creative Tools

Platforms such as AiToEarn (official website) integrate:

AI-driven content creation.
Cross-platform publishing.
Analytics & monetization.

Supported channels: Douyin, Kwai, WeChat, Bilibili, Rednote, Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, X/Twitter.

Synergy:

Advances in spatial reasoning — combined with automated creation & distribution tools — are shaping the next wave of intelligent applications.

---

Bottom Line:

The debut of SenseNova-SI marks a pivotal step toward AI systems that can truly comprehend and operate in the 3D physical world, accelerating development in robotics, autonomous driving, AR/VR, and beyond.
