SenseTime’s SenseNova-SI Open-Source Model Achieves Breakthrough in Spatial Intelligence, Outperforms GPT-5 on Multiple Benchmarks

SenseTime SenseNova-SI: Breakthrough in Spatial Intelligence

Outperforms GPT-5 and Gemini 2.5 Pro — Now Open Source

SenseTime has officially released and open-sourced its SenseNova-SI series, achieving a major breakthrough in spatial intelligence.

In multiple authoritative benchmarks for spatial understanding and reasoning tasks, SenseNova-SI:

  • Significantly surpasses open-source multimodal models of similar scale.
  • Outperforms top closed-source models such as GPT-5 and Gemini 2.5 Pro.

---

Understanding the Spatial Intelligence Gap

While industry-leading large models excel in knowledge, writing, reasoning, and programming, they share a serious weakness: accurately understanding and reasoning about spatial structures.

Spatial intelligence is essential for embodied AI agents interacting with the physical world.

Example:

  • Left image: A complex reasoning problem from The Brain TV show. GPT-5 solves it easily, showing its strength in non-spatial reasoning.
  • Right image: A simple spatial problem a child could solve. GPT-5 fails, picking “A”.
image

Insight:

Spatial understanding is crucial for AI to truly comprehend the 3D world. SenseTime’s innovation directly targets this gap.

---

SenseNova-SI — Benchmark Performance

Model sizes:

  • 2B parameters
  • 8B parameters

Benchmarks: VSI, MMSI, MindCube, ViewSpatial

image

Source: https://github.com/OpenSenseNova/SenseNova-SI

Highlights:

  • SenseNova-SI-8B average score: 60.99
  • Far ahead of open-source models:
      • Qwen3-VL-8B: 40.16
      • BAGEL-7B: 35.01
      • SpatialMLLM: 35.05
      • ViLaSR-7B: 36.41
  • Surpasses closed-source leaders:
      • GPT-5: 49.68
      • Gemini-2.5-Pro: 48.81

Result:

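On the average score, SenseNova-SI-8B leads GPT-5 by 11.31 points and Qwen3-VL-8B, the strongest open-source model listed, by 20.83 points. This is a qualitative breakthrough, not just an incremental improvement.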
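The margins implied by the reported averages can be checked with a few lines of Python. This is only a sketch: the scores are copied from the highlights above, while the dictionary and function names are illustrative and not part of the SenseNova-SI repository.

```python
# Average scores as reported in the highlights above (higher is better).
reported_scores = {
    "SenseNova-SI-8B": 60.99,
    "Qwen3-VL-8B": 40.16,
    "BAGEL-7B": 35.01,
    "SpatialMLLM": 35.05,
    "ViLaSR-7B": 36.41,
    "GPT-5": 49.68,
    "Gemini-2.5-Pro": 48.81,
}


def margins_over(scores, leader):
    """Return {model: leader_score - model_score}, rounded to 2 decimals.

    `leader` is excluded from the result. Both names are hypothetical
    helpers introduced for this illustration.
    """
    lead = scores[leader]
    return {
        name: round(lead - score, 2)
        for name, score in scores.items()
        if name != leader
    }


margins = margins_over(reported_scores, "SenseNova-SI-8B")

# Print the smallest margin first, i.e. the closest competitor.
for name, gap in sorted(margins.items(), key=lambda item: item[1]):
    print(f"{name:<16} +{gap:.2f}")
```

Sorting by margin puts the closest competitor first, which makes it easy to see that the two closed-source models trail by a smaller gap than any of the open-source models listed.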

---

Training Paradigm: Leveraging the "Scaling Effect"

The gains stem from SenseTime’s systematic approach to training data and methodology:

  • Spatial capability classification framework.
  • Diverse, high-quality prior datasets.
  • Large-scale spatial understanding data.

According to SenseTime, this is the first time in spatial intelligence research that scaling up high-quality data has been shown to dramatically boost spatial reasoning ability across multiple domains.

Six Spatial Dimensions Enhanced:

  • Spatial measurement
  • Spatial reconstruction
  • Spatial relationships
  • Viewpoint transformation
  • Spatial deformation
  • Spatial reasoning

A technical report detailing the methodology will be released soon.

---

Comparative Examples — GPT-5 vs. SenseNova-SI-8B

Example 1 — Cube Composition

Correct Top-Down View Selection (SITE-Bench)

  • GPT-5: D
  • SenseNova-SI-8B: B
image

---

Example 2 — Motorbike Position (SITE-Bench)

Photographer’s viewpoint — left or right?

  • GPT-5: A (left)
  • SenseNova-SI-8B: B ✅ (right)
image

---

Example 3 — Perspective Reasoning

image

---

Applied Examples from SITE-Bench & MindCube

Multi-lane Road Scenario — Predicting the Yellow Car’s Next Move

  • GPT-5: C (stationary)
  • SenseNova-SI-8B: D ✅ (turn right)
image

---

Outdoor Scene — Inferring Movement Direction

  • GPT-5: C
  • SenseNova-SI-8B: D ✅ (diagonally forward left)
image

---

Indoor Space — Movement by Object Position

  • GPT-5: D
  • SenseNova-SI-8B: A ✅ (diagonally forward left)
image

---

Object Recognition from Different Angles

  • GPT-5: B
  • SenseNova-SI-8B: C ✅ (door)
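
Multiple-choice items like these are typically scored by exact match against the ground-truth option. The sketch below tallies the examples above under one assumption: the option marked ✅ is the correct answer. Example 1 is left out because no option is marked in the text. The tuple layout and function name are illustrative, and a handful of hand-picked examples is not a benchmark score.

```python
# (question, correct option, GPT-5 answer, SenseNova-SI-8B answer),
# transcribed from the examples above. Assumption: the option marked
# with a check mark in the text is the ground-truth answer.
examples = [
    ("Motorbike position", "B", "A", "B"),
    ("Yellow car's next move", "D", "C", "D"),
    ("Outdoor movement direction", "D", "C", "D"),
    ("Indoor movement by object position", "A", "D", "A"),
    ("Object recognition from different angles", "C", "B", "C"),
]


def exact_match_accuracy(rows, answer_index):
    """Fraction of rows whose answer at `answer_index` equals the correct option."""
    if not rows:
        raise ValueError("no examples to score")
    correct = sum(1 for row in rows if row[answer_index] == row[1])
    return correct / len(rows)


gpt5_accuracy = exact_match_accuracy(examples, 2)
sensenova_accuracy = exact_match_accuracy(examples, 3)

print(f"GPT-5:           {gpt5_accuracy:.0%} of {len(examples)} examples")
print(f"SenseNova-SI-8B: {sensenova_accuracy:.0%} of {len(examples)} examples")
```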

---

Impact: World Models & Embodied Intelligence

Spatial intelligence is pivotal for world models and embodied intelligence ecosystems.

Key Initiatives:

  • “Wu Neng” Embodied Intelligence Platform — powered by “Kai Wu” World Model.
  • SenseNova-SI complements Kai Wu — enabling multimodal AI to transition from digital environments to physical world tasks.
  • Open-source EASI Spatial Intelligence Evaluation Platform and Leaderboard:
      • https://github.com/EvolvingLMMs-Lab/EASI
      • Unified evaluation standards.
      • Ongoing tracking of open/closed models.

---

Beyond AI: Linking Spatial Intelligence to Creative Tools

Platforms like AiToEarn (official website) integrate:

  • AI-driven content creation.
  • Cross-platform publishing.
  • Analytics & monetization.

Supported channels: Douyin, Kwai, WeChat, Bilibili, Rednote, Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, X/Twitter.

Synergy:

Advances in spatial reasoning — combined with automated creation & distribution tools — are shaping the next wave of intelligent applications.

---

Bottom Line:

The debut of SenseNova-SI marks a pivotal step toward AI systems that can truly comprehend and operate in the 3D physical world, accelerating development in robotics, autonomous driving, AR/VR, and beyond.

---
