StepFun Open-Sources a 4B Agent Model: Runs on All Android Devices, One-Click Deployment for DIY Enthusiasts

GELab-Zero: The First Release to Open-Source a GUI Agent Model Together with Its Full Infrastructure and One-Click Deployment

StepFun (阶跃星辰) has officially open-sourced GELab-Zero, a GUI Agent model released together with a complete deployment infrastructure.

Performance Highlights

  • GELab-Zero-4B-preview sets new records for its size class on multiple mobile and desktop GUI leaderboards.
  • Achieves state-of-the-art (SOTA) results in GUI understanding and task execution.
  • Outperforms larger models (e.g., GUI-Owl-32B) while being lighter and easier to deploy.

---

Why GELab-Zero Matters

With AI now common in consumer devices such as smartphones, mobile agents are moving from merely feasible to deployable at scale.

GUI Agents offer:

  • Adaptation to almost any app based purely on visual understanding.
  • No vendor-specific changes needed, resulting in minimal integration cost.

StepFun also released AndroidDaily, a benchmark for testing GUI model capability on real-world consumer scenarios.

---

The Challenge in GUI Agent Development

Running a mobile GUI Agent across devices and OS variations is hard due to:

  • Fragmented ecosystems.
  • Complex setups: multi-device ADB, dependency installation, permissions, inference service handling, orchestration, and replay.

Solution: Lower the entry barrier so developers can focus on innovation without reinventing the core infrastructure.
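
To give a sense of the multi-device ADB plumbing mentioned above, here is a minimal sketch that lists the devices ready to accept commands. It relies only on the standard `adb devices` command; the function names `parse_adb_devices` and `list_ready_devices` are illustrative and are not part of GELab-Zero.

```python
import subprocess


def parse_adb_devices(output: str) -> list[str]:
    """Return serials of devices that `adb devices` reports as ready."""
    serials = []
    for line in output.splitlines():
        line = line.strip()
        # Skip the header line, blank lines, and adb daemon startup notices.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        # Keep only entries in the "device" state; "unauthorized" and
        # "offline" entries cannot accept commands yet.
        if len(parts) >= 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def list_ready_devices() -> list[str]:
    """Run `adb devices` and parse its output (requires adb on PATH)."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)


if __name__ == "__main__":
    # Sample output with made-up serials, so the sketch runs without a device.
    sample = (
        "List of devices attached\n"
        "emulator-5554\tdevice\n"
        "R58M12345\tunauthorized\n"
        "192.168.1.20:5555\tdevice\n"
    )
    print(parse_adb_devices(sample))  # ['emulator-5554', '192.168.1.20:5555']
```

Filtering on the "device" state matters in practice: a phone that has not yet approved the USB debugging prompt shows up as "unauthorized" and would fail on the first command sent to it.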

---

What's Included in GELab-Zero

  • Local GUI Agent Model — GELab-Zero-4B-preview
  • Plug-and-Play Full Inference Infrastructure — handles heavy engineering tasks automatically.
  • AndroidDaily Benchmark Suite — designed around genuine business scenarios.

Benchmarks Used:

  • ScreenSpot
  • OSWorld
  • MMBench
  • AndroidWorld

These cover:

  • GUI comprehension
  • Element locating
  • Interactive operations

Result: GELab-Zero-4B-preview delivers SOTA performance in its category.


---

Example Scenarios

Scenario 1: Multi-Item Purchase in Food Delivery Apps

Prompt: Buy multiple items of specified categories, specs, and quantities via Ele.me’s nearest Hema Fresh store.

Outcome: Model accurately identifies items and executes repetitive multi-step purchases smoothly.

---

Scenario 2: Claim Enterprise Meal Voucher

Prompt: Navigate within GeiDao to claim a specific “Employee Benefits” voucher.

Outcome: Handles niche app navigation with precision.

---

Scenario 3: Play a Classic Movie Featuring a Specific Actor

Prompt: On Tencent Video, play a classic Jackie Chan action film.

Outcome:

  • Recognizes subjective term (“classic”).
  • Closes pop-ups.
  • Searches within the movie category and selects top-rated film.

---

Scenario 4: Weekend Activity for Kids

Prompt: Find an activity spot for a child in Beijing.

Outcome:

  • Searches content platforms.
  • Evaluates options & recommends “Wanku Adventure” at Beijing Garden Expo Park.
  • Highlights kid-friendly features.

---

Key Capabilities

  • Lightweight Local Inference: Run 4B models on consumer-grade hardware with low latency & privacy protection.
  • One-Click Task Initiation: Unified deployment handles dependencies & device setup automatically.
  • Multi-Device Task Distribution: Assign tasks across multiple devices and record interaction traces.
  • Multiple Agent Modes: Supports ReAct closed-loop, multi-agent collaboration, and scheduled tasks.
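
For readers unfamiliar with the ReAct closed-loop mode, the general pattern is observe, decide, act, repeat. The sketch below illustrates that pattern only; it is not GELab-Zero's implementation. The `Action` type, the action names, and the `observe`/`decide`/`act` callables are all hypothetical stand-ins for a screenshot grabber, the model, and a device controller.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    kind: str           # hypothetical action names: "click", "type", "done"
    argument: str = ""  # target or text, kept as a plain string in this sketch


def run_react_loop(
    task: str,
    observe: Callable[[], str],
    decide: Callable[[str, str, list], Action],
    act: Callable[[Action], None],
    max_steps: int = 20,
) -> list:
    """Observe, decide, act until the model says "done" or the budget runs out."""
    history = []
    for _ in range(max_steps):
        observation = observe()                      # e.g. a screenshot reference
        action = decide(task, observation, history)  # model picks the next action
        history.append(action)
        if action.kind == "done":
            break
        act(action)                                  # e.g. dispatch a tap to the device
    return history


if __name__ == "__main__":
    # Scripted fakes replace the real model and device for illustration.
    screens = iter(["home", "search page", "results"])
    scripted = iter([
        Action("click", "search box"),
        Action("type", "Jackie Chan"),
        Action("done"),
    ])
    executed = []
    trace = run_react_loop(
        "play a film",
        observe=lambda: next(screens),
        decide=lambda task, obs, hist: next(scripted),
        act=executed.append,
    )
    print([a.kind for a in trace])  # ['click', 'type', 'done']
```

The `max_steps` budget is the part that keeps the loop "closed" safely: if the model never emits a terminating action, the run still ends and the recorded history can be replayed for debugging.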

---

AndroidDaily Benchmark

Developed to reflect real-world everyday tasks — beyond productivity apps — covering:

  • Food & Dining
  • Travel
  • Shopping
  • Housing
  • Information Consumption
  • Entertainment

Accuracy:

GELab-Zero-4B-preview scored 73.4% on AndroidDaily across complex mobile scenarios.


---

Dual-Track Evaluation System

1. Static Evaluation

  • Tests grounding (UI understanding) & action planning.
  • Dataset: 3,146 actions with step-by-step screenshots.
  • Measures action prediction accuracy (e.g., clicks, text input).
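
A static evaluation of this kind can be pictured as comparing each predicted action with a reference action. The sketch below is one plausible scoring scheme, not AndroidDaily's actual rules, which the article does not spell out. It assumes a click counts as correct when it lands inside the reference element's bounding box, and that text input must match after trimming whitespace; the `Step` type and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    action: str                                       # "click" or "input" in this sketch
    point: Optional[tuple[int, int]] = None           # predicted click position (x, y)
    box: Optional[tuple[int, int, int, int]] = None   # reference target (left, top, right, bottom)
    text: Optional[str] = None                        # typed text for "input" actions


def step_correct(pred: Step, ref: Step) -> bool:
    """Assumed rule: action types must match, then the parameters are checked."""
    if pred.action != ref.action:
        return False
    if ref.action == "click":
        if pred.point is None or ref.box is None:
            return False
        x, y = pred.point
        left, top, right, bottom = ref.box
        return left <= x <= right and top <= y <= bottom
    if ref.action == "input":
        if pred.text is None or ref.text is None:
            return False
        return pred.text.strip() == ref.text.strip()
    return True  # other action types: matching the type is enough here


def static_accuracy(preds: list, refs: list) -> float:
    """Fraction of steps where the prediction matches the reference."""
    if len(preds) != len(refs):
        raise ValueError("predictions and references must align step by step")
    if not refs:
        return 0.0
    return sum(step_correct(p, r) for p, r in zip(preds, refs)) / len(refs)


if __name__ == "__main__":
    # Made-up steps for illustration; not taken from the real dataset.
    refs = [
        Step("click", box=(100, 200, 300, 260)),
        Step("input", text="Jackie Chan"),
        Step("click", box=(0, 0, 50, 50)),
    ]
    preds = [
        Step("click", point=(150, 230)),
        Step("input", text="Jackie Chan "),
        Step("click", point=(400, 400)),
    ]
    print(f"{static_accuracy(preds, refs):.3f}")  # 0.667
```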

2. End-to-End Testing

  • Full-task execution in real or emulated environments.
  • Covers:
      • Transportation (ride-hailing, navigation, public transit)
      • Shopping & Payment
      • Social Communication
      • Content Consumption
      • Local Services
  • Metric: Overall success rate per scenario.
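
The per-scenario success rate is simple to compute once each end-to-end run is labeled with its scenario and outcome. A minimal sketch follows; the function name `success_rate_by_scenario` and the sample outcomes are illustrative and are not real benchmark results.

```python
from collections import defaultdict


def success_rate_by_scenario(results):
    """Map each scenario to successes / attempts from (scenario, succeeded) pairs."""
    attempts = defaultdict(int)
    successes = defaultdict(int)
    for scenario, succeeded in results:
        attempts[scenario] += 1
        successes[scenario] += int(succeeded)
    return {name: successes[name] / attempts[name] for name in attempts}


if __name__ == "__main__":
    # Made-up outcomes for illustration only.
    runs = [
        ("Transportation", True),
        ("Transportation", False),
        ("Shopping & Payment", True),
        ("Shopping & Payment", True),
        ("Local Services", False),
    ]
    for name, rate in success_rate_by_scenario(runs).items():
        print(f"{name}: {rate:.0%}")
```

Reporting one rate per scenario, instead of a single pooled number, keeps a weak category from being hidden by a strong one that happens to have more tasks.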

---

Open-Source Access

---

Monetization & Ecosystem Integration

For creators and developers, AiToEarn complements GELab-Zero by enabling AI-powered content monetization and cross-platform publishing.

Supports simultaneous distribution to:

  • Douyin, Kwai, WeChat, Bilibili, Rednote
  • Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, X (Twitter)

Integrated features:

  • AI generation tools
  • Cross-platform analytics
  • AI model rankings

Bottom line: GELab-Zero lowers the deployment barrier for GUI Agents, while tools like AiToEarn empower creators to monetize innovations at scale.
