StepFun Releases 4B Agent Model: Runs on All Android Devices, One-Click Deployment for DIY Enthusiasts

Honghao Wang

30 Nov 2025 — 3 min read

GELab-Zero: The First Release to Open-Source a GUI Agent Model Together with Its Full Infrastructure and One-Click Deployment

StepFun (阶跃星辰) has officially open-sourced GELab-Zero, a GUI Agent model together with its complete deployment infrastructure.

Performance Highlights

GELab-Zero-4B-preview has set new records on multiple mobile and desktop GUI leaderboards for its size category.
Achieves State-of-the-Art (SOTA) results in GUI understanding and execution.
Outperforms larger models (e.g., GUI-Owl-32B) while delivering better deployment agility.

---

Why GELab-Zero Matters

With AI now built into consumer devices such as smartphones, mobile agents are moving from merely feasible to deployable at scale.

GUI Agents offer:

Adaptation to almost any app based purely on visual understanding.
No vendor-specific changes needed, resulting in minimal integration cost.

StepFun also released AndroidDaily, a benchmark for testing GUI model capabilities on real-world consumer tasks.

---

The Challenge in GUI Agent Development

Running a mobile GUI Agent across devices and OS variations is hard due to:

Fragmented ecosystems.
Complex setups: multi-device ADB, dependency installation, permissions, inference service handling, orchestration, and replay.
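
As one small illustration of the multi-device ADB chore listed above, the sketch below parses the output of the standard `adb devices` command and keeps only the devices that are ready to accept commands. It is not taken from the GELab-Zero codebase; the function names and the sample serial numbers are assumptions made for illustration.

```python
import subprocess


def parse_adb_devices(output: str) -> list[str]:
    """Return the serials of devices reported in the 'device' (ready) state."""
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Device rows have exactly two fields: "<serial> <state>".
        # The header line and daemon start-up messages have more fields.
        if len(parts) == 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def list_ready_devices() -> list[str]:
    """Run `adb devices` (requires adb on PATH) and return the ready serials."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)


# Sample output with made-up serials: one ready emulator, one unauthorized phone.
sample = (
    "List of devices attached\n"
    "emulator-5554\tdevice\n"
    "R58M123ABC\tunauthorized\n"
    "\n"
)
print(parse_adb_devices(sample))  # ['emulator-5554']
```

Devices in states such as `unauthorized` or `offline` are skipped, which is usually what a task dispatcher wants before it sends any commands.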

Solution: Lower the entry barrier so developers can focus on innovation without reinventing the core infrastructure.

---

What's Included in GELab-Zero

Local GUI Agent Model — GELab-Zero-4B-preview
Plug-and-Play Full Inference Infrastructure — handles heavy engineering tasks automatically.
AndroidDaily Benchmark Suite — designed around genuine business scenarios.

Benchmarks Used:

ScreenSpot
OSWorld
MMBench
Android World

These cover:

GUI comprehension
Element locating
Interactive operations

Result: GELab-Zero-4B-preview delivers SOTA performance in its category.

---

Example Scenarios

Scenario 1: Multi-Item Purchase in Food Delivery Apps

Prompt: On Ele.me, buy multiple items of specified categories, specs, and quantities from the nearest Hema Fresh store.

Outcome: Model accurately identifies items and executes repetitive multi-step purchases smoothly.

---

Scenario 2: Claim Enterprise Meal Voucher

Prompt: Navigate within GeiDao to claim a specific “Employee Benefits” voucher.

Outcome: Handles niche app navigation with precision.

---

Scenario 3: Play a Classic Movie Featuring a Specific Actor

Prompt: On Tencent Video, play a classic Jackie Chan action film.

Outcome:

Interprets the subjective term (“classic”).
Closes pop-ups.
Searches within the movie category and selects a top-rated film.

---

Scenario 4: Weekend Activity for Kids

Prompt: Find an activity spot for a child in Beijing.

Outcome:

Searches content platforms.
Evaluates options & recommends “Wanku Adventure” at Beijing Garden Expo Park.
Highlights kid-friendly features.

---

Key Capabilities

Lightweight Local Inference: Run 4B models on consumer-grade hardware with low latency and privacy protection.
One-Click Task Initiation: Unified deployment handles dependencies and device setup automatically.
Multi-Device Task Distribution: Assign tasks across multiple devices and record interaction traces.
Multiple Agent Modes: Supports ReAct closed-loop, multi-agent collaboration, and scheduled tasks.
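
The ReAct closed-loop mode named above can be pictured as an observe, reason, act cycle that also records an interaction trace. The following is a minimal sketch under stated assumptions: `run_react`, `Step`, `Trace`, and the toy three-screen environment are hypothetical names invented here, and the `decide` callback stands in for a real model call. It does not reflect GELab-Zero's actual interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Step:
    observation: str
    thought: str
    action: str


@dataclass
class Trace:
    task: str
    steps: list[Step] = field(default_factory=list)
    done: bool = False


def run_react(
    task: str,
    observe: Callable[[], str],
    decide: Callable[[str, str, list[Step]], tuple[str, str]],
    act: Callable[[str], None],
    max_steps: int = 10,
) -> Trace:
    """Observe, reason, act, and repeat until the policy emits DONE."""
    trace = Trace(task)
    for _ in range(max_steps):
        obs = observe()
        thought, action = decide(task, obs, trace.steps)
        trace.steps.append(Step(obs, thought, action))  # record the trace
        if action == "DONE":
            trace.done = True
            break
        act(action)
    return trace


# Toy environment (hypothetical): three screens reached by tapping "next".
screens = ["home", "search", "player"]
state = {"i": 0}


def observe() -> str:
    return screens[state["i"]]


def decide(task: str, obs: str, history: list[Step]) -> tuple[str, str]:
    # Stand-in for a model call: stop once the player screen is visible.
    if obs == "player":
        return ("target screen reached", "DONE")
    return (f"on {obs}, need to move forward", "TAP_NEXT")


def act(action: str) -> None:
    if action == "TAP_NEXT":
        state["i"] += 1


trace = run_react("open the player", observe, decide, act)
print(trace.done, [s.action for s in trace.steps])
# True ['TAP_NEXT', 'TAP_NEXT', 'DONE']
```

The `max_steps` cap keeps the loop from running forever when the policy never declares the task complete, and the recorded `Trace` is the kind of artifact a replay tool could consume.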

---

AndroidDaily Benchmark

Developed to reflect real-world everyday tasks — beyond productivity apps — covering:

Food & Dining
Travel
Shopping
Housing
Information Consumption
Entertainment

Accuracy:

GELab-Zero-4B-preview scored 73.4% on AndroidDaily across complex mobile scenarios.

---

Dual-Track Evaluation System

1. Static Evaluation

Tests grounding (UI understanding) & action planning.
Dataset: 3,146 actions with step-by-step screenshots.
Measures how accurately the model predicts each step's action and its numeric or text value (e.g., click coordinates, input text).
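
To make the static track concrete, here is one way per-step accuracy could be scored. The matching rules are assumptions borrowed from common GUI grounding practice (a click counts if it lands inside the target element's bounding box, an input counts if the text matches after trimming whitespace). They are not AndroidDaily's documented scoring, and `Action`, `Label`, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Action:
    kind: str                                   # e.g. "click" or "input"
    point: Optional[tuple[int, int]] = None     # predicted click (x, y) in pixels
    text: Optional[str] = None                  # predicted input text


@dataclass
class Label:
    kind: str
    box: Optional[tuple[int, int, int, int]] = None  # left, top, right, bottom
    text: Optional[str] = None


def step_correct(pred: Action, gold: Label) -> bool:
    """Assumed rule: action type must match, then the value must match."""
    if pred.kind != gold.kind:
        return False
    if gold.kind == "click":
        if pred.point is None or gold.box is None:
            return False
        x, y = pred.point
        left, top, right, bottom = gold.box
        return left <= x <= right and top <= y <= bottom
    if gold.kind == "input":
        return (pred.text or "").strip() == (gold.text or "").strip()
    return True  # other action types: a type match is enough in this sketch


def step_accuracy(preds: list[Action], golds: list[Label]) -> float:
    if len(preds) != len(golds):
        raise ValueError("predictions and labels must align step by step")
    if not golds:
        return 0.0
    return sum(step_correct(p, g) for p, g in zip(preds, golds)) / len(golds)


golds = [
    Label("click", box=(100, 200, 300, 260)),
    Label("click", box=(0, 0, 50, 50)),
    Label("input", text="Jackie Chan"),
    Label("click", box=(10, 10, 20, 20)),
]
preds = [
    Action("click", point=(150, 230)),     # inside the box: correct
    Action("click", point=(400, 400)),     # outside the box: wrong
    Action("input", text="Jackie Chan "),  # matches after trimming: correct
    Action("input", text="hello"),         # wrong action type
]
print(step_accuracy(preds, golds))  # 0.5
```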

2. End-to-End Testing

Full-task execution in real or emulated environments.
Covers: Transportation (ride-hailing, navigation, public transit), Shopping & Payment, Social Communication, Content Consumption, and Local Services.
Metric: Overall success rate per scenario.
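
The success-rate metric can be computed from a plain list of task outcomes. The sketch below assumes each end-to-end run is logged as a (scenario, succeeded) pair. That record shape and the sample outcomes are illustrative assumptions, not the benchmark's actual log format or results.

```python
from collections import defaultdict


def success_rates(results: list[tuple[str, bool]]) -> tuple[dict[str, float], float]:
    """Return per-scenario success rates and the overall success rate."""
    totals: dict[str, int] = defaultdict(int)
    wins: dict[str, int] = defaultdict(int)
    for scenario, succeeded in results:
        totals[scenario] += 1
        wins[scenario] += int(succeeded)
    per_scenario = {name: wins[name] / totals[name] for name in totals}
    overall = sum(wins.values()) / sum(totals.values()) if totals else 0.0
    return per_scenario, overall


# Made-up outcomes for illustration only.
results = [
    ("Transportation", True),
    ("Transportation", False),
    ("Shopping & Payment", True),
    ("Local Services", True),
]
per_scenario, overall = success_rates(results)
print(per_scenario["Transportation"], overall)  # 0.5 0.75
```

Reporting both numbers matters because a strong overall rate can hide a scenario in which the agent fails consistently.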

---

Open-Source Access

GitHub: https://github.com/stepfun-ai/gelab-zero
Hugging Face: https://huggingface.co/stepfun-ai/GELab-Zero-4B-preview
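
For readers who want the model files locally, the sketch below fetches them with `snapshot_download` from the `huggingface_hub` package. The repository ID comes from the Hugging Face link above; the target directory and the `download_weights` helper name are assumptions, and the project's README may recommend a different workflow.

```python
from pathlib import Path

# Repository ID taken from the Hugging Face link above.
REPO_ID = "stepfun-ai/GELab-Zero-4B-preview"


def download_weights(local_dir: str = "models/GELab-Zero-4B-preview") -> Path:
    """Download the model repository and return the local folder.

    Requires `pip install huggingface_hub`. The default directory is an
    arbitrary choice for this sketch.
    """
    # Imported lazily so this module loads even if the package is missing.
    from huggingface_hub import snapshot_download

    path = snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
    return Path(path)


# Usage (downloads several gigabytes, so it is left commented out):
# print(download_weights())
```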

---

Monetization & Ecosystem Integration

For creators and developers, AiToEarn complements GELab-Zero by enabling AI-powered content monetization and cross-platform publishing.

Supports simultaneous distribution to:

Douyin, Kwai, WeChat, Bilibili, Rednote
Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, X (Twitter)

Integrated features:

AI generation tools
Cross-platform analytics
AI model ranking

Bottom line: GELab-Zero lowers the deployment barrier for GUI Agents, while tools like AiToEarn empower creators to monetize innovations at scale.
