5 Impressive GitHub Projects for AI-Controlled Smartphones

Honghao Wang

25 Nov 2025 — 3 min read

AI-Powered Mobile Automation Overview

Traditionally, automating mobile phone operations required tools like Appium or Airtest, along with detailed knowledge of each app's element locators (`resource-id`, XPath, etc.).

However, any app update that changed these locators would break the automation scripts.

Thanks to large AI models, particularly vision models, controlling a smartphone with AI is now practical. Below are some popular open-source projects that enable AI-driven mobile control.

---

1. MobiAgent — Mobile Intelligent Agent Framework

Developer: IPADS Lab

Purpose: AI agents autonomously operate mobile devices.

Example Tasks

"Find the top-selling men's jeans on Xiaohongshu, search the same product on Taobao, collect brand/name/price, and send them via WeChat."
"Open Ele.me and order a lemon water from Mixue Bingcheng."

How It Works

MobiAgent decomposes tasks into three specialized modules:

Planner — Creates the overall plan.
Decider — Determines where to click next.
Grounder — Locates precise positions on the screen.
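
To make the division of labor concrete, here is a minimal sketch of a three-stage pipeline in the same spirit. It is not MobiAgent's code: the function names `plan`, `decide`, `ground`, and `run` are hypothetical, the "screen" is a toy dictionary standing in for a screenshot, and each stage is a simple stub where a real system would call a model.

```python
from typing import Dict, List, Tuple

# Toy stand-in for a screenshot: visible label -> centre coordinates (assumption).
Screen = Dict[str, Tuple[int, int]]


def plan(task: str) -> List[str]:
    """Planner stub: split the task into ordered sub-goals on commas."""
    return [step.strip() for step in task.split(",") if step.strip()]


def decide(subgoal: str, screen: Screen) -> str:
    """Decider stub: pick the first on-screen label mentioned in the sub-goal."""
    for label in screen:
        if label.lower() in subgoal.lower():
            return label
    raise LookupError(f"no on-screen element matches {subgoal!r}")


def ground(label: str, screen: Screen) -> Tuple[int, int]:
    """Grounder stub: turn the chosen label into exact screen coordinates."""
    return screen[label]


def run(task: str, screen: Screen) -> List[Tuple[int, int]]:
    """Chain the three stages and return the tap coordinates in order."""
    taps = []
    for subgoal in plan(task):
        label = decide(subgoal, screen)
        taps.append(ground(label, screen))
    return taps


if __name__ == "__main__":
    demo_screen = {"Search": (540, 120), "Cart": (900, 2200)}
    print(run("open search, go to cart", demo_screen))
```

Keeping the stages separate means each one can be swapped for a different model size without touching the others.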

Core Components

MobiMind Model Family — Intelligence core with models of varying scales.
AgentRR Acceleration Framework — Optimizes repeated tasks for faster execution.
MobiFlow Benchmark — Standardized scenarios across 10+ mainstream apps for evaluating performance.
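
One common way to speed up repeated tasks is to cache the action trace from the first run and replay it later instead of asking the model again. The sketch below illustrates that general idea only; it is an assumption about the technique, not a description of how AgentRR is implemented, and `ReplayCache` and `plan_fn` are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple

Action = Tuple[str, int, int]  # (kind, x, y), a simplified action record


class ReplayCache:
    """Cache action traces per task so repeated tasks skip the model call."""

    def __init__(self, plan_fn: Callable[[str], List[Action]]) -> None:
        self._plan_fn = plan_fn          # expensive planner, e.g. a model call
        self._traces: Dict[str, List[Action]] = {}
        self.model_calls = 0             # counts how often the planner ran

    def actions_for(self, task: str) -> List[Action]:
        # Normalise case and whitespace so trivial variations share a trace.
        key = " ".join(task.lower().split())
        if key not in self._traces:
            self.model_calls += 1
            self._traces[key] = list(self._plan_fn(task))
        return list(self._traces[key])   # return a copy so callers cannot mutate the cache


if __name__ == "__main__":
    cache = ReplayCache(lambda task: [("tap", 540, 120)])
    cache.actions_for("Open the app")
    cache.actions_for("open  the app")
    print(cache.model_calls)
```

A real system would also need to check that the screen still matches the recorded trace before replaying it.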

Repo:

---

2. Mobile-Agent — Alibaba Open Source

Purpose: AI performs cross-app operations by visually understanding the screen.

Example Task

"Search for Jinan travel guides on Xiaohongshu, sort by favorites, and save the first note."

Key Features

Recognizes text, icons, and buttons visually — no backend API required.
Uses ADB (Android Debug Bridge) for command execution.
Captures the screen after each step so it can check the result and correct its own mistakes.
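
For readers who have not driven a phone over ADB before, the sketch below builds the standard `adb` commands for taking a screenshot, tapping, and swiping. The helper names (`screencap_cmd`, `tap_cmd`, `swipe_cmd`, `run`) are made up for this example and are not taken from Mobile-Agent; only the `adb` arguments themselves are standard.

```python
import subprocess
from typing import List, Optional


def _adb(serial: Optional[str]) -> List[str]:
    """Base command; `-s SERIAL` selects a device when several are attached."""
    return ["adb"] + (["-s", serial] if serial else [])


def screencap_cmd(serial: Optional[str] = None) -> List[str]:
    """Command that writes a PNG screenshot to stdout."""
    return _adb(serial) + ["exec-out", "screencap", "-p"]


def tap_cmd(x: int, y: int, serial: Optional[str] = None) -> List[str]:
    """Command that taps the screen at pixel (x, y)."""
    return _adb(serial) + ["shell", "input", "tap", str(x), str(y)]


def swipe_cmd(x1: int, y1: int, x2: int, y2: int,
              duration_ms: int = 300, serial: Optional[str] = None) -> List[str]:
    """Command that swipes from (x1, y1) to (x2, y2) over duration_ms."""
    return _adb(serial) + ["shell", "input", "swipe",
                           str(x1), str(y1), str(x2), str(y2), str(duration_ms)]


def run(cmd: List[str]) -> bytes:
    """Execute a command and return its stdout (requires adb and a device)."""
    return subprocess.run(cmd, check=True, capture_output=True).stdout


if __name__ == "__main__":
    # Printing only; call run(...) yourself once a device is connected.
    print(tap_cmd(540, 1200))
    print(screencap_cmd())
```

Building the commands as lists and executing them separately keeps the logic easy to test without a phone attached.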

Repo:

---

3. Droidrun — Mobile Automation Agent Framework

Platform: Android & iOS

Stars: 6.2K on GitHub

Concept

AI handles "thinking," while the framework performs actions — no reliance on hard-coded UI elements.

Example Task

"Find next week’s available 2-person apartments in San Francisco and return the cheapest option."

Repo:

---

4. AppAgent — Tencent Open Source

Full Name: Multimodal Agents as Smartphone Users

Goal: Give AI agents human-like perception and interaction skills.

Key Features

Captures screenshots via ADB and sends them to a multimodal AI model.
Decides on actions (tap/swipe) by analyzing the UI elements on screen.
Learns new apps through autonomous exploration or by observing human demonstrations.
Builds a knowledge base from what it learns, so it can operate the app later without relearning.
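
As a rough illustration of the knowledge-base idea, the sketch below stores a short note per UI element during exploration and looks it up on later runs. The `KnowledgeBase` class, its methods, and the JSON layout are all assumptions made for this example; AppAgent's actual storage format may differ.

```python
import json
from typing import Dict, Optional


class KnowledgeBase:
    """Per-app notes about what each UI element does (hypothetical layout)."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, str]] = {}

    def record(self, app: str, element: str, note: str) -> None:
        """Store or overwrite the note learned for one element of one app."""
        self._docs.setdefault(app, {})[element] = note

    def lookup(self, app: str, element: str) -> Optional[str]:
        """Return the stored note, or None if the element is unknown."""
        return self._docs.get(app, {}).get(element)

    def to_json(self) -> str:
        """Serialise the whole knowledge base so it survives between runs."""
        return json.dumps(self._docs, ensure_ascii=False, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "KnowledgeBase":
        """Rebuild a knowledge base from the output of to_json()."""
        kb = cls()
        kb._docs = json.loads(text)
        return kb


if __name__ == "__main__":
    kb = KnowledgeBase()
    kb.record("notes_app", "plus_button", "Opens a new blank note")
    restored = KnowledgeBase.from_json(kb.to_json())
    print(restored.lookup("notes_app", "plus_button"))
```

On a later task, the agent could include the matching notes in its prompt instead of exploring the app again.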

Repo: https://github.com/TencentQQGYLab/AppAgent

---

5. mobile-use — Voice-Controlled Mobile Automation

Stars: 1.8K

Platform: Android & iOS

Developer: Minitap AI Team

How It Works

Captures the current mobile screen.
Sends the screenshot and the user's spoken or text instruction to a multimodal AI model.
The model outputs coordinates or actions (tap/swipe/input).
Executes the action via ADB.
Takes a new screenshot to verify progress, repeating until the task is complete.
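
The steps above form a simple observe-act loop, which can be sketched as follows. This is a generic outline, not mobile-use's implementation: `run_loop`, the three callables it takes, and the dictionary shape of an action (including the `"done"` type) are hypothetical choices for this example.

```python
from typing import Any, Callable, Dict, List

Action = Dict[str, Any]  # e.g. {"type": "tap", "x": 540, "y": 1200}


def run_loop(instruction: str,
             capture: Callable[[], bytes],
             ask_model: Callable[[str, bytes], Action],
             execute: Callable[[Action], None],
             max_steps: int = 20) -> int:
    """Repeat capture -> model -> execute until the model reports completion.

    Returns the number of actions executed. Raises RuntimeError if the task
    is not finished within max_steps, so a confused model cannot loop forever.
    """
    for executed in range(max_steps):
        screenshot = capture()                       # observe the current screen
        action = ask_model(instruction, screenshot)  # let the model choose a step
        if action.get("type") == "done":
            return executed
        execute(action)                              # act, then observe again
    raise RuntimeError(f"task not finished after {max_steps} steps")


if __name__ == "__main__":
    # Scripted stand-ins so the loop can run without a phone or a model.
    script = iter([{"type": "tap", "x": 540, "y": 1200},
                   {"type": "input", "text": "hello"},
                   {"type": "done"}])
    log: List[Action] = []
    steps = run_loop("send a greeting", lambda: b"", lambda i, s: next(script), log.append)
    print(steps, log)
```

Passing the capture, model, and execution steps in as functions makes it easy to swap the model backend or the device driver independently.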

Technical Notes

Integrates the Maestro mobile testing framework for reliable device interaction.
Supports multiple large-model backends: OpenAI API, local models, or other services.

Repo: https://github.com/minitap-ai/mobile-use

---

Why This Matters

The growing ecosystem of AI-driven mobile automation — from AppAgent to mobile-use — is enabling:

Human-like UI workflow learning
Cross-app task execution
Novel productivity tools
Accessibility enhancements

---

Bonus: AiToEarn — AI Content Monetization Platform

For creators and developers looking to combine AI mobile automation with publishing:

AiToEarn official website:
Features:
AI content generation
Cross-platform publishing (Douyin, WeChat, Bilibili, Facebook, Instagram, YouTube, etc.)
Analytics
AI model ranking:

By integrating AI agents for task automation with AiToEarn for multi-platform distribution and monetization, creators can streamline digital productivity and maximize reach.

---

💡 Tip: Bookmark the repos above — these projects are rapidly evolving and could reshape how we interact with mobile devices.
