Fara-7B: An Efficient Small Language Model for Intelligent Agents in Computer Applications

Pushing the Frontiers of Computer‑Use Agents

Introducing Fara‑7B — An Open‑Weight, Ultra‑Compact Model Optimized for Real‑World Web Tasks

---

Background: Microsoft’s Small Language Models (SLMs) Journey

In 2024, Microsoft began delivering small language models (SLMs) to customers, starting with the Phi family of models.

Now, the next milestone: Fara‑7B, our first agentic SLM for direct computer use.

---

What Makes Fara‑7B Different?

Key Highlights

  • Direct computer interaction — operates through visual perception, mouse, and keyboard actions.
  • Compact size — 7B parameters, yet competitive with much larger multi‑model agentic systems.
  • On‑device capability — enables low latency and improved privacy (data stays local).
  • Open‑weight release — encourages experimentation and community feedback.

Practical Applications

Fara‑7B is designed for automating everyday web tasks:

  • Filling out forms
  • Searching information
  • Booking travel
  • Managing accounts

---


Fara‑7B in Action: Demo Videos

Video 1 — Shopping Scenario

  • Task: Purchase an Xbox SpongeBob controller via Magentic‑UI.
  • Behavior: Pauses at each Critical Point for user approval.

Video 2 — Information Retrieval

  • Task: Find and summarize the latest three issues posted on GitHub for `Microsoft/Magentic-UI`.

Video 3 — Multi‑Tool Task

  • Task: Determine driving time between two locations and suggest a nearby cheese shop.
  • Tools used: Bing Maps + Bing Search.

---

Performance and Limitations

Strengths

  • Competitive across common benchmarks and novel evaluation sets (e.g., job postings, price comparisons).

Limitations

  • Accuracy challenges with complex queries
  • Occasional instruction‑following errors
  • Susceptibility to hallucinations

Note: These remain active areas of research.

---

Availability

Fara‑7B is released as an open‑weight model, and integration with Magentic‑UI enables quick experimentation.

---

Technical Development

Synthetic Data Pipeline

Bottleneck: Lack of large‑scale, annotated computer interaction datasets.

Solution: Generate scalable synthetic multi‑step tasks from public websites using Magentic‑One.

Stages:

  • Task Proposal — seeded from themed or random URLs and refined into actionable, multi‑step tasks spanning 11 categories (e.g., booking tickets, applying for jobs) that mirror the WebTailBench benchmark.
  • Task Solving — a multi‑agent system (Orchestrator, WebSurfer, UserSimulator) executes each task step by step.
  • Trajectory Verification — alignment, rubric, and multimodal checks ensure fidelity (see the sketch after this list).
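
To make the loop concrete, here is a minimal sketch of the three stages; every function name and body below is an illustrative placeholder, not Magentic‑One's actual implementation:

```python
# A minimal, illustrative sketch of the three-stage pipeline described above.
# All names and function bodies are placeholders, not Magentic-One internals.

def propose_task(seed_url: str) -> str:
    # Stage 1 (Task Proposal): turn a seed URL into an actionable multi-step task.
    return f"Starting from {seed_url}, find the store's return policy."

def solve_task(task: str) -> list[str]:
    # Stage 2 (Task Solving): in the real system, Orchestrator, WebSurfer, and
    # UserSimulator agents execute browser actions; here we return a stub trace.
    return ["visit_url('https://example.com')", "click(120, 480)"]

def verify_trajectory(task: str, actions: list[str]) -> bool:
    # Stage 3 (Trajectory Verification): alignment, rubric, and multimodal
    # checks in the real pipeline; a trivial non-empty filter here.
    return len(actions) > 0

def generate_example(seed_url: str):
    task = propose_task(seed_url)
    actions = solve_task(task)
    return (task, actions) if verify_trajectory(task, actions) else None

print(generate_example("https://example.com"))
```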

Dataset:

  • 145,000 trajectories
  • 1 million steps across diverse sites and tasks

---

Training Fara‑7B

  • Base model: Qwen2.5‑VL‑7B
  • Input: Browser screenshots + action history + recent user messages
  • Output: Reasoning text plus a tool invocation (`click(x,y)`, `type()`, `visit_url()`, …)
  • Training method: Supervised fine‑tuning on observe–think–act sequences (a hypothetical example follows this list)
  • No reinforcement learning was used.
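
For illustration, a single observe–think–act training example might be shaped like the following; the field names and tool‑call strings are assumptions, not the published data schema:

```python
# Hypothetical shape of one supervised fine-tuning example.
# Field names and tool-call strings are illustrative assumptions.
example = {
    "screenshot": "step_0042.png",                       # current browser screenshot
    "action_history": [
        "visit_url('https://www.example-store.com')",
        "click(512, 310)",                               # clicked the search box
    ],
    "user_message": "Buy an Xbox SpongeBob controller.",
    "target_output": {
        "reasoning": "The search box is focused, so type the product name next.",
        "tool_call": "type('xbox spongebob controller')",
    },
}
print(example["target_output"]["tool_call"])
```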

---

Evaluation

Benchmarks

  • WebVoyager
  • Online‑Mind2Web
  • DeepShop
  • WebTailBench (Microsoft‑created, real‑world tasks)

Testing uses Browserbase for standardized browser sessions.

Result: Fara‑7B leads on WebVoyager and WebTailBench, outperforms comparably sized models overall, and remains competitive with larger systems on Online‑Mind2Web and DeepShop.

Performance Table (task success rate, %)

| Model | WebVoyager | Online‑Mind2Web | DeepShop | WebTailBench |
|----------------------|------------|-----------------|----------|--------------|
| SoM Agent (GPT‑4o) | 65.1 | 34.6 | 16.0 | 30.0 |
| GLM‑4.1V‑9B‑Thinking | 66.8 | 33.9 | 32.0 | 22.4 |
| OpenAI computer‑use | 70.9 | 42.9 | 24.7 | 25.7 |
| UI‑TARS‑1.5‑7B | 66.4 | 31.3 | 11.6 | 19.5 |
| Fara‑7B | 73.5 | 34.1 | 26.2 | 38.4 |

---

Safety Measures

Principles

  • Transparency
  • User control
  • Sandboxed execution

Built‑in Protections:

  • Stops at Critical Points for explicit consent
  • High refusal rates for harmful tasks (82% in WebTailBench‑Refusals)
  • Microsoft red‑teaming on jailbreaks, prompt injections, unsafe outputs

Logging & Auditability: All actions are logged for user review (a sketch of such a consent‑and‑logging gate appears below).
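
As a rough sketch of how a Critical Point gate and action logging could fit together (the action names and gating policy here are assumptions, not Fara‑7B's actual mechanism):

```python
# Illustrative Critical Point gate: irreversible actions require explicit
# user consent, and every decision is logged for later review.
# The action names and policy are assumptions for illustration only.
IRREVERSIBLE = {"submit_payment", "send_message", "delete_account"}
audit_log: list[tuple[str, str]] = []

def execute(action: str) -> None:
    name = action.split("(")[0]
    if name in IRREVERSIBLE:
        consent = input(f"Critical Point: allow {action}? [y/N] ")
        if consent.strip().lower() != "y":
            audit_log.append(("refused", action))
            return
    audit_log.append(("executed", action))

execute("click(512, 310)")                  # routine action, runs immediately
execute("submit_payment(order_id='123')")   # pauses for user approval
print(audit_log)
```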

---

How to Use

Access:

  • Microsoft Foundry
  • Hugging Face (a minimal loading sketch appears after this list)
  • Try it via Magentic‑UI (inference code provided)
  • Download for Copilot+ PCs via the AI Toolkit for VS Code (NPU acceleration supported)
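
For quick local experimentation, loading the weights with Hugging Face transformers might look like the sketch below; the repo id and auto classes are assumptions based on the stated Qwen2.5‑VL base, so consult the official model card for exact usage:

```python
# Hedged sketch: loading the open weights with Hugging Face transformers.
# "microsoft/Fara-7B" and the auto classes are assumptions based on the
# stated Qwen2.5-VL base model; check the model card for the exact recipe.
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "microsoft/Fara-7B"  # assumed Hugging Face repo id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")
```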

---

Looking Ahead

  • Goal: Build stronger on‑device computer‑use agents (CUAs) via improved multimodal base models and reinforcement learning.
  • Early release focuses on community feedback and real‑world experimentation.
  • For contribution opportunities: Open roles at AI Frontiers.

---

Acknowledgements

Thanks to all contributors across engineering, research, and deployment teams who made Fara‑7B possible and brought it to Copilot+ PCs.
