By Jonathan Kemper
Publication Date: 2025-11-29 09:04:00
Microsoft’s new Fara-7B model is a compact AI system built to operate user interfaces purely through visual input. Despite its small size, it aims to keep pace with far more complex systems while running locally on consumer devices.
Fara-7B is based on Alibaba’s Qwen2.5-VL-7B and, according to Microsoft, relies solely on visual information. Instead of tapping into accessibility trees or parsing HTML, it works directly off screenshots of the interface. The model runs in a loop of observing, thinking, and acting, predicting click coordinates or generating keystrokes as needed. It uses the…
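The observe, think, act loop described above can be illustrated with a minimal Python sketch. This is not Microsoft's code or Fara-7B's actual interface: the names `run_agent`, `scripted_predict`, `Click`, `TypeText`, and `Done` are hypothetical, and the model call is replaced by a scripted stand-in so the example runs without any model weights. It only shows the control flow, assuming the model receives a screenshot and returns a short reasoning string plus one action.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union


@dataclass
class Click:
    x: int  # pixel coordinates within the screenshot (assumed convention)
    y: int


@dataclass
class TypeText:
    text: str


@dataclass
class Done:
    """Signals that the (hypothetical) model considers the task finished."""


Action = Union[Click, TypeText, Done]
Step = Tuple[str, Action]  # (model's reasoning text, chosen action)


def run_agent(
    task: str,
    take_screenshot: Callable[[], bytes],
    predict: Callable[[str, bytes, List[Step]], Step],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Step]:
    """Run the loop until the model reports Done or max_steps is reached."""
    history: List[Step] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()             # observe: pixels only
        step = predict(task, screenshot, history)  # think: reason, pick an action
        history.append(step)
        if isinstance(step[1], Done):
            break
        execute(step[1])                           # act: click or type
    return history


def scripted_predict(task: str, screenshot: bytes, history: List[Step]) -> Step:
    """Stand-in for the model: replays a fixed plan and ignores the pixels."""
    plan: List[Step] = [
        ("Focus the search box.", Click(x=640, y=120)),
        ("Enter the query.", TypeText(text=task)),
        ("Nothing left to do.", Done()),
    ]
    return plan[min(len(history), len(plan) - 1)]
```

In a real setup, `take_screenshot` and `execute` would be backed by a browser or desktop automation layer, and `predict` would call the vision-language model. The `max_steps` cap is a design choice of this sketch that keeps a confused agent from looping indefinitely.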