Google is launching a new Gemini 2.5 Computer Use model that can interact with the browser like a real user
Google has launched Gemini 2.5 Computer Use, a model that can interact with a browser the way a person does: clicking buttons, scrolling pages, entering text, and performing other user actions. This makes it possible to automate tasks in environments that have no API or where API access is tightly restricted.
The model combines visual recognition with logical reasoning, which lets it follow complex user instructions. For example, Gemini 2.5 Computer Use can fill out and submit an online form, run interface tests, or interact with websites by imitating human actions. Google has already tested similar approaches internally in AI Mode and Project Mariner; the new model makes them publicly available.
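The form-filling example can be pictured as a loop: observe the page, decide on one action, apply it, repeat. The sketch below is only an illustration of that loop, not Google's API. `FakePage`, `scripted_model`, `run_agent`, and the action names are all hypothetical stand-ins, and the "model" is a hard-coded rule rather than a real vision model.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    name: str                      # hypothetical action name, e.g. "type_text"
    args: dict = field(default_factory=dict)


class FakePage:
    """Stand-in for a browser page; a real agent would drive an actual browser."""

    def __init__(self):
        self.fields = {}
        self.submitted = False

    def screenshot(self):
        # A real implementation would return image bytes; here we expose state.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def apply(self, action):
        if action.name == "type_text":
            self.fields[action.args["field"]] = action.args["text"]
        elif action.name == "click" and action.args.get("target") == "submit":
            self.submitted = True


def scripted_model(goal, observation):
    """Hard-coded policy standing in for the model's 'see and reason' step."""
    if "email" not in observation["fields"]:
        return Action("type_text", {"field": "email", "text": goal["email"]})
    if not observation["submitted"]:
        return Action("click", {"target": "submit"})
    return Action("done")


def run_agent(page, goal, max_steps=10):
    """Observe, decide, act, until the policy reports it is done."""
    history = []
    for _ in range(max_steps):
        observation = page.screenshot()
        action = scripted_model(goal, observation)
        history.append(action.name)
        if action.name == "done":
            break
        page.apply(action)
    return history


if __name__ == "__main__":
    page = FakePage()
    print(run_agent(page, {"email": "user@example.com"}))
```

In a real integration the scripted policy would be replaced by a call to the model and `FakePage` by a browser automation layer; the step cap (`max_steps`) is a design choice that keeps a confused agent from looping forever.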
The model outperforms competing systems on several web and mobile benchmarks that test the ability to operate user interfaces. It supports 13 basic actions, including opening tabs, entering text into fields, dragging objects, and navigating between pages. Importantly, the model works only through the browser and has no access to the operating system level, which adds a layer of security.
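One way to see why a fixed, browser-only action set limits risk is an allow-list dispatcher: anything outside the list is rejected before it runs. The sketch below is a minimal illustration under assumptions. The action names are hypothetical and cover only the four actions the article mentions, not the model's actual 13, and `dispatch` and `ActionNotAllowed` are invented for this example.

```python
# Hypothetical names for the browser actions mentioned in the article.
ALLOWED_ACTIONS = {"open_tab", "type_text", "drag", "navigate"}


class ActionNotAllowed(Exception):
    """Raised when a requested action is outside the browser-only allow-list."""


def dispatch(action_name, handlers, **kwargs):
    """Run the handler for an allowed action; refuse everything else."""
    if action_name not in ALLOWED_ACTIONS:
        raise ActionNotAllowed(action_name)
    return handlers[action_name](**kwargs)


def make_logging_handlers(log):
    """Handlers that only record calls; a real executor would drive a browser."""
    def record(name):
        def handler(**kwargs):
            log.append((name, kwargs))
            return name
        return handler
    return {name: record(name) for name in ALLOWED_ACTIONS}


if __name__ == "__main__":
    log = []
    handlers = make_logging_handlers(log)
    dispatch("navigate", handlers, url="https://example.com")
    try:
        dispatch("run_shell", handlers, command="ls")
    except ActionNotAllowed as exc:
        print("rejected:", exc)
    print(log)
```

The check happens before any handler lookup, so an OS-level request such as the hypothetical `run_shell` never reaches the executor at all.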
The model is already available to developers through Google AI Studio and Vertex AI, so they can integrate its capabilities into their own projects. For the general public, the company has opened a demo on Browserbase, where users can watch the model perform various tasks, from playing “2048” to searching for discussions on Hacker News.
The launch of Gemini 2.5 Computer Use is a significant step forward for tools that let AI interact with digital environments: the model combines the flexibility of human actions with the speed and accuracy of machine execution.




