Microsoft presents “Magma”, a new AI agent that can control software and robots

Microsoft has presented “Magma”, an integrated artificial intelligence model that combines visual and language data processing to interact with software and robotic systems.

According to Microsoft, “Magma” is the first AI model that not only analyzes text, images and videos but can also act on them directly – for example, by navigating user interfaces or manipulating physical objects. The project was created in collaboration with researchers from KAIST, the University of Maryland, the University of Wisconsin-Madison, and the University of Washington.

Unlike previous multimodal systems, which relied on separate models for perception and control, “Magma” combines these capabilities in a single architecture. Microsoft positions the model as an important step toward agentic AI that not only perceives its environment but can also independently plan and carry out multi-step tasks.

Magma is based on two key technologies: Set-of-Mark, which helps identify objects for interaction, and Trace-of-Mark, which analyzes video and learns movement patterns.

Thanks to these mechanisms, the model can perform complex tasks, including navigating interfaces and controlling robotic systems. This makes it not just a perception system but a full-fledged multimodal agent capable of acting in the real world.

 
