Google Unveils Two New AI Models — Gemini Robotics and Gemini Robotics-ER

Google has introduced two new AI models: Gemini Robotics, a robotics-focused model built on Gemini 2.0, and Gemini Robotics-ER, which features advanced spatial understanding, The Robot Report reports.
Google officials say they have made significant progress in Gemini's ability to solve complex tasks by applying multimodal reasoning across text, images, audio and video. With the new models, these capabilities now extend beyond the digital space into the physical world.
Gemini Robotics is an advanced vision-language-action (VLA) model that adds physical actions as a new output modality, enabling direct control of robots. Gemini Robotics-ER offers advanced spatial understanding, allowing roboticists to run their own programs using the model's embodied reasoning abilities.
Both models pave the way for robots to perform a much wider range of real-world tasks. As part of this effort, Google is partnering with Apptronik to build humanoid robots powered by Gemini 2.0.