
The new artificial intelligence model from DeepMind can assist robots in folding origami and sealing Ziploc bags.
On Wednesday, DeepMind unveiled two new models based on Gemini, which the company claims will lay the groundwork for a new generation of useful robots.
Since its launch at the end of last year, Gemini 2.0 has been integrated into various Google products, including a new AI-based chatbot. Now Google DeepMind is applying that same technology to more innovative projects. On Wednesday, the artificial intelligence lab presented two new Gemini-based models that aim to "lay the groundwork for a new generation of useful robots."
The first of these models, Gemini Robotics, was designed by DeepMind to enable direct control of robots. The company claims that AI systems for robots must excel in three qualities: generality, interactivity, and dexterity. Generality refers to a robot's ability to adapt to new situations, even those not covered in its training. Interactivity is the robot's ability to respond to people and to its environment. Finally, dexterity relates to the fine motor skills that, while routine for humans, are difficult for robots.
"While our previous work showed progress in these areas, Gemini Robotics represents a significant step forward in performance across all these aspects, bringing us closer to general-purpose robots," states DeepMind.
For example, using Gemini Robotics, DeepMind's ALOHA 2 robot is capable of folding origami and sealing a Ziploc bag. This dual-arm robot understands instructions given in everyday natural language. A video shared by Google shows it completing tasks even when it runs into obstacles, such as when a researcher moves the Tupperware container in which the robot was supposed to place the fruit.
Google has also partnered with Apptronik, the company behind the bipedal robot Apollo, to develop the next generation of humanoid robots. At the same time, DeepMind is introducing Gemini Robotics-ER (embodied reasoning). The company says this second model will allow roboticists to run their own programs using Gemini's advanced reasoning capabilities. DeepMind is granting access to the system to "trusted evaluators," including Boston Dynamics, a former subsidiary of Google.