Google's artificial intelligence enters its 'agent era'.
Google has developed new artificial intelligence prototypes based on its latest model, Gemini 2.0.
I walked through a room filled with shelves of fairly ordinary programming and architecture books. One shelf was slightly out of alignment, and behind it I discovered a hidden room with three television screens displaying famous artworks: The Scream by Edvard Munch, A Sunday Afternoon on the Island of La Grande Jatte by Georges Seurat, and The Great Wave off Kanagawa by Hokusai. “Here are some interesting pieces of art. Is there any particular one you would like to talk about?” asked Bibo Xu, the principal product manager at Google DeepMind for Project Astra. Project Astra, Google's prototype of a universal artificial intelligence agent, replied without hesitation: “We previously discussed the artwork A Sunday Afternoon. Is there any specific detail you would like to comment on, or would you prefer to talk about The Scream?”
I found myself in the extensive Google campus in Mountain View, observing the latest projects from its artificial intelligence lab, DeepMind. One of these was Project Astra, a virtual assistant that was first introduced at Google I/O earlier this year. This assistant, currently available in an app, has the capability to process text, images, videos, and audio in real-time, answering questions about them. It works similarly to Siri or Alexa but feels more natural in conversation, can recognize its surroundings, and also “remembers” past interactions. Today, Google announced that it is expanding the testing program for Project Astra to more users, including trials that utilize prototype glasses, although no release date was provided.
Another previously unannounced test is the AI agent known as Project Mariner. This tool can control your browser through a Chrome extension to complete tasks, although it is still in its early stages and has only just begun testing with a group of “trusted testers.” Project Astra, meanwhile, has finished its initial testing phase, and Google is expanding the number of participants while folding their feedback into new updates. These improvements include better understanding of different accents and uncommon words, up to 10 minutes of memory per session, and reduced latency, in addition to integration into Google products such as Search, Lens, and Maps.
During the demonstrations of both products, Google emphasized that what I was seeing were “research prototypes” not ready for consumers. The demonstrations were tightly controlled, consisting of carefully guided interactions with Google staff. It is still unknown when these systems will reach the public or what form they will take.
Back in the hidden room, while sharing information about The Scream, Project Astra noted that the Norwegian expressionist Edvard Munch produced four versions of this artwork between 1893 and 1910, the most famous being the one painted in 1893. Throughout the conversation, Astra displayed enthusiasm and some awkwardness. “Hi Bibo,” it exclaimed at the start of the demonstration. “Wow, that was really exciting,” Xu replied. “Can you tell me—” But Astra interrupted her: “Was there something about the artwork that was exciting?”
The concept of “agents” has become a rising trend at numerous AI companies, including OpenAI, Anthropic, and Google. Google CEO Sundar Pichai describes them as models that “can understand more about the world around you, think multiple steps ahead, and take action on your behalf, with your supervision.” However, because AI systems are unpredictable, launching them at scale is challenging. Anthropic acknowledged that its new computer-use agent “suddenly took a break” during a programming demonstration.
Agents do not yet seem ready to access sensitive data such as email or banking information. Although they follow instructions, they are vulnerable to prompt injection attacks, in which malicious instructions hidden in a web page or document override the user's intent. Google says it aims to defend against these attacks by prioritizing legitimate instructions from the user.
The demonstrations of the agents presented by Google were low-risk. In the case of Project Mariner, I watched an employee open a recipe in Google Docs and interact with the Chrome extension. When asked to add all the vegetables from the recipe to her Safeway cart, Mariner sprang into action, listing the tasks it was about to perform. The process was slow, though; it seemed it would have been faster to do it myself.
Jaclyn Konzelmann, Google’s director of product management, addressed this slowness: “The big question is, can it do it quickly? Not at this moment, as you can see; it moves quite slowly. That is partly a technical limitation and partly a deliberate design choice, since it is still in its early stages.”
Despite these limitations, Google’s latest announcements, which also included a new artificial intelligence model, Gemini 2.0, are a testament to what is being called the “agent era.” Although no consumer-ready products are on the immediate horizon, it is clear that agents are the big goal companies are chasing with large language models.
Despite the imperfect nature of the Astra and Mariner prototypes, it is still interesting to see them in action. Personally, I am not sure I would trust an AI with important information, but letting it add items to a shopping cart seems like a fairly low-risk task, as long as Google manages to speed up the process.