Google Unveils Project Mariner AI Agents for Web Automation

Google has introduced Project Mariner, its first AI agent that can take actions on the web. Unveiled on Wednesday, the research prototype from Google's DeepMind division is powered by the Gemini model and runs directly in the Chrome browser. It controls the cursor, clicks buttons, and fills out forms, which lets it navigate and use websites much as a person would.

Initially, Google is rolling out Project Mariner to a select group of pre-approved testers. The company is exploring new ways for Gemini to read, summarize, and interact with websites, an effort it describes as a "fundamentally new UX paradigm shift." The idea is that users interact less with websites directly and instead let a generative AI system handle tasks on their behalf.

In a demonstration for TechCrunch, Google Labs Director Jaclyn Konzelmann showed how Project Mariner works. After installing an extension in Chrome, users can instruct the AI agent through a chat window. For example, they can ask it to "create a shopping cart from a grocery store based on this list." The agent then navigates to the grocery store's website, such as Safeway's, searches for the items, and adds them to a virtual cart.

However, the agent's performance leaves room for improvement: there was a noticeable delay of about five seconds between cursor movements, and at times the agent paused to ask for clarification on specific items before proceeding. Project Mariner is also designed not to handle sensitive steps such as checking out or entering credit card information, which keeps users in control of their data.

Behind the scenes, the AI agent takes screenshots of the browser window—an action that requires user consent—and sends these images to Gemini in the cloud for processing. The AI then relays instructions back to the user's computer for navigating web pages.
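The screenshot-and-instruction cycle described above can be pictured as a simple loop. The following Python sketch is purely illustrative and is not Google's implementation: the `Action` shape, the `run_agent` function, and the callbacks `capture_screenshot`, `ask_model`, and `perform` are all hypothetical names standing in for the browser extension, the cloud-hosted model, and the local browser controls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One browser instruction returned by the model (hypothetical shape)."""
    kind: str          # e.g. "click", "type", or "done"
    target: str = ""   # description of the element to act on
    text: str = ""     # text to type, if any


def run_agent(
    task: str,
    capture_screenshot: Callable[[], bytes],
    ask_model: Callable[[str, bytes], Action],
    perform: Callable[[Action], None],
    user_consented: bool,
    max_steps: int = 20,
) -> List[Action]:
    """Repeat screenshot -> remote model -> local action until 'done'."""
    # The article notes that taking screenshots requires user consent.
    if not user_consented:
        raise PermissionError("screenshots require user consent")
    history: List[Action] = []
    for _ in range(max_steps):
        image = capture_screenshot()      # observe the browser window
        action = ask_model(task, image)   # stand-in for the cloud call
        if action.kind == "done":
            break
        perform(action)                   # act in the local browser
        history.append(action)
    return history


if __name__ == "__main__":
    # Scripted stand-in for the model so the sketch runs offline.
    scripted = iter([
        Action("click", "search box"),
        Action("type", "search box", "milk"),
        Action("done"),
    ])
    performed = run_agent(
        task="add milk to the cart",
        capture_screenshot=lambda: b"fake-png-bytes",
        ask_model=lambda task, image: next(scripted),
        perform=lambda action: None,
        user_consented=True,
    )
    print([a.kind for a in performed])  # prints ['click', 'type']
```

Because each step in such a loop involves a round trip to a remote model, a design like this would plausibly be slow between actions, which is consistent with the delay observed in the demonstration.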

Project Mariner can assist with tasks such as finding flights and hotels, shopping for household items, and discovering recipes, all of which typically require manual navigation through websites. However, it currently operates only on the active tab of the Chrome browser, which limits what users can do in the browser while it works.

Google DeepMind's Chief Technology Officer, Koray Kavukcuoglu, emphasized that keeping the agent in the active tab allows users to remain aware of what it is doing. "Because [Gemini] is now taking actions on a user’s behalf, it’s important to take this step-by-step," he explained.

Website owners may find some comfort in knowing that Project Mariner operates within the user's browser window, so users are still present on their sites. Even so, the introduction of this AI agent could lead to less direct interaction with websites over time.

In addition to Project Mariner, Google announced several other AI agents designed for specific tasks. One such agent, Deep Research, assists users in exploring complex topics by generating multistep research plans. Another new agent named Jules integrates with GitHub workflows to aid developers with coding tasks.

As Google continues to refine Project Mariner and its other AI agents, it remains unclear when they will be available to a broader audience. If they do reach wide release, they could significantly reshape how users interact with the web.