I tried Browser Use, a competitor to OpenAI's Operator, and it's impressive, although it takes some technical skill to use.
Letting artificial intelligence control your web browsing still requires some effort.
The recent unveiling of Operator, OpenAI’s first artificial intelligence agent, has attracted attention in the task automation world. However, it already has a competitor: Browser Use, a tool that carries out online actions autonomously on the user's behalf. Like Operator, which is powered by OpenAI’s Computer-Using Agent (CUA) model, it can type, search, click, and copy information from websites without the user touching a mouse or keyboard. Unlike Operator, it does not require the $200 monthly subscription for ChatGPT Pro.
Browser Use is free for those with the skills to work with code and APIs. Installation can be challenging for anyone who is not an experienced programmer, but the project has just launched a cloud version that uses OpenAI's GPT-4o model. This option simplifies the process considerably and provides a more user-friendly interface. The cloud version costs $30 and has certain limitations, but it may be more accessible for those who want to automate tasks without delving too deeply into code.
I tested Browser Use on several real-world tasks. In a price comparison exercise, I asked it to search for "MacBook Air M2" on Amazon, Best Buy, and Walmart, and to extract the details of the first five results from each site. The tool handled this well, though it did not find hidden discounts. Its ability to automate price monitoring across different sites was impressive.
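Because the agent works from a plain-language request, a task like the price comparison above is easier to repeat if the request is generated from a template. The following is a minimal sketch in Python, not code from Browser Use itself: the function name build_price_task and its parameters are hypothetical, and the step of handing the resulting text to the agent is left out because it depends on the installed version of the tool.

```python
def build_price_task(product: str, sites: list[str], top_n: int = 5) -> str:
    """Build a plain-language task description for a browsing agent.

    Hypothetical helper for illustration; not part of Browser Use.
    """
    if not sites:
        raise ValueError("at least one site is required")
    if top_n < 1:
        raise ValueError("top_n must be at least 1")

    site_list = ", ".join(sites)
    lines = [
        f'Search for "{product}" on each of these sites: {site_list}.',
        f"On each site, record the first {top_n} listings in the search results.",
        "For every listing, capture the title, the price, and the product URL.",
        "Return the results grouped by site.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # The same request used in the test described above.
    task = build_price_task("MacBook Air M2", ["Amazon", "Best Buy", "Walmart"])
    print(task)
```

Spelling out the sites, the number of results, and the fields to capture leaves the agent less room to guess, and changing the product or the list of stores becomes a one-line edit.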
Another test involved planning a trip: I asked it to find a round-trip flight from New York to London. Browser Use found a British Airways option priced at $750 and presented all the relevant details. This could be very useful for frequent travelers who want to automate fare searches.
In a weather forecast test, I requested a summary of the forecast for New York, including temperature trends and the likelihood of rain. The tool not only retrieved the necessary information but also offered advice on how to dress for the expected conditions.
The main difference between Browser Use and Operator lies in accessibility. Browser Use resembles a "Swiss Army knife" for developers: it offers great flexibility but requires a certain level of technical knowledge. In contrast, Operator functions as a personal assistant that simplifies many tasks, although it is limited in customization and comes at a higher cost.
Browser Use presents challenges, especially in designing requests and in the need to start new interactions, but it offers a powerful platform for those willing to explore and experiment. For those who want a more user-friendly and straightforward option, Operator might be the better choice. In any case, the future of web automation looks promising, and the field is set to keep expanding.