ChatGPT's Advanced Voice Mode could gain a new 'Live Camera' feature.
Code sleuths have found references to "live camera" in ChatGPT's code.
Developers have found references to a "Live camera" feature in ChatGPT's code, suggesting that real-time vision capabilities may arrive soon. The strings were spotted in the Advanced Voice Mode section of the ChatGPT v1.2024.317 beta and point to a feature called "Live camera." One string warns users not to rely on the live camera "for real-time navigation or decisions that may affect their health or safety." Another appears to describe how the feature works, telling users they can "tap the camera icon to allow ChatGPT to see and converse about their environment."
Expectations for ChatGPT's vision capabilities have been high since GPT-4o was showcased at an OpenAI event last May. That demonstration showed the model using a phone's or desktop's camera to identify objects and remember details about what it saw; in one highlighted example, it recognized a dog playing with a tennis ball and later recalled that the dog's name was "Bowser."
Since that event, and beyond the early access granted to some testers, little has been shared about GPT-4o's visual capabilities. OpenAI did, however, roll out Advanced Voice Mode to ChatGPT Plus and Team users in September. If the vision feature ships soon, users will finally be able to try both capabilities announced last spring.
Despite reports that its upcoming models are delivering disappointing results, OpenAI has stayed busy. Last month it launched ChatGPT Search, connecting the model to the web for real-time information. The company is also rumored to be developing an agent capable of performing complex tasks on a user's behalf, such as writing code and browsing the internet, with a release reportedly expected in January.