
A recent study reveals that ChatGPT and Google Gemini perform poorly at summarizing news.
And it doesn't surprise me.
A recent analysis conducted by the BBC has revealed that several of the most widely used AI chatbots, including ChatGPT, Gemini, Copilot, and Perplexity, fail to accurately summarize news articles. In the test, these systems were asked to summarize 100 news articles from the BBC itself, and 51% of the responses were judged to contain "significant issues." Furthermore, 19% of the responses that cited BBC content contained factual errors, including incorrect statements, figures, and dates.
The study included specific examples of these inaccuracies, such as Gemini incorrectly claiming that the NHS does not recommend vaping as a method to quit smoking. ChatGPT and Copilot, meanwhile, stated that Rishi Sunak and Nicola Sturgeon were still in office, even though both had already stepped down.
Beyond those errors, the report also highlighted that the AI systems struggled to distinguish between fact and opinion, often omitting essential context from their summaries. These results are not exactly alarming given the known problems with news summarization tools, such as the Apple Intelligence mix-ups that led the company to temporarily disable the feature in iOS 18.3, but they are a reminder not to accept what AI presents without skepticism.
The BBC concludes that Microsoft's Copilot and Google's Gemini exhibited more pronounced problems than OpenAI's ChatGPT and Perplexity. While this research does not reveal anything entirely new, it reaffirms long-standing skepticism about the summarization capabilities of these tools and underscores the importance of remaining cautious when consuming information generated by AI chatbots. As AI advances rapidly and new language models launch almost daily, errors are bound to keep appearing. That said, some recent evidence suggests that inaccuracies and "hallucinations" are less frequent in applications like ChatGPT than they were just months ago.
Sam Altman recently remarked that AI is advancing faster than Moore's Law, suggesting that these systems will keep improving in how they perform and interact with the world. For now, though, it is advisable not to rely on artificial intelligence for your everyday news.