OpenAI co-founder Ilya Sutskever predicts the end of pre-training in artificial intelligence.
"We have reached peak data, and there will be no more."
Ilya Sutskever, co-founder and former chief scientist of OpenAI, was in the spotlight earlier this year after establishing his own artificial intelligence lab, Safe Superintelligence Inc. Although he has kept a low profile since his departure, he made a rare public appearance in Vancouver during the Conference on Neural Information Processing Systems (NeurIPS).
During his talk, Sutskever stated that “pre-training as we know it will undoubtedly come to an end.” The comment refers to the initial phase of developing an AI model, in which a large language model learns patterns from massive amounts of unlabeled data, typically text from the internet, books, and other sources. He explained that “we have reached peak data, and there will be no more.”
In his speech, Sutskever emphasized that while he believes existing data can still drive the development of AI, the industry is running out of new data sources for training. This shift, he said, will eventually force a transformation in how models are currently trained. Sutskever compared this situation to fossil fuels, noting that just as oil is a limited resource, the internet has a finite amount of content generated by humans: “There’s only one internet.”
Sutskever also predicted that next-generation models will be “real agents.” The term "agent" has gained popularity in the field of AI and is generally understood to mean an autonomous AI system that performs tasks, makes decisions, and interacts with software on its own. In addition to acting as agents, he asserted, future systems will have the capacity to reason. Unlike current AI, which mainly relies on matching patterns seen during training, forthcoming systems will be able to work through problems step by step, more closely resembling human thought.
He noted that the more a system reasons, “the more unpredictable it becomes,” comparing the unpredictability of these “truly reasoning systems” to the surprising capabilities of advanced AIs in games like chess, whose moves become hard to predict even for the best human players. “They will understand things from limited data... they won't get confused,” he said.
At the end of his speech, a member of the audience asked how researchers can establish the right incentive mechanisms for humanity to develop an artificial intelligence that possesses “the freedoms we have as Homo sapiens.” Sutskever responded that such questions deserve deeper reflection and, after a pause, admitted that he did not feel confident in answering, as it would require a “top-down governmental structure.”
One attendee suggested the use of cryptocurrencies, which drew laughter from the room. Sutskever replied: “I don’t feel like the right person to comment on cryptocurrencies, but there’s a possibility that what you describe might occur. Maybe it wouldn’t be a bad outcome for AI to just want to coexist with us and have rights. Perhaps that would be okay... Things are incredibly unpredictable. I hesitate to comment, but I encourage speculation.”