Wed Oct 30 2024

Finally, we have an 'official' definition for open-source artificial intelligence.

The OSI, the self-proclaimed arbiter of all things open source software, has released its first definition of 'open source' artificial intelligence.

The Open Source Initiative (OSI), an organization established to define and manage the concept of open source, has released version 1.0 of its Open Source AI Definition (OSAID). This document, the result of several years of collaboration with academia and industry, aims to establish a standard that allows individuals to determine whether an AI model can be considered open source.

A key objective of this definition is to help policymakers and AI developers reach a common understanding. Stefano Maffulli, OSI's executive director, emphasized that regulations surrounding AI are under scrutiny, and some entities, like the European Commission, have shown interest in granting special recognition to open source initiatives. To build this consensus, the OSI made a deliberate effort to engage a wide range of stakeholders beyond the usual tech groups.

For an AI model to be classified as open source under the OSAID, it must provide enough information about its design that a person could "substantially" recreate it. Relevant details about the training data, including its origin and how it was processed, must also be disclosed. According to Maffulli, an open source AI model must allow a comprehensive understanding of how it was built, including access to the complete code used for training and data filtering. The definition also establishes rights of use for developers, such as the freedom to use and modify the model without needing prior permission.

Despite the importance of the OSAID, the OSI lacks enforcement mechanisms to pressure developers to adhere to the definition. It does, however, plan to publicly flag models that are described as "open source" but fail to meet the established criteria. Although this approach has produced mixed results so far, the goal is for the AI community to stop recognizing as "open source" models that are not truly so.

Historically, many companies have used the term "open source" ambiguously. Companies like Meta have faced criticism for using this classification to describe their strategies for releasing AI models, which in many cases do not meet the OSAID standards. For example, Meta requires platforms with over 700 million monthly active users to request a special license to use its Llama models. After discussions with the OSI, Google and Microsoft agreed to stop using the term for models that are not completely open, while Meta has maintained its position.

Some startups have also faced criticism, such as Stability AI and Mistral, which have imposed restrictions that limit the commercial use of their models. A study from last August found that many "open source" models are only nominally open: the data needed to train them is not disclosed, and the computing power required to run them is out of reach for many developers.

Opinions on this situation are divided. Meta rejected the criticism and questioned the OSAID definition, arguing that its licensing policies act as safeguards against harmful uses and that it is taking a cautious approach to disclosing information about its models. The company also points out that other efforts to define "open source AI" are underway.

Finally, some experts suggest that the OSAID may not adequately address the issue of licensing training data, which could limit its effectiveness in practice. Maffulli acknowledges that the definition will require updates, and the OSI has formed a committee to oversee its implementation and propose future amendments. This effort is being carried out collaboratively, taking into account the views of various stakeholders in the AI field.