The EU’s AI Act Creates Regulatory Complexity for Open-Source AI
After years of negotiations, EU policymakers have finally reached an agreement on the Artificial Intelligence Act (AI Act), a new European law to regulate this emerging technology. One point of debate until the final hours was how the EU should address open-source AI—AI models that developers make freely available to the public. EU policymakers were particularly concerned about the impact of the AI Act on open-source AI because many of Europe’s most successful AI startups have released open-source models. While the final text of the AI Act excludes open-source AI from some obligations, these exclusions apply only under certain limited conditions. Combined with the law’s broad scope, the AI Act will significantly impact the development and use of open-source AI in the EU.
The AI Act creates a comprehensive set of rules for providers, deployers, importers, distributors, and product manufacturers of AI systems. The law groups AI systems into four categories: unacceptable risk (e.g., AI systems used for remote real-time biometric identification, subliminal manipulation of individuals, and social scoring), high risk (e.g., AI systems used in essential services like banking, insurance, employment, education, migration, law enforcement, and elections), limited risk (e.g., AI systems intended to interact directly with individuals), and minimal risk (i.e., everything else). The law bans AI systems with unacceptable risk; requires high-risk AI systems to follow certain rules, including ones on data governance, technical documentation, risk monitoring, and impact assessments; and establishes transparency requirements for limited-risk AI systems. Finally, the AI Act creates additional transparency and accountability rules for general-purpose AI (GPAI) models, also known as foundation models.
Open-source AI provides an important pathway for innovation since anyone can freely use and modify the code and data for research and commercial purposes. Developers can contribute to an open-source project or create their own based on an existing one. By sharing code and data, this collaborative process accelerates development, cuts costs, and democratizes access to AI technology. For example, developers can fine-tune open-source AI models for specific applications, thereby facilitating adoption. Open-source AI enables not only research and development on more capable AI models but also progress on other dimensions, such as explainability, bias, safety, and efficiency.
The AI Act has some unique rules for open-source AI. The law makes no exceptions for open-source AI systems when it comes to the bans on AI with unacceptable risk and the restrictions on high-risk AI systems. For other non-GPAI systems, the AI Act does not apply to third parties making open-source AI products publicly available. However, this exemption applies only if they do not monetize their products. As a result, any company that attempts to monetize its open-source AI products, such as by offering paid technical support for the open-source model or using targeted ads to cover costs, would not be able to make use of this exemption. The law also states that open-source developers “should be encouraged to implement widely adopted documentation practices, such as model cards and data sheets” but provides no detail on what this encouragement should look like in practice.
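For readers unfamiliar with these practices, a model card is a short structured document summarizing a model’s intended use, training data, and limitations. The sketch below is a minimal illustration in Python; the field names and the hypothetical model they describe are assumptions for demonstration, not terms defined by the AI Act or any specific standard.

```python
# Minimal, illustrative model card generator. The fields loosely follow the
# structure popularized by "Model Cards for Model Reporting" (Mitchell et al.);
# none of them are mandated or defined by the AI Act itself.

card = {
    "model_name": "example-7b",  # hypothetical model
    "license": "Apache-2.0",     # a free and open-source license
    "intended_use": "Text generation for research and prototyping.",
    "training_data": "Summary of publicly available web text (not the raw data).",
    "evaluation": "Perplexity and standard benchmark scores.",
    "limitations": "May produce inaccurate or biased output.",
}

def render_model_card(fields: dict) -> str:
    """Render the metadata as a simple Markdown document."""
    lines = [f"# Model card: {fields['model_name']}", ""]
    for key, value in fields.items():
        if key == "model_name":
            continue
        lines.append(f"## {key.replace('_', ' ').title()}")
        lines.append(value)
        lines.append("")
    return "\n".join(lines)

print(render_model_card(card))
```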
To complicate matters, the AI Act also exempts AI systems “specifically developed and put into service for the sole purpose of scientific research and development” from its rules. While this exemption is useful for enabling scientific research, AI models produced for academic purposes under an open-source license can then be repurposed for commercial uses. This provision effectively creates a loophole in which AI models produced for scientific purposes evade the safety regulations the EU has created under the belief that such rules are necessary to prevent harm from AI.
As noted previously, the AI Act creates unique rules for GPAI models. Open-source GPAI models do not have to follow the law’s requirement to provide technical documentation about the model to the EU’s AI Office unless they present a systemic risk. Regulators will decide later which models fall into this category, so it may include many open-source models. For now, it includes any AI model trained using more than 10^25 floating-point operations (FLOPs), an arbitrary threshold not based on any specific known risk. Regulators could later decide that an AI model with a certain number of parameters, trained on a certain volume of data, or reaching a certain number of users constitutes a systemic risk. All GPAI models, even those that do not present a systemic risk, must publicly disclose information about the content used for training and put in place a policy to ensure they respect EU copyright law. Finally, the AI Act requires providers of GPAI models with systemic risk, including open-source GPAI models, to appoint an authorized representative to cooperate with the EU’s AI Office and other national authorities.
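To put the 10^25 FLOP threshold in perspective, a common back-of-the-envelope estimate from the scaling-law literature (not from the AI Act) is that training a dense transformer consumes roughly six FLOPs per parameter per training token. The sketch below applies that approximation; the heuristic and the example model sizes are illustrative assumptions, not figures taken from the law or from any actual model.

```python
# Rough training-compute estimate using the common "6 * N * D" approximation
# (about 6 FLOPs per parameter per training token for dense transformers).
# This heuristic comes from the scaling-law literature, not from the AI Act,
# and ignores many real-world factors; the model sizes below are illustrative.

AI_ACT_THRESHOLD_FLOPS = 1e25  # threshold triggering the systemic-risk presumption

def estimated_training_flops(parameters: float, training_tokens: float) -> float:
    """Approximate total training compute for a dense transformer."""
    return 6 * parameters * training_tokens

for params, tokens in [
    (7e9, 2e12),     # a 7B-parameter model trained on 2T tokens
    (70e9, 15e12),   # a 70B-parameter model trained on 15T tokens
    (400e9, 15e12),  # a hypothetical 400B-parameter model trained on 15T tokens
]:
    flops = estimated_training_flops(params, tokens)
    status = "above" if flops > AI_ACT_THRESHOLD_FLOPS else "below"
    print(f"{params / 1e9:.0f}B params x {tokens / 1e12:.0f}T tokens "
          f"~ {flops:.1e} FLOPs ({status} the 1e25 threshold)")
```

Under this approximation, only the largest frontier-scale training runs cross the threshold; most open-source models released to date fall well below it.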
While EU policymakers have tried to address some of the concerns of the open-source community, it is clear that many open-source AI projects will still fall under the AI Act’s rules. In some cases, such as when a company is unilaterally developing an open-source AI model, compliance will be no different than if the company were developing a proprietary AI model. Indeed, there is no reason the rules should favor or penalize open-source business models. However, for open-source AI projects built on the decentralized contributions of individual developers who are not backed by a single company, the complexity of these new rules may make it harder to bring open-source AI to the EU.
Appendix: Key sections from the AI Act relating to open-source AI
Recital 57e
Third parties making accessible to the public tools, services, processes, or AI components other than general-purpose AI models, shall not be mandated to comply with requirements targeting the responsibilities along the AI value chain, in particular towards the provider that has used or integrated them, when those tools, services, processes, or AI components are made accessible under a free and open license. Developers of free and open-source tools, services, processes, or AI components other than general-purpose AI models should be encouraged to implement widely adopted documentation practices, such as model cards and data sheets, as a way to accelerate information sharing along the AI value chain, allowing the promotion of trustworthy AI systems in the Union.
Recital 60i
Software and data, including models, released under a free and open-source license that allows them to be openly shared and where users can freely access, use, modify and redistribute them or modified versions thereof, can contribute to research and innovation in the market and can provide significant growth opportunities for the Union economy. General purpose AI models released under free and open-source licences should be considered to ensure high levels of transparency and openness if their parameters, including the weights, the information on the model architecture, and the information on model usage are made publicly available. The licence should be considered free and open-source also when it allows users to run, copy, distribute, study, change and improve software and data, including models under the condition that the original provider of the model is credited, the identical or comparable terms of distribution are respected.
Recital 60i+1
Free and open-source AI components covers the software and data, including models and general purpose AI models, tools, services or processes of an AI system. Free and open-source AI components can be provided through different channels, including their development on open repositories. For the purpose of this Regulation, AI components that are provided against a price or otherwise monetized, including through the provision of technical support or other services, including through a software platform, related to the AI component, or the use of personal data for reasons other than exclusively for improving the security, compatibility or interoperability of the software, with the exception of transactions between micro enterprises, should not benefit from the exceptions provided to free and open source AI components. The fact of making AI components available through open repositories should not, in itself, constitute a monetization.
Recital 60f
The providers of general purpose AI models that are released under a free and open source license, and whose parameters, including the weights, the information on the model architecture, and the information on model usage, are made publicly available should be subject to exceptions as regards the transparency-related requirements imposed on general purpose AI models, unless they can be considered to present a systemic risk, in which case the circumstance that the model is transparent and accompanied by an open source license should not be considered a sufficient reason to exclude compliance with the obligations under this Regulation. In any case, given that the release of general purpose AI models under free and open source license does not necessarily reveal substantial information on the dataset used for the training or fine-tuning of the model and on how thereby the respect of copyright law was ensured, the exception provided for general purpose AI models from compliance with the transparency-related requirements should not concern the obligation to produce a summary about the content used for model training and the obligation to put in place a policy to respect Union copyright law in particular to identify and respect the reservations of rights expressed pursuant to Article 4(3) of Directive (EU) 2019/790.
Recital 60o
It is also necessary to clarify a procedure for the classification of a general purpose AI model with systemic risks. A general purpose AI model that meets the applicable threshold for high-impact capabilities should be presumed to be a general purpose AI model with systemic risk. The provider should notify the AI Office at the latest two weeks after the requirements are met or it becomes known that a general purpose AI model will meet the requirements that lead to the presumption. This is especially relevant in relation to the FLOP threshold because training of general purpose AI models takes considerable planning which includes the upfront allocation of compute resources and, therefore, providers of general purpose AI models are able to know if their model would meet the threshold before the training is completed. In the context of this notification, the provider should be able to demonstrate that because of its specific characteristics, a general purpose AI model exceptionally does not present systemic risks, and that it thus should not be classified as a general purpose AI model with systemic risks. This information is valuable for the AI Office to anticipate the placing on the market of general purpose AI models with systemic risks and the providers can start to engage with the AI Office early on. This is especially important with regard to general-purpose AI models that are planned to be released as open-source, given that, after open-source model release, necessary measures to ensure compliance with the obligations under this Regulation may be more difficult to implement.
Article 2 (Scope), 5a
This Regulation shall not apply to AI systems and models, including their output, specifically developed and put into service for the sole purpose of scientific research and development.
Article 2 (Scope), 5g
The obligations laid down in this Regulation shall not apply to AI systems released under free and open source licenses unless they are placed on the market or put into service as high-risk AI systems or an AI system that falls under Title II and IV.
Article 28 (Responsibilities Along the AI Value Chain), 2b
The provider of a high risk AI system and the third party that supplies an AI system, tools, services, components, or processes that are used or integrated in a high-risk AI system shall, by written agreement, specify the necessary information, capabilities, technical access and other assistance based on the generally acknowledged state of the art, in order to enable the provider of the high risk AI system to fully comply with the obligations set out in this Regulation. This obligation shall not apply to third parties making accessible to the public tools, services, processes, or AI components other than general-purpose AI models under a free and open license.
The AI Office may develop and recommend voluntary model contractual terms between providers of high-risk AI systems and third parties that supply tools, services, components or processes that are used or integrated in high-risk AI systems. When developing voluntary model contractual terms, the AI Office shall take into account possible contractual requirements applicable in specific sectors or business cases. The model contractual terms shall be published and be available free of charge in an easily usable electronic format.
Article 52c (Obligations for Providers of General Purpose AI Models), -2
The obligations set out in paragraph 1, with the exception of letters (c) and (d), shall not apply to providers of AI models that are made accessible to the public under a free and open license that allows for the access, usage, modification, and distribution of the model, and whose parameters, including the weights, the information on the model architecture, and the information on model usage, are made publicly available. This exception shall not apply to general purpose AI models with systemic risks.
Article 52ca (Authorized Representative), 5
The obligation set out in this article shall not apply to providers of general purpose AI models that are made accessible to the public under a free and open source license that allows for the access, usage, modification, and distribution of the model, and whose parameters, including the weights, the information on the model architecture, and the information on model usage, are made publicly available, unless the general purpose AI models present systemic risks.