The AIRIA Bill Would Force the Commerce Department to Bite Off More Than It Can Chew
A bipartisan group of senators led by Sen. John Thune (R-SD) and Sen. Amy Klobuchar (D-MN) introduced a bill this week to increase transparency and accountability for high-risk AI applications. The Artificial Intelligence Research, Innovation, and Accountability (AIRIA) Act of 2023 seeks to balance AI innovation and accountability, and while it generally strikes that balance better than several other AI policy proposals, it puts the cart before the horse, requiring the Commerce Department to come up with technical solutions to complex, nontechnical problems that haven’t been fully defined yet.
The bill includes several reasonable ideas. It distinguishes between developers and deployers of AI systems and focuses on incentivizing AI deployers to protect consumers, an important distinction that avoids placing unnecessary burdens on AI developers who cannot reasonably foresee all downstream uses of their software. It also tasks the Government Accountability Office (GAO) with conducting a study to identify barriers to the use of AI in the federal government and recommendations on how to spur public sector AI adoption. Overall, it embodies a light-touch, risk-based approach to AI regulation that other lawmakers would be wise to replicate.
However, the key provision in the bill to create a certification framework for “critical-impact AI” is flawed. This provision takes a solution for AI accountability used in defense contexts and tries to shoehorn it into nondefense contexts. Specifically, the bill tries to emulate the testing, evaluation, validation, and verification (TEVV) framework that the Department of Defense (DOD) uses for reviewing AI systems intended to be deployed in military applications. The bill would require the Department of Commerce to develop TEVV standards for high-impact AI systems intended for nondefense applications, and would have organizations intending to deploy such systems self-certify their compliance.
The first problem is that the scope of application areas that Commerce’s framework is supposed to cover is incredibly broad. The bill defines critical-impact AI systems as those that use or collect biometric data, operate or manage critical infrastructure, are deployed in the criminal justice system, or are used in a way that “poses a significant risk to rights afforded under the Constitution of the United States or safety.” That means Commerce has to create technical standards that are simultaneously able to robustly test an AI system that a judge wants to use to predict recidivism risk, an AI system that an energy company wants to use to optimize energy distribution on the grid, and an AI system an edtech company wants to use to personalize learning in classrooms, to name just three of countless examples. Standardizing for such disparate contexts is not feasible.
Moreover, the objective of this scheme is not clear. Accountability and transparency are means to an end, not an end in themselves, though an elected official’s answer might be that the end goal in all these contexts is to build “trust” with consumers. However, while “building trust” may fly as a public policy goal, creating technical standards would force Commerce to bring analytic clarity to such vague goals. In the context of criminal justice, trust might mean systems are fair and unbiased; in critical infrastructure contexts, trust might mean systems are reliable and safe; and in contexts involving constitutional rights, trust might mean transparency. The crucial term here is “might”—these are complex questions that many other agencies need to answer before the Commerce Department can develop technical standards.
AIRIA is jumping the gun. While there is certainly public pressure to come up with a solution to risks from AI systems, technical solutions cannot address policy problems that haven’t been thoroughly examined or defined. Rushing to establish AI standards without a clear understanding of the nuanced requirements in different sectors risks creating a framework that does not effectively address diverse contexts. Policymakers should first focus on creating well-defined policy objectives and only then consider creating technical standards that align with and support those goals.