If AI Training Is Theft, Then Everyone’s a Thief
The UK government is weighing changes to its copyright laws, sparking backlash from the creative industries—especially the concerted Make It Fair campaign, which argues that AI training amounts to the unfair, unpaid theft of creative content. Yet if AI training is theft, then every artist, researcher, and historian, indeed every human, is a thief. Humans learn from what other humans produce. AI does the same. So, rather than accept the flawed narrative of AI as theft, the UK should uphold a permissive copyright regime aligned with its goals of driving economic growth and becoming an AI superpower.
Some creators argue that training AI on online copyrighted content without payment is theft and should require a licence. They also say it strips them of control and reflects broader mistreatment of content owners by AI companies. But this view overlooks that human learning has always built on what came before, without the need for permission, payment, or even attribution.
UK copyright has never blocked people from learning from existing works to create something new. It clearly distinguishes between copying and drawing inspiration—a line that fosters creativity, learning, and innovation. Artists, researchers, and musicians have long benefited from this flexibility to build on past work. Even Sir Paul McCartney, active in today’s AI and copyright debate, imitated Little Richard’s vocal style in She Loves You—a move entirely legal under UK law.
AI learning is no different. AI systems that analyse large volumes of publicly available content to identify patterns and generate new, transformative outputs are a continuation—at scale—of this same, long-protected human tradition. The only copying that takes place is in the formation of the training dataset, a necessity that should fall under the same exception that search engines rely on to provide search results. It’s often impossible for humans to trace how specific sources influence a final work. Similarly, AI can’t pinpoint what inspired a particular output, especially since it draws from numerous sources, making individual contributions negligible.
Creators have more to gain from advancing frontier AI than restricting it. When OpenAI launched a new image generator, users began recreating art in the style of animation studio Studio Ghibli, sparking concerns about originality. Yet Ghibli itself draws from the shin-hanga artistic movement, which was inspired by both traditional ukiyo-e art and Western impressionism. Like Ghibli, shin-hanga succeeded by reimagining past creativity in new ways. Rather than devaluing Ghibli’s work, AI reignited interest in its artistic roots—boosting cultural engagement and even raising the parent company’s share value during the “Ghiblification” trend.
McCartney also relied on AI to produce the documentary series The Beatles: Get Back and the 2023 hit single Now and Then, dubbed “the last Beatles song,” which sampled unfinished recordings from late band member John Lennon. The custom-made AI technology isolated Lennon’s vocals from those recordings; McCartney said it was exciting to have the opportunity to still work on Beatles music decades later. The single reached number 1 on the UK Official Singles Chart, the first time the band had topped the chart in 54 years, and became the fastest-selling vinyl single of the century.
Significantly, even with a permissive copyright regime, creators remain protected in areas where copyright law has always protected them. For example, if someone were to use OpenAI’s new tool to develop a film that is both similar in style and narrative to previous Studio Ghibli films, this would easily fall foul of current UK copyright rules because copyright continues to protect rightsholders against profiting off another creator’s success. Training AI does not threaten this ethos.
UK policymakers should resist pressure to reshape copyright law based on the short-term, proprietary concerns of some creators. Instead, they should focus on the broader public interest in ensuring the UK becomes globally competitive in AI. To become an AI leader, the UK should avoid imposing unnecessary barriers on the use of publicly available data for training AI systems. It should reject the House of Lords’ proposed copyright amendments to the Data Use and Access Bill and affirm that text and data mining is permitted for all purposes under the UK’s copyright law. At the same time, the government should continue to enforce existing copyright protections on AI-generated outputs.
The choice before the UK government is an easy one. Using publicly available data to train AI is not theft; it is essential to innovation. If the UK is serious about becoming a global AI leader, it should adopt a forward-looking copyright framework that enables, rather than obstructs, AI development. To start, it should embrace a modern, permissive approach to copyright that protects creative rights without sacrificing the UK’s long-term technological potential.