Generative Artificial Intelligence vs. Copyright

What Is the Impact of AI on Intellectual Property Rights?

What impact artificial intelligence will have on humanity in general is not something we at the Norwegian Industrial Property Office (NIPO) can answer. What we can address, however, is the potential impact of artificial intelligence on intellectual property rights.

It is particularly generative AI that has been in the spotlight during the recent wave of attention around artificial intelligence. Generative AI refers to systems capable of producing new, original content, such as text, images, or music. These systems manage this because they are exposed to large volumes of data relevant to the task they are designed for, such as text or image generation, and are programmed to detect patterns and relationships within that data.

The models can then use these patterns and connections when generating new content. A service like ChatGPT, for example, has been trained on massive amounts of text and, through this, has learned the structure and patterns of human language, enabling it to express itself in a remarkably human-like way.
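
As a rough illustration of "learning patterns, then generating", the toy sketch below records which word follows which in a short training text and then samples from those observations to produce a new sequence. It is a deliberately simplified assumption made for illustration: services like ChatGPT use large neural networks, not word-pair tables, and the names `train_bigram_model` and `generate` are hypothetical.

```python
import random
from collections import defaultdict


def train_bigram_model(text):
    """Record, for every word, which words were seen directly after it."""
    words = text.split()
    followers = defaultdict(list)
    for current, following in zip(words, words[1:]):
        followers[current].append(following)
    return dict(followers)


def generate(model, start, max_words=8, seed=0):
    """Build a word sequence by repeatedly sampling an observed follower."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    output = [start]
    # Stop at the length limit or when the last word has no known follower.
    while len(output) < max_words and output[-1] in model:
        output.append(rng.choice(model[output[-1]]))
    return " ".join(output)


# Tiny stand-in for the "massive amounts of text" a real system is trained on.
corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_bigram_model(corpus)
sentence = generate(model, "the")
print(sentence)
```

The generated sentence need not appear anywhere in the training text, yet every word pair in it was observed there. That is the sense in which such a system reuses patterns, not whole works.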

Almost all of the material used to train such AI systems is protected by copyright, and the content they generate often resembles the types of works that have traditionally qualified for copyright protection. This gives rise to several legal questions, which we will explore in more detail in the rest of this article.

Does Training Generative AI on Copyrighted Material Infringe Copyright?

Training AI models, as briefly mentioned earlier, involves a process known as "text and data mining", or TDM. This is a broad term that refers to the automated analysis of large volumes of text and data to identify patterns and connections. Once this process is complete, the AI model can make use of what it has learned to generate new material.

In many cases, some form of copying is an unavoidable part of the TDM process. In principle, this may constitute copyright infringement. However, the purpose of text and data mining is not to reproduce the material, but to extract the underlying ideas and facts. Since copyright is not intended to protect ideas or facts, some find it paradoxical that the process may still amount to an infringement.

Not long after generative AI entered the spotlight, its reliance on copyrighted material for training sparked significant debate. On one hand, some argue that creators and right holders should be compensated when their works are used in AI training. On the other hand, some believe that TDM should be more freely permitted, since the goal is merely to extract ideas and facts, which in themselves are not subject to protection.

Because TDM may involve acts that fall within the scope of copyright, even if the aim is not to reproduce protected content, the EU has introduced specific rules to regulate such use. These rules are particularly relevant for training AI on copyright-protected material.

New EU/EEA Legislation

The EU's 2019 directive on copyright in the digital single market (Directive (EU) 2019/790), known as the Digital Single Market Directive, contains rules on text and data mining (TDM) in Articles 3 and 4. Since TDM is a key component in the training of generative AI, these provisions are highly relevant to the legality of using copyright-protected material for such purposes. The directive is expected to be implemented into Norwegian law sometime in 2025.

The directive makes certain forms of TDM lawful, even though they technically involve copying. Article 3 allows research organisations and cultural heritage institutions to perform TDM for scientific research on content they have lawful access to. Article 4 gives a similar right to others, including commercial actors, provided the right holder has not expressly reserved that use. In the words of Article 4(3): "The exception or limitation provided for in paragraph 1 shall apply on condition that the use of works and other subject matter referred to in that paragraph has not been expressly reserved by their right holders in an appropriate manner, such as machine-readable means in the case of content made publicly available online." In other words, right holders are given an explicit right to object to the use of their works in AI training.

One challenge, however, is that there is currently no common standard for what such a reservation should look like. Since reservations can take many forms, it is difficult for AI systems to detect and respect them. As a result, models may be trained on content without proper authorisation. Whether a harmonised opt-out system will be introduced, and what it might look like, remains to be seen.
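
To make "machine-readable means" more concrete, the sketch below shows how a crawler could check one widely used signal, a site's robots.txt file, before collecting a page. This is an illustration under stated assumptions only: the directive does not name robots.txt, whether such a file counts as a reservation made "in an appropriate manner" is not settled here, and the crawler name `ExampleAIBot` and the helper `may_mine` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt in which a site refuses one named AI crawler
# while leaving the site open to every other crawler.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_mine(robots_txt, crawler_name, url):
    """Return True if the robots.txt rules let this crawler fetch the URL."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(crawler_name, url)


page = "https://example.org/articles/essay.html"
ai_bot_allowed = may_mine(ROBOTS_TXT, "ExampleAIBot", page)
other_bot_allowed = may_mine(ROBOTS_TXT, "SomeSearchBot", page)
print(ai_bot_allowed, other_bot_allowed)
```

With this sample file the named AI crawler is refused and any other crawler is allowed. The difficulty described above remains: a reservation expressed in another form, such as a sentence in a site's terms of use, would not be caught by a check like this.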

In May 2024, the EU adopted the AI Act, which sets out a legal framework for the safe and responsible development and use of AI. The regulation confirms that providers of general-purpose AI models must respect opt-outs made under the 2019 Directive, so that content reserved in this way is excluded from use in AI training.

The AI Act also requires providers of general-purpose AI models to be transparent about the content used during training by publishing a sufficiently detailed summary of it. This helps creators find out when their works have been used to train AI and gives them a basis for making an informed choice about whether to opt out of text and data mining for future AI training.

Recital 106 of the AI Act states that the opt-out right applies even if the training takes place outside the EU. This aims to prevent companies from circumventing EU law by using foreign jurisdictions with weaker protections, but also raises questions about how this aligns with copyright's territorial nature.

Can Anyone Obtain Copyright for What Generative AI Creates?

We have now looked at the training of generative AI and whether this infringes copyright. Another side of the same issue is whether the material generated by AI can obtain copyright protection.

Generative AI can create works that typically fall within the categories protected by copyright, such as text, music, and visual art. But can someone obtain an exclusive right in the form of copyright to these works – and if so, who would be entitled to that right?

Can the AI Itself Obtain Copyright for What It Creates?

Since it is the AI that creates the content, one might first ask whether the AI model itself can obtain copyright to the works it produces. Section 2 of the Norwegian Copyright Act requires a "personal intellectual creation", which suggests that copyright does not apply to works generated by machines. Even though AI is often described as "intelligent", it is ultimately just advanced software. Norwegian legal theory also seems to agree that only humans can be considered authors of copyrightable works. It is therefore safe to conclude that an AI model cannot itself obtain copyright.

Can the Person Who Has Developed the AI Obtain Copyright for What It Creates?

Another question is whether the developer of the AI system could obtain copyright to the output. However, the developer has essentially just programmed a capable algorithm. Beyond that, they have no real connection to the specific works the AI model ends up generating. As mentioned, copyright requires a "personal intellectual creation" behind each work. To claim that the developers of a system like ChatGPT have made such a contribution to each of the millions of texts the system produces daily would be stretching it too far.

Can the Person Who Uses an AI to Create a Work Obtain Copyright?

A third question is whether the user of an AI model can obtain copyright. If the work is generated automatically based on a simple prompt, it will lack the "personal intellectual creation" required by law. However, if the user controls the process and uses the AI merely as a useful tool, providing their own input and specific instructions, this may in some cases form a basis for protection. The distinction lies in whether the AI is used as a tool, or whether the AI is left to do the creative work on its own.

Can Content-Generating AI Infringe Copyright?

Generative AI has learned to generate its own, new works. However, what happens if generative AI produces something that is identical to or very similar to a work that already exists and is protected by copyright?

As we have seen, what AI produces is not simply the result of "copy and paste". Generative AI learns underlying patterns from existing works and uses those patterns to create something new. For instance, ChatGPT has learned the structure of human language and is therefore able to compose new sentences on its own.

If such an AI creates something that resembles another copyrighted work, but the type of expression involved allows for only limited creative choices, it may be a case of coincidental independent creation: a work that is identical or highly similar to another but was created completely independently. If that is indeed the case, it does not constitute copyright infringement.

However, if the AI reproduces something that was included in its training data and the type of work allows for a broad range of creative choices, it would be difficult to show that copyright infringement has not occurred.

Finding the Way Forward

One of the biggest challenges in the interaction between legal regulation and technology is that technology develops much faster than new laws can be adopted. However, with both the EU rules on text and data mining from 2019 and the AI Act from 2024, important steps have now been taken to prepare for the challenges posed by generative AI and copyright. Still, many questions remain unresolved, and many of them will have to be worked out as we go.

The same was true when the internet became widely accessible a few decades ago. This raised major copyright questions and challenges, much like the current explosion in AI. These were addressed through the application of existing regulations and specific adaptations where needed.

For your information, the Ministry of Culture and Equality is the responsible authority for copyright in Norway. However, as a national centre of expertise on intellectual property, we at the Norwegian Industrial Property Office naturally follow developments in the relationship between copyright and AI.