We at the Norwegian Industrial Property Office (NIPO) cannot say what significance AI will have for humanity in general. What we can say something about, however, is the significance artificial intelligence will have for intellectual property rights.
In particular it is generative AI that has been in focus in the AI wave that has washed over the world recently. Generative AI is artificial intelligence that can produce new, original content – such as text, images and music. Such an AI model is able to do this because it is exposed to large amounts of data related to the task it was designed for, such as text or image production, and it is programmed to find patterns and relationships in the data.
The patterns and connections it has learned can then be applied to generate new content. A service such as ChatGPT, for example, has been trained on huge amounts of text and has learned patterns and connections in human language from it, and can thus express itself in a very human-like way.
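The idea of learning patterns from data and then reusing them can be illustrated with a deliberately tiny sketch. This is not how ChatGPT or any real service works; it is a toy model that only counts which word tends to follow which. The function names `learn_bigrams` and `generate` and the sample sentence are made up for illustration.

```python
from collections import Counter, defaultdict


def learn_bigrams(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def generate(followers, start, length):
    """Build a word sequence by always picking the most frequent follower."""
    out = [start]
    for _ in range(length - 1):
        options = followers.get(out[-1])
        if not options:
            break  # no pattern was learned for this word
        out.append(options.most_common(1)[0][0])
    return " ".join(out)


# Hypothetical training text, standing in for "large amounts of data".
corpus = "the cat sat on the mat and the cat sat down"
model = learn_bigrams(corpus)
print(generate(model, "the", 3))
```

The model stores statistics about the text rather than the text itself, but building those statistics still requires reading, and in practice copying, the training material.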
Almost all material on which such an AI is trained is protected by copyright, and the material that the AI in turn produces is material that traditionally has qualified for copyright protection.
Does training generative AI on copyrighted material infringe copyright?
Training of AI models, which we touched upon at the beginning, involves something called "text and data mining" - often abbreviated to "TDM". This is an umbrella term that refers to the automatic process where large sets of text and data are analyzed in order to discover patterns and relationships. Once the TDM process is done, the AI model can apply what it has learned to generate new material.
When carrying out TDM, some form of copying of the material you wish to extract from will often be an inevitable step in the process. Therefore, such extraction in principle infringes copyright. However, the purpose of the text and data mining is not to copy this material, but to extract the ideas and facts behind it. Copyright is not intended to safeguard and protect ideas and facts. Therefore, it is perhaps somewhat paradoxical that one infringes copyright when carrying out text and data mining.
It did not take long after generative AI started making headlines before its use of copyrighted material sparked a major debate. On the one hand, you have those who believe authors should be compensated for the use of their works in AI training. On the other hand, you have those who believe that one should be able to perform TDM and train AI more freely on copyrighted material, since the actual purpose is only to extract pure ideas and facts, which in themselves are not meant to be protected.
As TDM in principle infringes copyright, but does not have as its purpose to copy, the EU has found it necessary to come up with some special rules that regulate exactly this. These rules will have an impact on the training of AI on material protected by copyright.
New EU/EEA legislation
An EU directive from 2019 on copyright in the digital single market, known as the Digital Single Market Directive (Directive (EU) 2019/790), contains rules on TDM in Articles 3 and 4. Since TDM techniques are used to train generative AI, these rules will be of great importance in relation to the legality of training AI on copyrighted material. This directive will shortly be implemented in Norwegian legislation - presumably during 2024.
The rules in this EU directive contain a limitation to copyright, so that text and data mining becomes legal as a starting point. According to Article 3 of the directive, research institutions will be able to freely carry out text and data mining of all copyright-protected material. Article 4 states that other actors, such as commercial enterprises, can carry out text and data mining of all copyright-protected material, as long as the author has not made an explicit reservation against this.
As such, authors have an opportunity in a number of cases to reserve against their work being used in TDM, and therefore also in the training of generative AI. A challenge here, however, is that there are currently no commonly agreed guidelines on what such reservations should look like. Since such reservations can in principle appear in many different ways, it will be a challenge for those who train AI models on copyrighted material to detect them. There is thus a risk that an AI is mistakenly trained on material that is not permitted for training. Time will show whether such guidelines for reservations will be made, and what they will look like.
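To make the detection problem concrete, here is a minimal sketch of how a crawler might check one possible form of machine-readable reservation. It assumes, purely for illustration, that a rights holder expresses the reservation in a robots.txt file; whether this counts as a valid reservation under Article 4 is not settled by anything in this article. The crawler name `ExampleTDMBot`, the function name `tdm_allowed` and the sample file are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def tdm_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: the publisher blocks one named mining crawler
# and allows every other crawler.
ROBOTS = """\
User-agent: ExampleTDMBot
Disallow: /

User-agent: *
Allow: /
"""

url = "https://example.org/article.html"
print(tdm_allowed(ROBOTS, "ExampleTDMBot", url))
print(tdm_allowed(ROBOTS, "OtherBot", url))
```

A check like this only finds reservations written in the one format it knows about. A reservation stated in a site's terms of use, in metadata or in plain prose would go unnoticed, which is why common guidelines matter.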
The EU countries have also recently reached a political agreement on a draft regulation on AI, called the "Artificial Intelligence Act" or "AI Act". This regulation, which will enter into force in 2025 at the earliest, is intended to be a harmonized framework with common rules and safety measures for the development and use of artificial intelligence in a number of areas.
When it comes to copyright, it appears that the AI Act will state that AI providers must respect rights holders' opportunity to reserve against text and data mining under the 2019 Digital Single Market Directive, and thus also against AI training. It has long been assumed that the TDM rules from 2019 will have an impact on AI training, but the AI Act seems to make this explicit. Furthermore, the AI Act requires AI providers to make visible which copyright-protected material has been used in AI training. In this way, authors will be able to find out when their work has been used in AI training, and can then make an informed choice about reserving against TDM as part of AI training on later occasions.
We believe that these new rules from the EU can contribute to a clearer framework when it comes to the relationship between AI training and copyright.
Does anyone get copyright for what generative AI creates?
We have now looked at the training of generative AI and whether this infringes copyright. Another side of the same issue is whether the material created by AI gets copyright.
Generative AI can create works that typically fall within the categories of works protected by copyright, such as text, music and visual art. But will someone get an exclusive right in the form of copyright to these works - and if so, who would get this exclusive right?
Can the AI itself obtain copyright for what it creates?
Since it is, after all, the AI that creates the work, it is perhaps most natural to ask first whether the AI model itself can obtain copyright for the works it creates. Section 2 of the Intellectual Property Act requires that there must have been an "individual creative effort" in order for someone to obtain copyright to a work. This requirement suggests that copyright does not apply to works created by machines. Although they are referred to as having "intelligence", AIs are really just advanced computer programs. In Norwegian legal theory, there also seems to be agreement that only people can create intellectual property. One can therefore safely conclude that an AI model cannot itself obtain copyright.
Can the person who has developed the AI obtain copyright for what it creates?
Another question is whether the person who developed the AI should obtain copyright. But those who have developed the actual AI model have really only programmed a good algorithm. Beyond that, they have no further knowledge of the specific works that the AI model will produce. As mentioned, there must have been an "individual creative effort" for someone to obtain copyright. To say that the developers of a service such as ChatGPT have exerted such an effort in each of the millions of texts that the service generates daily, would probably be stretching it too far.
Does the person who uses an AI to create a work get copyright?
A third question that can be asked in connection with copyright to AI-generated material is whether the user of the AI can obtain copyright. In some cases, just a few keystrokes by the user will generate works. This obviously cannot be said to be an "individual creative effort". But what if users of an AI are extremely specific in what they order from it, are completely aware of what end result they want, and actually use the AI more as an artistic tool than as an end station? Can this justify a right? Here there can probably be room for doubt in some cases.
Can content-generating AI infringe copyright?
Now we have seen that generative AI, after being trained on thousands of already existing works, has learnt to generate its own, new works. But what happens if generative AI produces something identical or highly similar to something that already exists and is protected by copyright?
As we have seen, it is not the case that what AI produces is a result of "cut and paste". Generative AI has learnt the underlying patterns in existing works, and uses this to produce something entirely new. For instance, ChatGPT has become so good at understanding the system in human language that it has learnt to generate new sentences all by itself.
If such an AI creates something similar to another intellectual work, but it is a form of work where there are few creative options to arrive at the final result, the AI work may perhaps be regarded as an accidental double creation. A double creation is a work which is identical or very similar to another, but which has been created completely independently of the other. If it is in fact such a double creation, it will not constitute an infringement of copyright.
But if an AI reproduces something that is included as part of the training material, and there has also been a lot of room for choice, it will take quite a bit more to prove that there has not been an infringement of copyright here.
The road ahead – the road is created as you walk
One of the biggest challenges when it comes to the interaction between legislation and technology is that technology develops much faster than it is possible to adopt new laws. But with both the EU rules on text and data mining from 2019, and the forthcoming AI Act, important steps have now been taken to prepare for the challenges between generative AI and copyright. However, much is still unclear, and some of the way will probably have to be made up as you go.
This was also the case when the internet became commonly available a few decades ago. It raised major copyright questions and challenges, just as the AI explosion has done today. These were resolved both by applying existing regulations and by making special adaptations where there was a need.
For your information, the Ministry of Culture is the department in charge of copyright in Norway. However, as a competence center for intellectual property rights we at NIPO naturally follow the development and relationship between copyright and AI.