I. Introduction
In November 2022, OpenAI, a charity that later added a for-profit entity to its governance structure,Footnote 1 released a free version of a chatbot system named ChatGPT, which it later supplemented with a paid subscription plan. Since then, ChatGPT’s successors, and other similar products,Footnote 2 have gotten the world talking about their remarkable capabilities. This new wave of commercial chatbots has also prompted a debate on the possibility that humanity may be getting closer to a new, more powerful type of artificial intelligence (AI)Footnote 3 and on all of its potentially disruptive effects on our society – from the job market to education and beyond.
This paper focuses on this new generation of chatbots released commercially at the end of 2022 and during 2023, and more generally discusses all current and future chatbots exploiting Transformer-based large language models (commonly abbreviated as “LLMs”). With the commercial release of these products, people began to wonder whether this impressive technology calls into question our place in the world and whether a future in which humans would be obsolete is approaching faster than we could ever have anticipated.Footnote 4
Yet the fear that these machines will bring about the end of human civilisation as we know it, and other dystopian and eerie scenarios,Footnote 5 obscures the more imminent risks associated with the underlying technology.Footnote 6 Some such risks may already have materialised, and become more severe, because these chatbots have been made available to a wide share of the population via a for-profit model and without any prior action to address users’ AI literacy. And while this technology is still in the “hope and hype” phase, now is the appropriate time for lawyers and policymakers to take a hard look at it and act to steer it towards the common good and away from risks that can already be foreseen or imagined.
This paper explains how the new generation of chatbots works (Section II) and what actual risks for humans, society and the planet appear to be associated with them (Section III). It then looks at how the legal system should respond to such potential risks and discusses possible regulatory choices, with a special focus on the proposal for a European Union (EU) regulation on AI,Footnote 7 currently under discussion (Section IV). Section V concludes.
II. How chatbots using large language models work
To reflect on the risks stemming from new technologies, and on their potentialities as well, we first need to understand their inner workings. Indeed, AI-based systems such as LLMs are often perceived as “black boxes”: users only see that they provide an input to the system, which in turn produces an output. The chatbots considered in this article work exactly in this way: users provide instructions or a question and receive a reply in written form from the chatbot. However, ignorance of how the input is processed and how the output is composed contributes to some of the risks associated with these systems, namely those linked to their distribution amongst a vulnerable and naïve population of users who have not been given any information regarding how the system works. This section offers a brief presentation of the technology powering most of the available commercial products, such as ChatGPT and Bing Chat, drawing from writings in the field of computer science.
At the outset, it needs to be noted that LLMs are classified as a type of “foundation model”. This term was introduced in 2021 to describe AI-based systems that are trained on broad data and can be adapted to a wide range of downstream tasks.Footnote 8 The term “foundation” also means that, although these pieces of software are unfinished, insofar as they need adapting to a specific task, they do provide the basis for many different applications. The term “foundation” therefore conveys the importance of the correct development and deployment of such software. The term has become influential in the literature since its introduction. The European Parliament (EP) has indeed proposed a few amendments to the AI Act to take into account this term and the category of software that it defines. In particular, Article 3(1)(c) AI Act – Parliament Amendments defines a “foundation model” as “an AI system model that is trained on broad data at scale, is designed for generality of output, and can be adapted to a wide range of distinctive tasks”. Although foundation models can be applied to tasks such as vision and robotics, this paper focuses specifically on those powering chatbots and therefore on those enabling natural language processing (commonly abbreviated as “NLP”) by computers.
NLP is an area of computer science as old as the research on “artificial intelligence” itself, if not older.Footnote 9 The idea of using machines to translate texts into different languages, and more generally to process natural language, followed the successful attempts at code cracking during World War II.Footnote 10 Nonetheless, it was not until large numbers of texts became available in a digital format that the performance and potentialities of NLP were significantly augmented.Footnote 11 Through advancements in the field of machine learning and neural networks from the 2010s until today, modern LLMs have thrived and evolved beyond anything that existed beforehand.
In essential terms, chatbots based on LLMs generate texts in response to instructions typed into the chatbot (called “prompts”) by using statistical techniques. The way in which LLMs are trained allows them to compose texts based on the most statistically probable association of words that follow each other in human-generated texts. In order to achieve this, NLP tools of the modern era are powered by three essential features.Footnote 12 The first is the possibility of being trained on immense amounts of text available online and in a digital, machine-readable format, covering the span of human knowledge and the many human ways of retelling our experiences of the world. Secondly, a new way of doing machine learning followed from Google’s introduction of a new type of neural-network architecture called the Transformer,Footnote 13 and of its descendants, such as BERT.Footnote 14 Finally, and crucially, the past few years have also brought advanced computational capabilities thanks to extremely powerful hardware that is able to process a multitude of complex calculations simultaneously.
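The statistical intuition can be made concrete with a deliberately tiny sketch in Python. The toy corpus and the simple frequency count below are invented for illustration only; actual LLMs learn vastly richer patterns through the neural-network machinery described in the following paragraphs rather than through naïve counting.

```python
from collections import Counter, defaultdict

# Invented toy corpus; real training data comprise terabytes of text.
corpus = "the cat sat on the mat . the cat ran away . the cat slept .".split()

# Count how often each word follows each other word in the corpus.
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

# The statistically most probable continuation of a prompt ending in "the":
print(following["the"].most_common(1))  # [('cat', 3)]
```

The principle of predicting a statistically likely continuation remains the same in modern LLMs; what changes is the sophistication of the model that estimates those probabilities.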
The way in which an LLM is able to “understand” and “learn” from a pre-existing text in a digital format, and the way in which it “knows” words in a given language, is extremely complex for a layperson to fully understand, but anyone should be able to grasp the essentials of how these tools work, particularly the lawyers and policymakers who are called to set and apply the rules by which these tools are deployed in society.
In simplified terms, LLMs are composed of a series of building blocks (often several dozen), called encoder or decoder blocks depending on the model, each of which is made of two different layers. Before a text enters these blocks, each word is transformed into an integer number (called a token), which, from that moment on, represents that word within the LLM. The first layer of each block is based on a mechanism called “self-attention”, which assigns each word “some” meaning by linking the word itself to the context in which it is found in the training sentences. In practice, this operation transforms the number representing the word into a matrix of numbers representing the word across all of the positions and combinations with other words that the LLM has encountered in the training data (an operation called “embedding”). The magnitude of the embedding is limited only by the capabilities of the hardware and the data used in the training. It therefore becomes clear why the availability of large amounts of text in a digital format and advanced hardware capabilities have powered the rise of LLMs.
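The notions of “token” and “embedding” can be illustrated with a minimal sketch. The vocabulary, the number of dimensions and the numerical values below are invented for readability and bear no relation to any actual commercial model, which uses tens of thousands of tokens and hundreds or thousands of dimensions.

```python
import numpy as np

# Invented toy vocabulary: each known word is assigned an integer (its token).
vocabulary = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}

# Invented embedding table: one row of numbers per token. Four dimensions
# are used here only for readability.
rng = np.random.default_rng(seed=0)
embedding_table = rng.normal(size=(len(vocabulary), 4))

def embed(sentence: str) -> np.ndarray:
    """Turn a sentence into a matrix of numbers: one row per word."""
    tokens = [vocabulary[word] for word in sentence.split()]
    return embedding_table[tokens]

print(embed("the cat sat on the mat"))  # a 6 x 4 matrix of numbers
```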
The other layer of each building block produces an output based on the input it receives from the preceding block, which constitutes a possible “answer” to the instruction; this output then passes to the following block. In this way, each block further refines the LLM’s “understanding” of the word. When projecting the final output, the LLM uses another neural network to transform the matrix of numbers into a probabilistic calculation of which of the tokens (ie words) it contains should be selected as the reply to the question asked in the chatbot. The token selected as the most appropriate according to these probabilistic calculations is transformed back into the corresponding word and included in the LLM’s natural-language reply. This operation is repeated for every word, or combination of words, that constitutes the natural-language text answering the question asked in the chatbot.
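The final projection step can likewise be sketched in simplified form. The vocabulary and the scores below are invented; real models choose among tens of thousands of tokens and often sample from the probability distribution rather than always picking the single most probable token.

```python
import numpy as np

vocabulary = ["the", "cat", "sat", "on", "mat"]

# Invented scores ("logits") that a model might assign to each token in the
# vocabulary as the continuation of the prompt "the cat sat on the".
logits = np.array([0.5, 0.1, 0.2, 0.3, 2.4])

# A softmax turns the scores into probabilities that sum to one.
probabilities = np.exp(logits) / np.exp(logits).sum()

# The most probable token is selected and mapped back to the word it represents.
next_token = int(np.argmax(probabilities))
print(vocabulary[next_token], round(float(probabilities[next_token]), 2))  # mat 0.67
```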
A modern-day LLM first goes through a pre-training phase, during which it is exposed to an enormous amount of text, spanning different topics and styles. The purpose of this phase is to create an LLM that has a large vocabulary (and the corresponding embedding) that is not tied to a particular field or purpose. At the beginning of pre-training, the model will often fail to predict the statistically most likely word to follow a certain sentence. Errors in prediction are spotted, and the correct result is fed back into the LLM so that it may learn. This operation is repeated millions of times during the pre-training phase. Crucially, this pre-training takes the form of self-supervised learning, which dispenses with human oversight during pre-training and appears to be even more effective than human-supervised training.Footnote 15
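A minimal sketch of this self-supervised training signal follows. The toy text, the invented model predictions and the loss calculation are purely illustrative of how an “error in prediction” is measured; the adjustment of the model’s parameters that follows in real training is omitted.

```python
import numpy as np

# Self-supervised training signal: the "label" is simply the word that
# actually follows in the training text, so no human annotation is needed.
training_text = ["the", "cat", "sat", "on", "the", "mat"]
vocabulary = {word: i for i, word in enumerate(sorted(set(training_text)))}

# Invented model output: a probability for each vocabulary word as the
# continuation of "the cat sat on the" (order: cat, mat, on, sat, the).
predicted = np.array([0.1, 0.1, 0.5, 0.1, 0.2])

# The correct next word is taken from the text itself ("mat").
target = vocabulary["mat"]

# Cross-entropy loss: the "error" that repeated training reduces by
# adjusting the model's internal parameters (not shown here).
loss = -np.log(predicted[target])
print(f"loss before any adjustment: {loss:.2f}")  # 2.30
```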
The technical process through which data have been selected for the training of commercially available chatbots, such as ChatGPT, is generally known to researchers and experts in the field. For example, ChatGPT’s foundation model has been trained on a freely accessible library of texts called the “Common Crawl”.Footnote 16 The data in the Common Crawl have been recorded from webpages since 2008 via tools called “web crawlers” that are able to covertly scrape information from websites without leaving a trace of their presence.Footnote 17 A group of researchers has applied three different filters to the Common Crawl, excluding, for example, documents containing certain offensive words, to produce a “clean” database known as “C4”; its unfiltered English-language counterpart, “C4.EN.NOCLEAN”, contains more than 2.3 TB of text.Footnote 18 In the training of ChatGPT, this “clean”Footnote 19 version of the Common Crawl has been complemented with a few other datasets.Footnote 20 On the one hand, two Internet-based datasets containing books, called “Books1” and “Books2”, whose content is unspecified by OpenAI but has been documented in previous publications, seem to contain a vast array of academic publications from, for example, PubMed, along with material from YouTube and a “mix of fiction and non-fiction books”.Footnote 21 On the other hand, OpenAI added to ChatGPT’s training some Internet-based sources made up of scraped materials, such as English-language Wikipedia pages and another freely available dataset known as “Webtext”.Footnote 22 For the earlier version of ChatGPT released at the end of 2022, we know that all of these datasets were used but also that the training gave more prominence to the “clean” Common Crawl.Footnote 23 Although this explanation may be enough to grasp the technical process of data selection, it provides the layperson, the consumer and society at large with little meaningful information regarding the actual content of the information used to train these LLMs and the effects that such content has on the output of the commercially available chatbot systems.Footnote 24
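The kind of filtering mentioned above can be illustrated schematically. The documents and the blocklist below are invented; the filters actually applied to produce the “clean” corpora are more elaborate (language detection, deduplication, blocklists of offensive words and so on) and are documented in the cited publications.

```python
# Invented documents and blocklist, for illustration only.
crawled_documents = [
    "A recipe for tomato soup with fresh basil.",
    "An offensive rant containing badword1 and badword2.",
    "An encyclopaedia entry about the history of printing.",
]
blocklist = {"badword1", "badword2"}

def is_clean(document: str) -> bool:
    """Keep a document only if it contains no blocked word."""
    words = {word.strip(".,").lower() for word in document.split()}
    return words.isdisjoint(blocklist)

clean_corpus = [doc for doc in crawled_documents if is_clean(doc)]
print(len(clean_corpus))  # 2 of the 3 documents survive the filter
```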
After the pre-training phase, the LLM goes through another phase, called fine-tuning, during which the model can be trained for a specific task – for instance, academic writing or translation. During this phase, pre-trained models undergo a new type of training that uses labelled data taken from a more specific dataset adapted to the task at which the fine-tuning is aimed. In this phase, the datasets include both the inputs and the corresponding desired outputs. This allows the pre-trained model to “learn” to generate outputs that are increasingly similar to the labelled data provided. This operation can also be done via a special machine learning technique called “reinforcement learning”, whereby the model is rewarded when it produces an output that is sufficiently similar to the desired one. During this process, humans can be involved in different ways, including via real-time interaction with the model. The exact data used for fine-tuning are generally not known for most of the commercially available chatbots.Footnote 25
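The difference between pre-training data and fine-tuning data can be sketched as follows. The examples, the crude word-overlap measure standing in for a real reward model and the fixed model output are all invented for illustration.

```python
# Invented fine-tuning examples: unlike pre-training, each example pairs an
# input with the desired output for the task at hand.
fine_tuning_data = [
    {"input": "Translate into French: Good morning",
     "desired_output": "Bonjour"},
    {"input": "Summarise: The court held that the contract was void ...",
     "desired_output": "The court found the contract to be void."},
]

def similarity(a: str, b: str) -> float:
    """Crude word-overlap score standing in for a real reward model."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / max(len(words_a | words_b), 1)

for example in fine_tuning_data:
    model_output = "Bonjour"  # stand-in for the model's actual reply
    reward = similarity(model_output, example["desired_output"])
    # In reinforcement-learning-based fine-tuning, a higher reward nudges the
    # model towards producing similar outputs in the future.
    print(f"reward: {reward:.2f}")
```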
III. Potential risks associated with the development and commercial deployment of chatbots using large language models
This section of the paper attempts to explain the possible negative impacts on society of the new generation of chatbots. At this stage, these risks appear possible, and therefore worthy of consideration, to avoid sleepwalking into a future in which humans are made worse off by the introduction of these new technologies. Some of the risks discussed in this section have already been realised to various degrees. Conversely, some other risks may turn out to be less daunting than they appear now. The objective of this section is to present the risks that seem plausible whilst taking into consideration the current deployment of these tools at scale, and for commercial purposes, amongst an untrained and unprepared population. Section IV will then reflect upon the areas of law that need to address such risks in order to mitigate them or prevent them altogether.
For clarity, the potential risks are divided into (1) risks related to the training of chatbots, (2) risks related to the output of chatbots and, finally, (3) systemic risks associated with the deployment of these tools for commercial purposes.
1. Risks related to the training of chatbots
The first set of risks has to do with the way in which the LLMs that underpin the newly available chatbots are trained. As explained in Section II, the training of LLMs comprises a pre-training phase and a fine-tuning phase. Both phases rely on the possibility of training the language model on very large amounts of data, the content of which is not transparent.Footnote 26 This way of training raises a series of questions.
Firstly, the use of large datasets that have been created starting from libraries of crawled data and then refined using different filters, some of which are unaccounted for, raises the question of whether the information fed to LLMs during training is biased, to the disadvantage of different groups in society. In every system that relies on machine learning techniques, “datasets form the critical information infrastructure underpinning [machine learning] research and development, as well as a critical base upon which algorithmic decision-making operates”.Footnote 27 Notwithstanding the crucial role played by datasets in any machine learning application, such as LLMs, work that relates to datasets is heavily under-incentivised as opposed to work focusing on the development of more efficient algorithms.Footnote 28 Research institutions in the field of computer science, in industry and academia alike, seem to feed into this lack of recognition of the valuable work that would be necessary to curate databases,Footnote 29 worsening the status of this crucial infrastructure for machine learning. As a consequence, publications accompanying new datasets have been found to under-specify the decisions that go into the collection, curation and annotation of datasets,Footnote 30 leading to a lack of transparency, scant reliance on best practices regarding the curation of datasets and little general interest in whether the datasets are reliable in the first place.Footnote 31 In turn, this vicious circle feeds a phenomenon that has been called the “naturalisation of datasets”: as the datasets used for LLMs become increasingly well known and relied upon on a routine basis by industry and researchers, the history behind their creation is lost, “in a manner that ultimately renders the constitutive elements of their formation invisible”.Footnote 32 This lack of documentation of the process and content behind the datasets used in the training of LLMs is alarming per se, and it prompts the question of accountability for the content produced by commercial products such as ChatGPT. Ample literature has documented LLMs’ biases and discriminatory outputs. For example, LLMs have been found to associate the word “Muslim” with violence.Footnote 33
These risks linked to discrimination and exclusion in algorithms and machine learning systems using large and opaque datasets have been widely known for years,Footnote 34 yet very little has been done since to mitigate them or to develop different ways of harnessing the potentialities of AI. In the era of the commercialisation of chatbots exploiting LLMs, it is high time for policymakers and the legal system to find the right tools to prevent discriminatory outputs, starting with streamlining the process of training and the choice of data.
Secondly, another concern is whether the information fed to LLMs via the large datasets used for training may be used at all for such a purpose. At least two problems can be flagged from this perspective. On the one hand, large datasets harbour a real risk of exposing the personal data of unknowing individuals. It has been proven that, under some conditions, it is possible to reverse engineer data present in large datasets used for training in order to extract personal data referring to identifiable individuals.Footnote 35 This is a way by which the personal data of individuals that are available on the Internet can be retrieved by third parties. As has been explained, the fact that such data were already publicly available on the Internet does not in itself amount to authorisation of, or consent to, further processing.Footnote 36 This way of retrieving the personal data of individuals can also expose identified people to harm, or more generally to unwanted attention.Footnote 37 On the other hand, some of the information used to train models, such as the English-language texts used to train LLMs, may have been put on the Internet on the assumption that it would not be used for such a purpose, or it may have been put on the Internet at a time when this particular type of use was not known and thus surely not contemplated by its authors. Although these circumstances do not in themselves demonstrate that the use of this type of text violates existing laws on copyright or authorship or contractual arrangements linked to websites, they do raise the question of how to control for such possibilities if the datasets used are opaque.
Finally, another very alarming risk concerns the exploitation of workers in the Global South, who are called to work on the fine-tuning process by labelling data and performing other tasks related to reinforcement learning. For example, reporting has uncovered such practices being used by OpenAI in Kenya.Footnote 38
In addition to these risks, the legal system and policymakers also need to reckon with the fact that the training of LLMs happens mainly in a non-supervised way and thus without human oversight. This in turn raises the question of how to think about accountability for any possible violation of laws, or harm, caused during or by the training process. More generally, the way in which LLMs work does not leave much room for inquiry and thus reinforces the “black box” model.
2. Risks associated with the outputs of chatbots
The second set of risks that can be identified relates to the content produced by chatbots leveraging LLMs. As was explained previously, the newly released commercial chatbots generate content in the form of text as a response to the instructions input by the user (“prompt”).
In this respect, the first broad question that arises is whether the output of the chatbot may harm humans in any way. A non-specific answer to this question would be that there seem to be instances in which harm is not only a plausible risk but is already established. As mentioned in Section III.1, biases in the training data have translated into discriminatory outputs. As an example of the harm directly caused by chatbots, the Italian Data Protection Authority (DPA) has ordered an urgent temporary limitation on the processing of personal data relating to users located in Italy by the company operating Replika, an AI-powered chatbot generating a “virtual friend”.Footnote 39 The Italian DPA has found, via some tests and other evidence regarding replies generated by Replika, that the chatbot posed risks to minors and, generally speaking, “emotionally vulnerable individuals”. With a similar decision, the Italian DPA has also blocked ChatGPT for a few weeks pending explanations and commitments from OpenAI regarding the processing of personal data of Italian users, especially minors.Footnote 40 It is also worth noting that the draft EU AI Act, currently under discussion, prohibits the commercialisation of AI-powered tools that can manipulate users or otherwise exploit the vulnerabilities of minors and other groups.Footnote 41 These few examples seem sufficient to establish that the output produced by chatbots can harm humans, especially minors or other groups of vulnerable individuals.
Nonetheless, when considering the potential for harm arising out of the new generation of chatbots, attention should also be paid to studies that have highlighted everyone’s risk of exhibiting some vulnerability that can be exploited, including by AI. It has been put forward by literature in the fields of behavioural economics and anthropology that everyone, immersed as we are in an “endless chain of acts of consumption”, becomes a vulnerable consumer.Footnote 42 The overwhelming nature of the demands that the consumer market puts on humans fosters a mindset of scarcity, whereby mental space for certain cognitive tasks is absorbed by other issues, putting consumers in a situation that is structurally vulnerable vis-à-vis their counterparts.Footnote 43 This is particularly true for individuals in situations of poverty or marginalisation, but it remains a valid point for the vast majority of consumers. In such a situation, humans become “disengaged” consumers and “find themselves in vulnerable purchasing situations, not because of particular cognitive failings or socio-demographic characteristics, but because the structure of the consumer markets on which they evolve leads to apathy through obfuscation”.Footnote 44 Based on these ideas, the new generation of chatbots, deployed at scale for commercial purposes, may have the potential to harm everyone, to the extent that they find themselves, at different moments throughout their lifetime and even throughout their day, in a situation of scarcity, disengagement and, thus, vulnerability. In addition, as demonstrated by countless experiments, it is not only children and other vulnerable groups who are potential victims of unethical or illegal marketing that exploits subliminal or similar techniques.Footnote 45 It cannot be assumed that the risks associated with the output of chatbots, especially with respect to their effects on the human mind, can be avoided merely by protecting special categories, such as minors. A deeper and interdisciplinary reflection is needed in order to understand the exact scope of the potential harms for individuals and society at large.
The second set of risks is associated with the accuracy of the outputs. Early experiences with ChatGPT and Bing Chat have yielded mixed results. In some instances, the replies of the chatbots are well written and factually accurate, such as when they are asked to summarise a given paragraph. In others, studies have highlighted gross inaccuracies and plain falsehoods in the replies of ChatGPT.Footnote 46 If allowed to circulate (eg in the form of social media posts), these falsehoods and inaccuracies may raise issues in any sphere of social life, from politics to health and safety. In addition, as mentioned previously, commercial products such as ChatGPT have been deployed at scale and for commercial purposes amongst an untrained population that in vast part has not been prepared to deal with this technology. Moreover, most of the commercial chatbots released so far do not seem to provide sources for their statements. Accordingly, users have no means of verifying whether the information given is reliable. And if a user is required to double-check every piece of information that emerges from the chatbots, their utility may be greatly undermined.
A third broad set of questions, linked to the problems of training and the lack of sources highlighted above, is whether the output of chatbots might interfere with the rights of human authors and creators. In particular, outputs can interfere with copyright and other rights attached to human creativity and can also constitute plagiarism, both in society at large and within the narrower field of education. As has been argued, “ChatGPT’s ability to produce large amounts of plausible-sounding content and to rewrite existing text in different styles, making plagiarism detection near-impossible, may stretch the current system to its limits and undermine trust”.Footnote 47 At the same time, it has been claimed that some LLM-based tools may be able to detect whether a text has been written by another LLM, although doubts remain regarding their efficacy.Footnote 48 Although crucial, detecting such practices would only be the first step towards preserving human creations from the outputs of chatbots without jeopardising the great support that tools such as ChatGPT could provide to authors and creators in general.
And, in this respect, it is also possible that text-generating AI-based tools will augment the creative potential of humans in the same way as other AI-based tools have done in other instances. For example, after an AI-based system had defeated the human world champion in the game of Go, a strategy board game,Footnote 49 human professional Go players started training and playing games against AI-based systems. Ultimately, a player who had been training with AI beat an AI-based system.Footnote 50 Research in cognitive psychology has submitted that training with AI-based computers fostered human Go players’ ability to think outside the box and made them better decision-makers, allowing them eventually to outsmart the machine.Footnote 51 Therefore, it is possible that generative AI tools, such as chatbots, could be used to foster human abilities and skills, “augmenting” our potentialities in a fruitful collaboration between humans and machines. It will then be crucial to correctly recognise and protect the rights of “augmented human creators” as well as to clarify the role and rights of the programmers and owners of the AI tools used and possibly of the AI creator as well.Footnote 52
3. Systemic risks associated with the deployment of the new chatbots at scale for commercial purposes
As was mentioned previously, NLP and machine learning are not new techniques. In particular, chatbots have been deployed for years – for example, in customer support functions. Yet, as mentioned above, the current wave of new chatbots is different. Firstly, the capabilities of the new LLM-powered chatbots are greatly increased thanks to the advances in computing and the vast availability of datasets for training. Although these advances are impressive and should be welcomed, a lot remains to be done with respect to the energy consumption and investment that they require. Secondly, these chatbots are now sold as commercial products, and they have reached a vast and untrained population. As pointed out above, the majority of users possess limited information on how chatbots work and are not aware that the statements made by chatbots do not follow the rationality governing human language and meaning but are instead based merely on statistical reasoning. In addition, the general public knows little or nothing about the training of ChatGPT and similar products and about the parameters and constraints that guide them, although some such information may be accessible to experts in the field. Furthermore, when users get a reply from these chatbots, they are usually not provided with the sources of the information it contains. Thus, confirming the accuracy of a statement may be burdensome for the average user, which, given the lack of AI literacy of most users, will probably lead to the general acceptance of the chatbot’s statements as true.
All of these factors contribute to the potential risks of harm to individuals and society at large. As a consequence, regulators should be able to identify and prevent a series of risks that are linked to these factors. Building on writings from computer science and other disciplines, this paper identifies three categories of such risks.
Firstly, policymakers and lawyers should urgently address the environmental costs of training and operating these chatbots. The impressive escalation in the amount of computing used to train and operate LLMs has a significant environmental impact. As pointed out previously, machine learning is an energy-hungry endeavour, which translates notably into CO2 emissions, one of the main drivers of climate change.Footnote 53 Additional research has also studied the impact of machine learning in general on overall greenhouse gas (GHG) emissions, identifying three different ways in which machine learning gives rise to high levels of GHG emissions: computing-related impacts, the immediate impacts of applying machine learning and system-level impacts.Footnote 54
Additionally, research has shown that institutions and stakeholders in the field of machine learning tend to concentrate on the optimisation of models rather than conducting a full cost–benefit analysis of a new, more powerful technology, in particular with respect to its environmental costs and energy efficiency.Footnote 55 Although we should welcome calls to use AI and machine learning in the context of mitigation and adaptation efforts regarding climate change,Footnote 56 a more holistic approach to the costs and benefits of this technology appears to be an essential first step.Footnote 57 Considerations related to climate change need to inform all policy decisions regarding LLM deployment, and it has to become a pivotal objective, at the policy and legal level, to rein in energy-costly models.
An additional and very pressing negative effect of the environmental costs of machine learning and chatbots in particular is that such costs tend to accrue to disadvantaged groups in society, which are not the same groups that benefit from the financial or social advantages of the technology and are in general subject to many different instances of discrimination and environmental racism.Footnote 58 With the deployment of a new generation of chatbots at scale and the profits generated by their operating companies, the issue of representation of marginalised groups within the decision-making processes leading to ever-bigger models with higher energy consumption levels and emissions should be high on the agenda of policymakers at the national and international level.
Secondly, it is foreseeable that the risks that the new generation of chatbots poses to individuals – identified in the previous sections as discrimination against certain groups, loss of privacy, interference with creative rights, misleading statements and other manipulation risks – will be amplified at the societal level by the sheer number of users of such chatbots. In short, when a chatbot is used by a million users every single day, harm to individuals may become harm to society. Let us imagine that it becomes possible to extract the personal data of individuals from the replies of one of these chatbots.Footnote 59 If this happens to one person, it is a data breach and a privacy intrusion relating to that individual. If it happens to millions of individuals, the problem becomes a cybersecurity issue and needs to be addressed at the societal level. Similar reasoning can be applied to all of the above risks.
Finally, a pressing systemic issue is the disruptive effect that these chatbots may have on many of the fundamental social institutions that underpin liberal democracies: the job market, the education system, the political system and the maintenance of free competition. The increasing availability of commercial products running on LLMs, which can generate output that is overall as good as human output, may prompt companies to reduce their number of employees.Footnote 60 Similarly, education institutions across all grades may find it difficult to continue to teach and assess students within the traditional curriculum when students have access to these tools.Footnote 61 Chatbots may also produce social media posts or other types of scripts that convey false or misleading information and can be disseminated at scale amongst the population, with the potential to disrupt democratic processes and free elections.Footnote 62
All of these risks of disruption must be taken into account by policymakers and correctly addressed using old and new legal tools to allow society to benefit from – rather than be overwhelmed by – chatbots running on LLMs.
IV. Large language models and the law
As explained in Section III, the new generation of chatbots and generative AI in general have had, and will continue to have, significant repercussions across many different sectors of society and, consequently, many different subfields of law. At the time of writing, policymakers and regulators at the national and supranational level are debating and putting forward ideas regarding whether and how they should regulate generative AI.Footnote 63 The remainder of this paper provides some reflections on how policymakers can think about the law as a means within this endeavour and considers the solutions currently adopted in the EU AI Act – Parliament Amendments.
1. Old and new questions for the law
A first issue that has arisen in the policy debate at the national level about the best way to regulate generative AI is whether new regulation is necessary at all and whether new regulation could hamper innovation, putting the national economy at a disadvantage as compared to other countries that may let the new technology run free of regulation. However, from a legal point of view, this should not be the first question that policymakers address. On the contrary, there should first be a reflection on which existing, presently enforceable laws are relevant and applicable to generative AI.Footnote 64
Along this line, in many jurisdictions there is a significant body of enforceable legal rules that should be relevant and applicable to many aspects of the deployment of chatbots. From this perspective, these amazingly disruptive tools do not raise disruptive legal questions but rather old ones. For example, as discussed earlier, the Italian DPA has applied the General Data Protection Regulation (GDPR) to the chatbots Replika and ChatGPT and found them in breach of those existing rules.Footnote 65 Another relevant area of law is competition law. As has been maintained, the introduction of the Transformer, the architecture underpinning LLMs, has led to a concentration of the required immense computing power in the hands of a few companies or states around the world.Footnote 66 A very topical question is how to ensure that such a concentration of the means of computing does not lead to a fragmentation of the market and the creation of an oligopoly that prevents smaller players from accessing it. Existing competition rules should apply to such situations and to the behaviour of the enterprises involved. For example, China’s policymakers are launching initiatives to allocate computing power to different services and areas in order to exploit the country’s existing resources efficiently and to allow different market players to access state-owned or privately owned computers.Footnote 67 In a neighbouring field, rules of consumer protection should also be relevant to products such as ChatGPT.Footnote 68 Finally, rules that seek to protect individuals and groups from discrimination should be fully applicable and applied whenever generative AI is used.Footnote 69
Undoubtedly, existing laws that are relevant to generative AI will need to be adapted and tweaked, at least partially, to meet some of the specific challenges raised by this new technology. In other instances, it will be necessary to clarify the extent to which existing laws do apply to generative AI. These adaptations and tweaks may happen either when existing rules are applied by courts or authorities in specific cases or in a preventative way by lawmakers amending existing laws.
A parallel could be drawn in this respect with how tax and labour laws have been applied to platforms allowing peer-to-peer economic exchanges, such as Airbnb and Uber. In matters of taxation, it was unclear whether existing laws would apply to peer-to-peer, short-term rentals made through Airbnb, especially concerning the tourist taxes that, in major cities around the world, local administrations impose on tourists and that are usually collected by hotels and operators of other traditional forms of tourist accommodation. Such taxes, along with many other aspects of the economic activity enabled by Airbnb, have since been regulated in major tourist cities, after those cities experienced the negative consequences of the rise in this type of rental,Footnote 70 in particular with respect to the payment of tourism taxes by hosts.Footnote 71 To achieve this shift in tax rules, different regulatory techniques have been deployed: some cities have collaborated with the platforms and drafted common guidelines (eg Barcelona),Footnote 72 and others have introduced legally binding rules (eg Tokyo).Footnote 73
Similarly, Uber did not consider drivers using its platform to be employees but rather self-employed. Consequently, Uber did not comply with the obligations of an employer, such as providing paid holidays and proper rest breaks. It was through litigation that, under certain conditions, Uber drivers came to be recognised as employees.Footnote 74
More generally, what we are currently witnessing in the field of chatbots and generative AI is a transitional phase, during which these clarifications and adaptations are happening gradually, as society and the regulators realise the challenges linked to the deployment of the technology, similar to other fields that have been characterised by innovation.Footnote 75 During this phase, it is important for policymakers to clearly state that chatbots do not benefit from exemptions and that existing laws apply insofar as relevant, including data protection laws and fundamental rights. In this respect, the recent Chinese rules for the regulation of generative AI are interesting, as they clarify that any such product needs to respect all existing laws, in addition to the few specific rules introduced by such measures.Footnote 76 In addition, those who act as legal advisors of operators and users of chatbots and other generative AI tools should be mindful of the possible legal risks, in particular with respect to the possible application of laws already in force to the uses of this new technology.
Once the existing laws that constrain AI have been surveyed, there will remain other important and truly novel questions that are not well apprehended by existing laws. In those respects, the issue will indeed be whether new rules are needed and what form such rules should take: top-down regulation, judge-made law or various forms of soft law and collaborative rules. One field that seems ripe for profound modification is copyright, both regarding the use of copyrighted works in the training of the models and regarding the protection of work generated by authors and artists with the support of generative AI tools. These issues seem truly novel, in the sense that the current legal framework seems unable to correctly apprehend them. It therefore seems that new ways of rewarding authors for allowing the training of generative AI on their works, along with opening up the possibility of protecting creative work that uses generative AI, should be considered as possible developments of the legal system.
2. The draft European Union Artificial Intelligence Act
At the time of writing, the EU’s lawmakers are discussing an ambitious, comprehensive regulation on AI.Footnote 77 The Commission’s proposal was published in the spring of 2021, with the desire to position the EU as the world regulator of AI, “winning” the global regulatory race.Footnote 78 However, given the many issues raised by AI, especially with respect to fundamental rights, intensive discussions have taken place in the EP during the legislative procedure. The rise of chatbots at the end of 2022 further complicated the legislative process and prompted the EP to add amendments specifically targeting foundation models. The EP’s amendments are discussed in this section, although the latest discussions and available information seem to confirm that most of these amendments will not be included in the final text of the AI Act.Footnote 79
At the outset, it is interesting to note that the EP has embraced the transitional nature of legal rules on foundation models adopted at this early stage of the commercial deployment of the technology. Recital 60(h) of the EU AI Act – Parliament Amendments states that, since foundation models are a “new and fast-evolving” AI application, the Commission and other specialised EU bodies should “periodically assess the legislative and governance framework of such models”. Secondly, the added recitals also show that, in proposing specific rules for foundation models, the EP was moved by two main concerns. On the one hand, foundation models are instrumental to many different products (“downstream applications and systems”),Footnote 80 and therefore their correct deployment is necessary to avoid a negative “domino effect” on such products. On the other hand, the EP has expressed the willingness to protect providers of AI products that rely on a foundation model that they did not develop, seeking to ensure that such providers receive from the developers of the foundation model all of the information and support necessary to ensure the compliance of their downstream applications with the future AI Act.Footnote 81 Finally, it appears that many of these concerns, which are expressed in detail in the recitals, did not necessarily find a specific corresponding rule in the amendments to the text of the regulation itself. For example, the reference to foundation models provided through an APIFootnote 82 is only found in the recitals.Footnote 83
The main provision proposed by the EP on foundation models is Article 28b, complemented by the new Annex VII C, which addresses some of the risks of chatbots highlighted in Section III.
Firstly, Article 28b incorporates some principles regarding the data used in training. Article 28b(e) requires the providers of foundation models to “process and incorporate only datasets that are subject to appropriate data governance measures”. The text of the proposal seems to leave it open to providers to determine the exact type of data governance measures for foundation models. It provides only one example of such governance measures, namely “measures to examine the suitability of the data sources and possible biases and appropriate mitigation”. In addition, Annex VII C also requires a “description of the data sources used in the development of the foundational model”.Footnote 84 These proposals therefore bring together a requirement regarding the quality of the data (“data governance measures”) and another regarding the transparency of the datasets used in training. These two aspects together sketch a system whereby developers and providers of foundation models are required to develop and follow their own internal procedures to ensure the quality of the data, whereas the open character of the data sources should allow public scrutiny on the part of regulatory bodies, independent researchers, civil society and the press. Although this system seems to meet some of the requirements for keeping foundation models open,Footnote 85 recent research has shown that the currently available commercial chatbots are far from actually meeting these requirements.Footnote 86 In addition, the EU AI Act is still in the process of being negotiated, even though commercial chatbots have been deployed for almost a year. In the foreseeable future, this lack of regulation and accountability will continue, at least in the EU, further exacerbating the opacity of the training data and, arguably, the discriminatory or biased character of the commonly used datasets.Footnote 87
The EU AI Act – Parliament Amendments also requires the providers of foundation models used for generative AI, such as LLMs, to “document and make publicly available a sufficiently detailed summary of the use of training data protected under copyright law”.Footnote 88 This provision needs to be read in conjunction with Article 4 of the Copyright Directive,Footnote 89 according to which training and the retention of data for training are allowed unless the holder of a right over content has expressly, including in a machine-readable format, excluded such use of their work. It seems, therefore, that the EU is consolidating its approach of allowing the use of copyrighted materials in the training of foundation models, subject only to transparency and to the condition that the copyright holder has not opposed such use. This is in contrast with the present situation in the USA, where the issue is currently unsettled under copyright law and is the object of extensive litigation.Footnote 90 Although this approach allows machine learning and foundation models – which rely on large amounts of data – to exist and therefore may favour innovation, it also disregards the moral and economic rights of creators, as recently highlighted by authors and representatives of the creative industries.Footnote 91 And even if copyright may not ultimately be the panacea through which such moral and economic rights are guaranteed to individual creators,Footnote 92 it seems necessary to reflect upon and devise legal solutions regarding how human creativity and content production can be preserved in the face of the rise of chatbots, most of which are profitable products of international commercial conglomerates.
Another issue that is addressed by the EU AI Act – Parliament Amendments is the labelling of machine-generated content. Nonetheless, the current text seems to require disclosure in the form of clear labelling or watermarking only for generated content that qualifies as a “deep fake”,Footnote 93 defined as “content that would falsely appear to be authentic or truthful, and which features depictions of persons appearing to say or do things they did not say or do”.Footnote 94 At this stage, therefore, the EU AI Act – Parliament Amendments contains no requirement to disclose that text has been machine generated in general but only insofar as it constitutes a “deep fake”.Footnote 95 This solution seems sound overall, considering that a blanket requirement to watermark text generated by AI may indeed take some of the utility out of these new tools and even put certain individuals or groups at a disadvantage. For example, using a chatbot to proofread written communications could be a way for non-native speakers to perform certain tasks or access certain services from which they would otherwise be excluded.
Concerning the risks of harm and inaccuracies, the EU AI Act – Parliament Amendments lays down a series of compliance requirements aimed at making known information about the capabilities, performance, limitations and risks of foundation models, as well as the measures taken by the provider to mitigate such risks.Footnote 96 At the current stage, these obligations only require disclosure and do not seem to require any specific level of risk-proofing or action on the part of the provider of the LLM. Ample reference is made to possible future benchmarks and industry standards,Footnote 97 which the EU lawmaker expects will also be facilitated by newly established bodies.Footnote 98 In addition, these disclosure obligations apply to the providers of foundation models, whereas deployers of products based on such foundation models are not covered. Indeed, the idea behind this choice seems to be that entities wishing to use a foundation model, such as an LLM, to create and deploy a product should be able to use the public information on the foundation model to ensure compliance with the future European regulation.Footnote 99 In reality, in the current state of the market for chatbots, this exclusion may not be very relevant, because the main commercial chatbots on the market are those launched by the same companies that developed the underlying LLMs.Footnote 100 However, in the future, should commercial products be built and deployed by an entity different from the one that developed and placed the LLM on the market, such an exclusion might create regulatory loopholes and, possibly, a lack of accountability.
Another related aspect regards the moderation of machine-generated content. Article 28b(4)(b) EU AI Act – Parliament Amendments requires providers of products such as text-generating LLMs to “train, and where applicable, design and develop the foundation model in such a way as to ensure adequate safeguards against the generation of content in breach of Union law in line with the generally-acknowledged state of the art, and without prejudice to fundamental rights, including the freedom of expression”.Footnote 101 In the current state of generative chatbots, this requirement seems to place an overwhelming burden on providers of LLMs, in particular because LLMs may provide false information owing to the way in which they produce text, which has no link to actual meaning but is merely based on a statistically probable combination of words. In this respect, could a false statement about the content of EU law be “in breach of EU law”? Similarly, would a statement that encourages discrimination be in breach of EU law because it is contrary to Article 21 of the EU Charter of Fundamental Rights? And, if so, it would fall to the provider to determine in advance the extent to which such a statement could be considered “not in breach of EU law” under the EU’s understanding of freedom of expression. Although the concern underlying this provision is understandable, the future compromise text should adopt a wording that empowers providers rather than one that can only elicit further doubts and, possibly, litigation. Although not a silver bullet, a step in this direction could be to refer only to the EU Charter of Fundamental Rights as the benchmark of legality rather than to the whole of EU law.
Finally, with respect to environmental concerns and, more broadly, to the computation utilised in the development of foundation models, the EU AI Act – Parliament Amendments requires providers of foundation models to disclose the model size, computing power and training time used.Footnote 102 It also requires providers to disclose the energy consumption of the model and to take steps to make the training of foundation models more sustainable.Footnote 103 These requirements with respect to energy consumption are not only applicable to foundation models: the EP has added energy-saving requirements for AI systems in general.Footnote 104 This is an important act of leadership by the European legislator, because providers and developers of foundation models will have to adapt to the stricter European standards in order to access the EU market and will probably not differentiate their products for other markets, given the related costs. In this way, the EU legislation might foster a virtuous cycle towards more sustainable AI.
V. Conclusion
At the dawn of the commercialisation of chatbots leveraging LLM technology, and in view of their potentialities, the legal system is called to respond swiftly to the risks that they pose to individuals and society. New technologies are bringing about a new cognitive revolutionFootnote 105 that will prompt humans to adapt to the new methods of information processing and communication introduced by AI-based technologies such as LLMs.
The role of law in this scenario is crucial. Technological inventions are not neutral, nor are they good per se. On the contrary, any new system embeds values, whether we like such an idea or not.Footnote 106 Accordingly, lawyers and policymakers should take a hard look at the potentialities and risks of chatbots leveraging LLMs and create a regulatory and legal framework that is able to steer this technology towards the common good and a future in which humans are empowered rather than overwhelmed by it.
Competing interests
The author declares none.