
By: Jacob Alhadeff
In the early 2000s, courts confronted the emerging technology of peer-to-peer “file-sharing,” found it massively infringing, and effectively ended its use. The Ninth Circuit, the Seventh Circuit, and the Supreme Court held that Napster, Aimster, and Grokster, respectively, were secondarily liable for the reproductions made by their users. Each of these companies facilitated or instructed their users in sharing verbatim copies of media files with millions of other people online. On this nascent internet, users could download each other’s music and movies virtually for free. By holding these companies liable for their users’ infringements, the courts functionally destroyed that form of peer-to-peer “file-sharing.” File-sharing and AI are not perfectly analogous, but multiple recent lawsuits present a similarly existential question for AI art companies. Courts should not find AI art companies massively infringing and thereby risk fundamentally undermining text-to-art AI.
Text-to-art AI, aka generative art or AI art, allows users to type in a simple phrase, such as “a happy lawyer,” and the AI will generate a nightmarish representation of this law student’s desired future.
Currently, this AI art functions only because (1) billions of original human authors throughout history have created art that has been posted online, (2) companies such as Stability AI (“Stable Diffusion”) or OpenAI (“DALL-E”) download/copy these images to train their AI, and (3) end-users prompt the AI, which then generates an image that corresponds to the input text. All three of these steps are necessary to the technology, and because of its enormous data requirements, finding either the second or third step generally infringing poses an existential threat to AI Art.
In a recent class action filed against Stability AI et al. (“Stable Diffusion”), the plaintiffs allege that Stable Diffusion directly and vicariously infringed the artists’ copyrights through both the training of the AI and the generation of derivative images, i.e., steps 2 and 3 above. Answering each of these claims requires complex legal analysis. Functionally, however, a finding of infringement on any of these counts threatens to fundamentally undermine the viability of text-to-art AI. Therefore, regardless of the legal analysis (which likely points in the same direction anyway), courts should not find Stable Diffusion liable for infringement, because doing so would contravene the constitutionally enumerated purpose of copyright: to incentivize the progress of the arts.
In general, artists have potential copyright infringement claims against AI art companies (1) for downloading their art to train the AI and (2) for any substantially similar generations that end-users prompt. In the conventional text-to-art AI context, these companies should not be found liable for infringement in either instance, because doing so would undermine the progress of the arts. However, a finding of non-infringement leaves conventional artists with unaddressed cognizable harms. Neither of these potential outcomes is ideal.
How courts answer these questions will shape how AI art and artists function in this brave new world of artistry. However, copyright infringement, the primary mode of redress that copyright protection offers, does not effectively balance the interests of the primary stakeholders. Instead of relying on the courts, Congress should create an AI Copyright Act that protects conventional artistry, ensures AI Art’s viability, and curbs its greatest harms.
Finding AI Art Infringing Would Undermine the Underlying Technology
A finding of infringement for either the underlying training or the outputs would undermine AI art for several reasons: copyright’s large statutory damages, the low bar for obtaining a copyright, the availability of retroactive registration, the length of the copyright term, and the sheer volume of images the AI both needs for training and generates.
First, copyright provides statutory damages of $750 to $30,000 per work, and up to $150,000 if the infringement is willful. Determining the precise statutory value of each infringement is likely beside the point given the massive volume of potential infringements: even at the $750 minimum, damages across billions of training images would run into the trillions of dollars. Moreover, if infringement were found, AI art companies would likely be enjoined from operating, as occurred in the “file-sharing” cases of the early 2000s.
Second, the threshold for a copyrightable work is incredibly low, so many of the billions of images in Stable Diffusion’s training data are likely copyrightable. In Feist, the Supreme Court wrote that “the requisite level of creativity [to receive copyright] is extremely low; even a slight amount will suffice. The vast majority of works make the grade quite easily.” This incredibly low bar means that each of us likely creates several copyrightable works every day.
Third, registration may be retroactive, meaning that the law does not require a plaintiff to have registered their work with the Copyright Office before the infringement in order to receive their exclusive monopoly. Therefore, an author can register their copyright after they become aware of an infringement and still have a valid claim. If these companies were found liable, then anyone with a marginally creative image in a training set would have a potentially valid claim against a generative art company.
Fourth, the copyright monopoly lasts for 70 years after the death of the author, so many of the copyrights in the training set have not lapsed. Retroactive registration combined with this extensive term of protection means that few of the training images are likely in the public domain. In other words, “virtually all datasets that will be created for ML [Machine Learning] will contain copyrighted materials.”
Finally, as discussed earlier, the two bases for infringement claims against AI art companies are (1) the copying done to train the AI and (2) the copying in the resulting end generations. The first could give rise to billions of potential claims, the second to millions more each day. Stable Diffusion was trained on approximately 5.85 billion images downloaded from the internet. Given the four characteristics of copyright described above, if infringement were found, many or all of the copyright owners of those images would have a claim against AI art companies. As for end generations, OpenAI has suggested that DALL-E produces millions of generations every day. If AI art companies were liable for infringing outputs, then any generation found substantially similar to an artist’s copyrighted original would be the basis of another claim against DALL-E, opening it up to innumerable infringement claims every day.
At the same time, generative art is highly non-deterministic: it is difficult to know what the AI will generate before it generates it. The AI’s emergent properties, combined with the subjective and fact-specific “substantial similarity” analysis of infringement, make it difficult for an AI art company to ensure that its end generations are non-infringing. More simply, from a technical perspective, it would be nearly impossible for an AI art company to guarantee that its generations do not infringe on another’s work.
Finding AI art companies liable for infringement could thus expose them to trillions of dollars in potential copyright damages, or they could simply be enjoined from operating.
An AI Copyright Act
Instead, Congress should create an AI Copyright Act. Technology forcing a reevaluation of copyright law is not new. In 1998, Congress passed the DMCA (Digital Millennium Copyright Act) to fulfill the United States’ WIPO (World Intellectual Property Organization) treaty obligations, reduce piracy, and facilitate e-commerce. While the DMCA’s overly broad application may have stifled research and free speech, it provides an example of Congress recognizing copyright’s limitations in addressing technological change and responding legislatively. What was true in 1998 is true today.
Finding infringement for a necessary aspect of text-to-art AI may fundamentally undermine the technology and run counter to the constitutionally enumerated purpose of copyright—“to promote the progress of science and useful arts.” On the other hand, finding no infringement leaves these cognizably harmed artists without remedy. Therefore, Congress should enact an AI Copyright Act that balances the interests of conventional artists, technological development, and the public. This legislation should aim to curb the greatest harms posed by text-to-art AI through a safe harbor system like that in the DMCA.