AI Input Data and Fair Use: A View from the U.S.
Document Type
Article
Publication Date
10-2024
Abstract
For an AI system to generate text, images, music, or computer code, it must copy vast amounts of literary, artistic, or musical works. Arguably, the massive copying of works, to enable AI systems to “learn” how to produce independent outputs of literary, artistic, musical, or audio-visual works, or of software, could shelter under the fair use defense on the ground that creating training data sufficiently repurposes the copying to count as “transformative” – at least if the outputs enabled by the inputs do not themselves infringe the source content (a highly disputed point). But one should perhaps decouple the inputs from the outputs. As to whether the copying of works into training data is a “transformative” fair use, the Supreme Court’s most recent fair use decision, Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith, suggests that the analysis may depend on whether there is a market for licensing content for training data. Markets for high-quality, reliable training data do exist or are emerging, notably in news media and scholarly publishing, and other authors and copyright owners are endeavoring to develop those markets as well. In that event, even if the outputs might not infringe particular inputs, commercial copying (at least) to create training data would be for the same purpose, and might, absent a “compelling justification” for supplanting authors’ markets, therefore fail a first-factor fair use inquiry after AWF.
This article addresses a further issue: because traditional copyright analysis treats artistic style as akin to unprotectable ideas, is the copying of works of authorship in order to generate outputs “in the style of” the copied author or artist a fair use?
Disciplines
Intellectual Property Law | Law
Recommended Citation
Jane C. Ginsburg, AI Input Data and Fair Use: A View from the U.S., 2024/2 Auteurs & Media 165 (2024).
Available at:
https://scholarship.law.columbia.edu/faculty_scholarship/4617