April 16, 2024 // Did someone forward you this newsletter? Sign up to receive your own copy here.
Who owns AI-generated content?
On a segment of Comedy Central’s “The Daily Show” last year, the comedian Sarah Silverman quipped: “It might be a good thing that ChatGPT could be a lawyer” because a lot of people are taking it to court.
She is one of them.
Last summer, Silverman filed a lawsuit against OpenAI and Meta, alleging that the companies stole her work by using her memoir in their AI models' training data.
Silverman is one of many artists, writers, and musicians concerned about how tech companies use copyrighted work to train their AI algorithms. This week, we’re exploring the battle over AI and copyrighted material, and what it looks like to build a web where creators have more power and better compensation for their work.
(As a bonus, read to the bottom of this newsletter for a musical surprise. Sound on 🔊)
// The latest news
It’s been a big month in the ongoing conflict between creatives and tech companies over whether AI models, like those behind OpenAI’s ChatGPT chatbot, can legally be trained on copyrighted material.
On April 1st, over 200 artists, including Billie Eilish, Nicki Minaj, and Stevie Wonder, published an open letter calling on AI developers to stop devaluing music and to respect artists’ rights.
Last week, US Representative Adam Schiff (D-CA) introduced a bill that would require tech companies to disclose any copyrighted materials used to train their AI models.
// The legal tidal wave
The open letter and proposed national legislation are only the most recent ripples in a growing tidal wave of legal action:
Earlier this year, a judge dismissed some of the claims in Silverman’s lawsuit against OpenAI, but her core claim, that OpenAI directly infringed her copyright, still stands (the case is pending).
Last year, Getty Images, a stock photography company, filed a lawsuit against Stability AI, claiming the company illegally copied millions of Getty’s copyright-protected images as part of its AI training data.
Last month, in a proposed class-action lawsuit, book authors sued the AI chipmaker Nvidia, alleging that it copied and used their books without consent to train its NeMo AI platform.
Last year, The New York Times sued OpenAI and Microsoft over copyright infringement, becoming the first major American media organization to sue both companies.
// Copyright infringement or fair use?
The fundamental question is whether an AI model is legally allowed to train on creators’ copyrighted intellectual property.
Lawsuits from Silverman, The New York Times, and others allege that it is illegal for an AI model to train on copyrighted work without permission from, or compensation to, the artists.
Tech companies counter that their use of copyrighted materials falls under fair use, a legal doctrine that permits the unlicensed use of copyrighted material when it is transformed into something new.
Google successfully used fair use to defend Google Books, its project to scan and digitize books, in a lawsuit filed by authors in 2005; courts ultimately ruled that the searchable snippets it showed were fundamentally a different service from the books themselves.
Tech companies will likely cite the Google Books precedent to argue that AI models transform copyrighted material into outputs different enough not to infringe on the original works.
OpenAI has said that it would be “impossible” to create tools like ChatGPT without copyrighted material.
// New solutions for uncharted territory
Beyond regulating AI companies and litigating copyright questions in the courts, the ethical and legal quandaries around AI and intellectual property are inspiring creative solutions:
New compensation models: Getty Images has partnered with Nvidia to test a compensation model in which creators whose work an AI model trains on receive a share of revenues from that service. If an artist's work represents 1% of a model’s training data, the artist would receive a 1% revenue share, with additional compensation if their work is popular and licensed more frequently (a minimal sketch of this split follows this list).
Invisible “poison”: Nightshade is a tool for creators who don’t want their images ingested by AI models. It subtly alters images at the pixel level, making changes that are indistinguishable to the human eye but that confuse AI models and render the works unusable for training. For example, a “poisoned” image of a dog would be interpreted by the model as an image of a cat (a toy sketch of this idea also follows below). For more, check out this paper by its creators.
New tools for copyright holders: A growing suite of tools serves copyright holders. The company Spawning AI offers a “Have I Been Trained?” tool for creatives to check whether their work has been used to train AI models, and a Do Not Train Registry where creatives can add their domains and images (so far, Stability AI and Hugging Face have both agreed not to use data on the registry).
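To make the proportional compensation idea concrete, here is a minimal Python sketch. The function, creator names, and revenue figure are all hypothetical; the actual Getty/Nvidia model is not public and also weighs how often a work is licensed.

```python
# Minimal sketch of proportional training-data compensation.
# Hypothetical names and numbers; the real Getty/Nvidia model is not public.
def revenue_shares(contributions: dict[str, float], revenue: float) -> dict[str, float]:
    """Split revenue in proportion to each creator's share of the training data."""
    total = sum(contributions.values())
    return {creator: revenue * amount / total
            for creator, amount in contributions.items()}

# An artist contributing 1% of the training data receives 1% of the revenue.
shares = revenue_shares({"artist_a": 1.0, "everyone_else": 99.0}, revenue=1_000_000)
print(shares)  # {'artist_a': 10000.0, 'everyone_else': 990000.0}
```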
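And here is a toy sketch of the optimization shape behind pixel-level poisoning. Nightshade itself targets the feature space of text-to-image diffusion models (see the paper linked above); the random linear "feature extractor" below is a stand-in invented for illustration, so this shows the general technique, not the real tool.

```python
# Toy sketch of feature-space poisoning: within an imperceptible per-pixel
# budget, nudge a "dog" image so a model's features see it as a "cat".
# The random linear feature extractor is an illustrative stand-in only.
import numpy as np

rng = np.random.default_rng(0)
D, F = 64 * 64 * 3, 128                    # flattened image size, feature size
W = rng.normal(size=(F, D)) / np.sqrt(D)   # stand-in feature extractor

def features(x):
    return W @ x                           # "what the model sees"

dog = rng.uniform(0, 1, D)                 # the creator's image
cat = rng.uniform(0, 1, D)                 # an image of the target concept

eps = 0.05                                 # max per-pixel change: tiny vs. the 0-1 range
delta = np.zeros(D)

# Projected gradient descent: pull features(dog + delta) toward features(cat)
# while clipping every pixel change to the +/- eps budget.
for _ in range(300):
    grad = W.T @ (features(dog + delta) - features(cat))
    delta = np.clip(delta - 0.1 * grad, -eps, eps)

poisoned = np.clip(dog + delta, 0, 1)
print("max pixel change:", np.abs(poisoned - dog).max())   # stays <= eps
print("feature gap before:", np.linalg.norm(features(dog) - features(cat)))
print("feature gap after: ", np.linalg.norm(features(poisoned) - features(cat)))
```

The printout shows the feature gap to the target concept shrinking while no pixel moves more than the budget; in the real tool, the perturbation is crafted against a diffusion model's encoder so that downstream training mislearns the concept.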
// AI-generated music
Not long ago, the outputs AI could generate from the creative work it ingested still felt like poor substitutes for the real thing. But as the technology improves, AI-generated text, images, and music are becoming harder to distinguish from human-made content.
Users of Suno, an AI tool that generates songs from text prompts, have raised concerns that many of its songs closely mimic well-known pieces of music.
We fed the following prompt into Suno, seeking a song that captures the crux of this newsletter:
A ballad lamenting the unanswerability of the legal and ethical questions surrounding AI algorithms and copyright law.
Which real-life musician does this sound like? Who has a claim on copyright infringement? Reply to this email to weigh in.
Project Liberty in the news
// 🌟 Project Liberty team members visited Bangalore, India, and engaged with dozens of leaders and organizations, including the Foundation for Interoperability in Digital Economy (FIDE) and Juspay, highlighting synergies around DSNP and blockchain technologies. Here's to the next steps in this exciting journey towards achieving digital sovereignty and empowering communities worldwide.
Other notable headlines
// 🚺 An article in The New York Times reported on an epidemic of deepfake nudes in schools. Using artificial intelligence, students have fabricated and shared explicit images of female classmates.
// 👩‍⚖️ According to an article in The Washington Post, OpenAI has hired more than two dozen in-house lawyers and adopted a new Washington playbook as its legal troubles mount.
// 🤔 An article in The Economist asked, when technology has solved humanity’s deepest problems, what is left to do?
// 💾 An article in The Atlantic revisited BlackPlanet, the homepage of the Black internet when social media was still fun.
// 🖥 With the Affordable Connectivity Program, which subsidizes internet access for low-income households, running out of money in two weeks, an article in The Verge argued that we need a permanent solution to universal broadband access.
// 🚸 An article in Tech Policy Press explored how the American Privacy Rights Act, a significant step towards a comprehensive US data protection policy, protects children.
// 📞 The dumbphone boom is real, according to an article in The New Yorker. A burgeoning cottage industry caters to beleaguered smartphone users desperate to escape their screens.
Partner news & opportunities
// National AI Literacy Day
April 19th
Common Sense Media is a founding organization of National AI Literacy Day, a nationwide day of action, inviting students, parents, educators, and other community members to explore AI. Learn more here.
// Virtual seminar on open-source journalism & war crimes
April 20th at 8am ET
Bellingcat will be featured in a seminar at the International Journalism Festival on how open-source reporting techniques have aided coverage of human rights violations in active conflict zones. Learn more here and watch virtually.
// All Tech is Human Future of Trust and Safety Gathering