New ChatGPT Lawsuits May Be Start of AI’s Legal Sh-tstorm
The burgeoning AI industry has just crossed another major milestone, with two new class-action lawsuits calling into question whether this technology violates privacy rights, scrapes intellectual property without consent and negatively affects the public at large. Experts believe they’re likely to be the first in a wave of legal challenges to companies working on such products.
Both suits were filed on Wednesday and target OpenAI, a research lab consisting of both a nonprofit arm and a corporation, over its ChatGPT software, a “large language model” capable of generating human-like responses to text input. One, filed by Clarkson, a public interest law firm, is wide-ranging and invokes the potentially “existential” threat of AI itself. The other, filed by the Joseph Saveri Law Firm and attorney Matthew Butterick, is focused on two established authors, Paul Tremblay and Mona Awad, who claim that their books were among those ChatGPT was trained on — a violation of copyright, according to the complaint. (Saveri and Butterick are separately pursuing legal action against OpenAI, GitHub and Microsoft over GitHub Copilot, an AI-based coding product that they argue “appears to profit from the work of open-source programmers by violating the conditions of their open-source licenses.”)
OpenAI did not return a request for comment on the various suits.
Clarkson’s lengthy filing, which includes more than a dozen anonymous plaintiffs, opens by quoting the late theoretical physicist Stephen Hawking as saying that “the rise of powerful AI will be either the best or worst thing ever to happen to humanity.” Warning of the potential for “civilizational collapse” if tech companies do not heed the risks of artificial intelligence, it argues that OpenAI’s publicly available tools, including not just various iterations of ChatGPT but the image-creating product DALL-E, have stolen private and in some cases personally identifying information from “millions of internet users,” children among them, “without their informed consent or knowledge.” The suit requests a temporary freeze on commercial access and development of OpenAI’s products until a host of safeguards have been enacted.
Saveri and Butterick’s latest suit goes after OpenAI for direct copyright infringement as well as violations of the Digital Millennium Copyright Act (DMCA). Tremblay (who wrote the novel The Cabin at the End of the World) and Awad (author of 13 Ways of Looking at a Fat Girl and Bunny) are the representatives of a proposed class of plaintiffs who would seek damages as well as injunctive relief in the form of changes to ChatGPT. The filing includes ChatGPT’s detailed responses to user questions about the plots of Tremblay’s and Awad’s books — evidence, the attorneys argue, that OpenAI is unduly profiting from infringed materials, which were scraped by the chatbot.
While the suits venture into uncharted legal territory, they were more or less inevitable, according to those who research AI tech and privacy or practice law around those issues.
“[AI companies] should have and likely did expect these types of challenges,” says Ben Winters, senior counsel at the Electronic Privacy Information Center and head of the organization’s AI and Human Rights Project. He points out that OpenAI CEO Sam Altman mentioned a few prior “frivolous” suits against the company during his congressional testimony on artificial intelligence in May. “Whenever you create a tool that implicates so much personal data and can be used so widely for such harmful and otherwise personal purposes, I would be shocked there is not anticipated legal fire,” Winters says. “Particularly since they allow this sort of unfettered access for third parties to integrate their systems, they end up getting more personal information and more live information that is less publicly available, like keystrokes and browser activity, in ways the consumer could not at all anticipate.”
Ari Lightman, a professor of digital media who teaches on emerging technologies at Carnegie Mellon University’s Heinz College, says another major reason we’re seeing legal action against OpenAI is that it pivoted from nonprofit status to become a massively valuable for-profit corporation in 2019. (Indeed, the Clarkson suit argues that the alleged harms to its plaintiffs are a direct result of this transition.) “As soon as you say, ‘We’re valued at $30 billion,’ which all your PR people are going to do, people are going to look at you, saying, ‘Okay, you have a target on your back now,'” Lightman says. “It’s going to be a field day for lawyers, both on the plaintiff and the defendant side,” he adds, predicting that legal teams will “throw the kitchen sink” at the issue while duking it out to establish precedents in a currently nebulous area of the law.
“We think it’s really important to stake out what the rights are here,” Saveri tells Rolling Stone. “And clearly, the companies didn’t ask for permission — you know, they’re not even asking for forgiveness. I think to some degree, they never thought folks like us would say, ‘Wait a minute.'”
In a statement referring to the class action as a “landmark” federal case, Ryan Clarkson, managing partner of the Clarkson firm, made it clear that the filing is a warning about AI in general. “OpenAI and Microsoft admit they do not fully understand the technology at the center of the arms race they’ve ignited,” he said. “They’ve released it into the world anyway, and it’s rapidly entangling itself with every aspect of our lives.” (Microsoft has invested billions in OpenAI, which also has millions in financial backing from one of its co-founders, Elon Musk.)
Mehtab Khan, a resident fellow at Yale Law School and the lead for the Yale/Wikimedia Initiative on Intermediaries and Information, says it’s unclear whether companies behind large language models are adequately prepared for the scope of the legal headaches that lie ahead. “I think the general practice of developing LLMs has largely ignored the copyright implications and not yet developed robust measures to take user rights and copyright concerns into account,” she says. Still, she has her doubts about the Clarkson suit, which she says “seems to bunch together” many distinct and topical issues into a single complaint. She expects more copyright suits, and others involving the Computer Fraud and Abuse Act, which “prohibits intentionally accessing a computer without authorization or in excess of authorization.” Then, she says, there will probably be questions of OpenAI’s immunity under Section 230 (which shields internet service providers from liability over what users say or publish on their platforms) as well as allegations of defamation.
As for the copyright suit brought by Saveri and Butterick, Khan regards the claim that the plaintiffs’ books were included in ChatGPT’s source material as “tenuous,” saying it’s “an incomplete picture of how datasets are developed and trained.” She acknowledges that OpenAI has revealed very little about these datasets, but says the authors will have to prove both that their writing was infringed and “substantial similarity between their works and the output generated by the chatbot.”
Noah Downs, an intellectual property lawyer and partner at Premack Rogers, says both suits “appear fairly damning,” but gives Clarkson’s better odds of succeeding than Saveri’s and Butterick’s, “primarily because privacy law and regulation is both a hot topic and highly regulated, with fairly objective measures for breach.” The copyright complaint, meanwhile, is “more of a toss-up” and “theoretical argument.” He and Khan both anticipate that OpenAI will opt for a “classic fair use defense” in that matter, which Downs explains is “typically claimed when a defendant has used copyright protected material without permission, but for the purpose of commenting on, criticizing, or educating others regarding the underlying works.” Regardless of the varying merits of the two arguments, he notes, “there are enough claims in each case that I anticipate OpenAI will not be able to overcome them all and will end up liable under at least one count in each.”
On the privacy side, Winters says big tech companies have skirted responsibility for invasive data practices due to a lack of strong privacy protections in the United States. “They are likely to use First Amendment protections around scraping and use of personal data,” along with Section 230, to deny liability, he says. Lightman agrees that U.S. citizens are more exposed on this front, having nothing like the strong General Data Protection Regulation in the European Union, and OpenAI can contend that it’s only using publicly available online information.
“But for a company to do this and provide access to this information in a way that I did not consent to — [plaintiffs’ attorneys are] going to argue that that violates privacy laws, privacy infringement,” Lightman says. He compares it to what we saw with social media giants like Facebook. “In terms of how they access data and utilize surveillance capitalism, it’s the exact same thing all over again,” he observes, “where innovation is happening so fast, it’s not taking a break — which a lot of people are calling for.”
While there’s no shortage of angles at which to attack OpenAI, Butterick — a designer and programmer in Hollywood who only came back to his legal practice thanks to the problems posed by AI — tells Rolling Stone that defending the property of creatives is a valuable approach. “When you start talking about people putting in data to a website, and privacy, there’s issues of terms of use,” he says. “Legally, the book authors don’t have an agreement with OpenAI. We know that they put their books in the world, they publish them, OpenAI found them somewhere, brought them in and trained on them. It’s not legal. I would say, legally speaking, it’s a nice, clean set of facts.”
He and Saveri are hesitant to predict an avalanche of suits against AI companies, though they sense a shift in the national mood, with Butterick mentioning how the specter of this technology has galvanized a strike movement in the entertainment industry. Saveri laughs when reminded that OpenAI’s CEO dismissed their previous GitHub complaint — even though a federal district judge had decided it was sufficient to proceed — as groundless. “It’s a little churlish, right? A little bit, you know, snide,” he says. “There’s a little disconnect.”
“It feels like a lot has changed” since their first OpenAI lawsuit, Saveri says, though it was barely half a year ago. “We were really in uncharted territory.” Since then, he notes, there’s been far more public discussion, those U.S. congressional hearings, and increased momentum to pass AI regulation in Europe. “And as people think about that, and are concerned about it, they have found their way to our threshold.”