Authors Get Mixed Results with Initial Skirmish in OpenAI Lawsuit

Delve into the complexities of vicarious infringement and DMCA violations in AI training. Scott Hervey and James Kachmar from Weintraub Tobin dissect the recent district court ruling on the copyright infringement allegations against OpenAI on this installment of “The Briefing.” Watch this episode on the Weintraub YouTube channel here or listen to this podcast episode here.

Show Notes:

Scott As we have previously reported, in 2023, several authors, including the comedian Sarah Silverman, filed putative class action lawsuits against OpenAI, the maker of ChatGPT, alleging various copyright infringement claims. On February 12th, 2024, a district court in the Northern District of California issued its order and ruled on the OpenAI defendants’ motion to dismiss various claims in the two pending putative class action lawsuits. I’m Scott Hervey from Weintraub Tobin, and I’m joined today by my partner, James Kachmar, and we’re going to discuss the Court’s order on this installment of “The Briefing” by Weintraub Tobin. James, welcome back to “The Briefing.”

James Thanks, Scott. It’s good to be back.

Scott So, James, could you give us some background on these cases?

James Sure, Scott. The author plaintiffs alleged that OpenAI infringed on their published works by using these works to help train its Large Language Model or LLM. Basically, OpenAI is alleged to have scanned the books into their system to help train the language models. The authors claim that because these books are protected by copyright law, using them in this training, and the output generated by OpenAI’s app, known as ChatGPT, in summarizing their books, constituted an infringement of their copyright protections in their works. The plaintiffs in the two separate lawsuits alleged similar claims against OpenAI for both direct and vicarious copyright infringement under the Copyright Act, as well as violation of Section 1202(b) of the Digital Millennium Copyright Act or DMCA, which covers removal of copyright management information. The OpenAI defendants moved to dismiss all the claims alleged by the author plaintiffs, with the exception of the first cause of action for direct copyright infringement. It’s a bit unclear from the Court’s order as to why the defendants did not move to dismiss that claim as well.

Scott Yeah, I found that to be interesting. The Court began by recognizing the general rules that govern motions to dismiss in federal actions. In essence, to survive such a motion, a plaintiff must plead enough facts to state a claim to relief that is plausible on its face. That is, the plaintiff must allege sufficient factual content that allows the Court to draw the reasonable inference that the defendant is liable for the misconduct alleged.

James That’s correct, Scott. Let’s first look at the vicarious copyright infringement claim. The Court noted that the Copyright Act grants the copyright holder exclusive rights to reproduce the copyrighted work and any copies thereof, to prepare derivative works, and distribute copies of the copyrighted work to the public. However, the Court noted that the mere fact that a work is copyrighted does not mean that every element of the work may be protected.

Scott That’s right. To allege a valid copyright infringement claim, the plaintiff must show that, one, he or she owns a valid copyright in the work alleged to be infringed, and two, that the defendant copied protectable aspects of his or her work.

James That’s right, Scott. The Court was really focused on this second prong, which really contains two separate components: copying and unlawful appropriation of a copyrighted work. Generally, a plaintiff can satisfy these elements by showing that the defendant had access to the plaintiff’s work and that the two works share similarities probative of copying, while the hallmark of unlawful appropriation is that the works share substantial similarities.

Scott The Court noted that a claim of vicarious infringement requires a threshold showing of direct infringement.

James Right. The OpenAI defendants sought to dismiss the vicarious infringement claim on the grounds that, number one, the plaintiffs did not allege that direct infringement occurred. Two, there was no allegation that the OpenAI defendants had the right and ability to supervise. Three, there was no allegation that the OpenAI defendants had a direct financial interest. In its order, the Court really focused on that first element.

Scott Ok. The author plaintiffs argued that because the defendants directly copied the copyrighted books to train the language models, they did not need to show a substantial similarity between the two works.

James That’s right. They were relying on a 2012 Ninth Circuit case, Range Road Music, Inc. v. East Coast Foods. That really involved a cover band playing songs in a venue and copying other musicians’ music that had been copyrighted. The Court here said that the plaintiffs were apparently misunderstanding the holding in Range Road because the Court there excused the plaintiffs in that case from having to show substantial similarity because it was the actual songs that were being played in the venue. The Court noted here that the author plaintiffs had not alleged that ChatGPT outputs contained direct copies of the copyrighted books. Therefore, the plaintiffs really had to allege that there was a substantial similarity between the outputs of ChatGPT and the copyrighted materials. For example, if ChatGPT was asked, “Can you read me Chapter 2 of Sarah Silverman’s book?” and did so, that may have been evidence of direct infringement, but here, it was more summarizing what the themes or meaning of the books were. The Court decided to give them leave to file an amended complaint to try to correct this and satisfy the substantial similarity element.

Scott That’d be interesting if they do, in fact, amend the complaint, and then the Court rehears it. It will probably be another motion to dismiss. If it really is about summarizing themes and concepts, there’ll be an entire argument over whether or not those in and of themselves are protectable under the Copyright Act. Let’s talk about the DMCA claim because this is an interesting one. The DMCA is part of US copyright law, and it was added in 1998. The DMCA stands for, as you said previously, the Digital Millennium Copyright Act. The DMCA was meant to address the relationship between copyright and the internet at the time. There are three main parts of the DMCA. One, establishing protections for online service providers in certain situations if their users engage in copyright infringement, including by creating the notice and takedown system, which allows copyright owners to send a notice to an online service provider about infringing material, instructing that service provider to take down that infringing material. Two, encouraging copyright owners to give greater access to their works in digital formats by providing them with legal protections against unauthorized access to their works, for example, hacking passwords or circumventing encryption technologies. And three, making it unlawful to provide false copyright management information, for example, the names of authors and copyright owners and the titles of works, or to remove or alter that type of information in certain circumstances. Now, here, the plaintiffs alleged a violation of the provisions of the DMCA dealing with copyright management information and the removal thereof. So, James, how did the Court treat this claim?

James Well, Scott, the Court recognized that one of the essential elements in stating a claim under this portion of the DMCA is alleging what CMI was removed or altered. Then, you must show the requisite mental state: that the defendant knew or had reasonable grounds to know that removing the CMI would enable, induce, facilitate, or conceal infringement.

Scott Yeah, and the plaintiffs alleged that the OpenAI defendants had, by design, removed CMI from the plaintiffs’ copyrighted books during this large language model training process. But this wasn’t enough for the Court, was it?

James No. The problem the Court found was that in the allegations in the complaint, there was nothing specific to support the claim that the CMI had been intentionally removed. In fact, in the complaint, they cited some of the summaries produced by ChatGPT, which referred to the plaintiffs by name, basically identifying the author of the work. The Court said that even if the plaintiffs could show that the OpenAI defendants had knowingly removed CMI during the training process, they had not alleged how omitting CMI in the copies used in the training gave defendants reasonable grounds to know that ChatGPT’s output would induce, enable, facilitate, or conceal infringement, especially since it was identifying the authors by name.

Scott Yeah. The plaintiffs presented another unique argument to the Court: that OpenAI’s refusal to state which books it was using to train its models would prevent ChatGPT users from knowing if any output is infringing.

James Right, and it’s an interesting claim and position, but what the Court said is there’s no legal authority out there. Plaintiffs did not cite any in opposing the motion to dismiss to support that theory of violation.

Scott Are we expecting an amended complaint here, James?

James Yes. I would assume that the plaintiffs will not give up this easily and will try to amend, especially since their first cause of action is still technically viable.

Scott Yeah. If they do file an amended complaint, you can be almost certain that OpenAI will again move to dismiss the claims, and they probably will raise some preemption issues and other state law claims. We definitely have not seen the last of this specific case or AI training cases in general. James, thanks for bringing this one to our attention.

James Thanks, Scott.

Scott Thank you for listening to this episode of “The Briefing.” We hope you enjoyed the episode. If you did, please remember to subscribe, leave us a review, and share this episode with your friends and colleagues. If you have any questions about the topics we covered today, please feel free to leave us a comment.