AI Training and Copyright Infringement: Lessons from the Ross Intelligence Case



Thomson Reuters sued Ross Intelligence for using its content to train its AI technology. Scott Hervey and Tara Sattler talk about this copyright dispute on this installment of The Briefing.

Watch this episode on the Weintraub YouTube channel here.

 

Show Notes:

Scott:
Thomson Reuters, the provider of the Westlaw legal research platform, sued Ross Intelligence for copyright infringement based on Ross’s use of Westlaw content to train Ross’s AI technology. This court ruling on a motion for summary judgment may provide guidance for similar cases in the future, and it even provides some additional guidance on the application of a post-Warhol fair use defense. We’re going to talk about this case on the next installment of The Briefing by Weintraub Tobin. These are the basic facts underlying this lawsuit. Ross is an AI legal startup. Ross hired a subcontractor to create memos with legal questions and answers. The questions were meant to be those that a lawyer would ask, and the answers were direct quotations from legal opinions. Those memos were used to train Ross’s AI tool. Thomson Reuters contends that these questions were essentially Westlaw case Headnotes. Ross denies that the Westlaw Headnotes were copied but also raises a fair use defense. As the case went forward, both sides moved for summary judgment on Ross’s fair use defense. The court denied the parties’ motions for summary judgment on Ross’s fair use defense, but there are a few points in this opinion that may shape the way future AI training cases play out.

Tara:
Before we get into the analysis of Ross’s fair use defense, the court spent a significant amount of time talking about the scope of Westlaw’s copyright. Westlaw’s copyright extends to its Headnotes and its arrangement of the Headnotes and opinions, but its copyright does not extend to the opinions themselves.

Scott:
That’s right, and the reason the court spent so much time talking about the scope of Westlaw’s copyright was that Ross challenged Westlaw’s copyright claim in the Headnotes. Ross claims that the Westlaw Headnotes follow or closely mirror the language of the judicial opinions. If a Headnote merely copies a judicial opinion, it’s not copyrightable. But if it varies more than a trivial amount, then Westlaw owns a valid copyright. The court found that this leaves a genuine factual dispute about how original the Headnotes are. If the Headnotes are mere regurgitation of parts of an opinion, this will severely impact the strength and extent of Westlaw’s copyright case and of Westlaw’s copyright as a whole, including in the Headnotes. And it also goes to whether Ross was copying the Headnotes or the opinions themselves.

Tara:
So, how do you see this applying to other copyright cases involving AI training?

Scott:
Well, this analysis is part of the extrinsic test, which is used in the determination of substantial similarity after the plaintiff has identified the specific criteria that it alleges have been copied. The court separates the unprotectable elements, such as facts or ideas, from the elements that are protectable. Then, it sorts out whether there is enough similarity between the works as to the protectable elements that a reasonable jury could find that the defendant’s work is substantially similar to the protected elements of the plaintiff’s work. This analysis is part of any copyright case, and it certainly will be part of an AI training case as well.

Tara:
Okay, so back to Ross’s fair use defense, because the court finds that Ross actually copied the Headnotes. Fair use balances four factors: the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used in relation to the copyrighted work as a whole, and the effect of the use upon the potential market for the copyrighted material. The first factor assesses whether the use is transformative, as established in the Supreme Court case of Campbell versus Acuff-Rose Music. Transformativeness occurs where the new work adds something new with a further purpose or different character, altering the first with new expression, meaning, or message.

Scott:
Right. And the Warhol decision now requires courts to ask, as part of the first factor, whether and to what extent the use at issue has a purpose or character different from the original and whether that supports a justification for copying. So now, the first fair use factor will analyze whether the purpose of the use of the second work is different enough from the first to reasonably justify copying. Under the Warhol decision, a use is not transformative just because it adds some new expression, meaning, or message. Now, the purpose of the use must be distinct enough from the purpose of the original use to justify copying.

Tara:
We have always known that the first fair use factor is extremely important, but based on recent post-Warhol cases, it seems that the fourth factor is of equal importance.

Scott:
I would agree with that assessment, Tara. I think this is because the fourth factor, effect on the market, also ties into part of the first factor, that part being commerciality.

Tara:
So Ross’s use was clearly commercial, plus its goal is to compete with Westlaw. This weighs against fair use.

Scott:
True, but in this decision, the court seems to pull away from what may seem to be an overemphasis on the weight of commerciality in Warhol. In Warhol, the Supreme Court determined that the use in question was not fair use, largely by emphasizing its commercial nature. But the judge in this case said that he declines to overread one decision, especially because the Supreme Court recognized that a use’s transformativeness may outweigh its commercial character and that in Warhol, both elements pointed in the same direction. Further supporting this court’s position is the recent case of Google versus Oracle, which arose in a technological context that is much more like this case, Thomson Reuters versus Ross. In Google versus Oracle, the court placed much more weight on transformation than on commercialism.

Tara:
Westlaw made a strong argument that Ross’s use was not transformative. Westlaw is a legal research platform that synthesizes the law. Ross used Westlaw’s synthesis to build a legal research platform that also synthesizes the law. I’m certain that Ross presented a more nuanced argument supporting transformativeness, right?

Scott:
You’re right, Tara. Yes, Ross did. So, let’s remember the court found that Ross copied the Westlaw Headnotes. Ross argued that its copying of the Headnotes is part of building a search engine that avoids human-intermediated materials, meaning a user would simply enter a query and then get a responsive quotation from a judicial opinion, with no clicking around or commentary needed. Once the plain language entries are entered into the Ross database, they are converted into numerical data. Next, Ross feeds that numerical data into its machine learning algorithm to teach the artificial intelligence about legal language. The idea is that the artificial intelligence will be able to recognize patterns in the question-answer pairs, and that those patterns can be used to find answers not just to the exact questions fed into the AI platform, but to all sorts of other legal questions a user might ask.

Tara:
This seems to follow the logic underlying the intermediate copying cases. In those cases, a user copies material to discover unprotectable information or as a minor step towards developing an entirely new product. So, the final output, despite using copied material as input, is indeed transformative. The Supreme Court has cited these intermediate cases favorably, particularly in the context of adapting the doctrine of fair use in light of rapid technological change.

Scott:
That’s right, the intermediate copying cases will have a great impact on all of the other AI training copyright cases. Ross says that its AI studied the Headnotes and opinion quotes only to analyze language patterns, not to replicate Westlaw’s expression, but to find these language patterns. That will allow Ross to develop a wholly new and competing product, a search tool that would produce highly relevant quotations from judicial opinions in response to natural language questions. The court said that if Ross’s characterization of its activities is true and accurate, then Ross’s final product would not contain any output of infringing material, and Ross’s use would be transformative intermediate copying. Now, the court is leaving it to the jury to determine if Ross’s stated intention is actually its intention.

Tara:
The jury will have to determine whether Ross’s AI studied the language patterns in the Headnotes to learn how to produce judicial opinion quotes or whether Ross used the untransformed text of Headnotes to get its AI to replicate and reproduce the creative drafting done by Westlaw’s attorney editors.

Scott:
That’s right. And I think that’s a really tall order for the jury, but they are the trier of fact. I’d like to talk just a little bit about the implications of these intermediate copying cases for the number of AI training copyright infringement cases that are out there right now. I think that this doctrine and the holdings in the intermediate copying cases may very well be an incredibly high hurdle, and maybe a hurdle that the plaintiffs may not be able to overcome, depending upon what the AI is trained to do. So, you know, there’s the AI training copyright infringement case brought by Sarah Silverman, right?

Tara:
Yeah, and a couple of other authors. And in that case, those authors are claiming infringement and copying of their work that occurs when the AI platform ChatGPT is trained using their materials.

Scott:
Right. But if the AI platform in this copyright infringement lawsuit could establish that, okay, it ingested the material, it copied the material, but it was only for the purpose of finding patterns in the writing style, such that a user could ask for the creation of an entirely new story that does not incorporate any of the creative elements of the original work but is in the style of Sarah Silverman, that might very well be acceptable intermediate copying and could very well be, at least under the finding of this case, fair use.

Tara:
Yeah, I think that’s right. And I also wonder about the new Warhol decision that focuses on transformativeness and the purpose of the use needing to be different in order to actually be transformative. That holding hadn’t come down yet when a lot of these intermediate copying cases were decided. So I wonder if that’s going to have an impact on all of these decisions as well.

Scott:
I mean, it might, but I have to come back to this decision, right, where the court in this case said, I’m not going to overemphasize the reading and the weight of commerciality. Basically, this judge kind of backpedaled from what was almost equal weight of the fourth factor with the first factor and just said, well, if it’s more transformative, that will outweigh the fact that it’s commercial in nature and the fact that it may have a significant impact on the marketplace for the first work. That’s kind of going back to how we used to analyze transformativeness, fair use, and the fourth factor, because there were a number of decisions that said, if it’s transformative, it doesn’t really matter as much if the transformative work has a negative impact on the market for the first work. Yeah, we’re going to have to keep our eye on this. Right. This case can impact a lot of elements, not only in AI training but also in fair use analysis post-Warhol. I mean, how important is the fourth factor?

Tara:
Right.

Scott:
Before this case, it seemed to be of equal importance. Now, this court is saying, well, maybe not so much in technology cases. So, we have to keep our eye on both lines from this case, right? Any progeny, any other cases that cite this case for those particular aspects. Because I think this case could have a very big impact on those two areas of the ever-changing law.

Tara:
Definitely, it’s really interesting and really timely. So it’ll be great to see how this changes and evolves and where we go from here.

Scott:
Thanks for talking about this today, Tara.

Tara:
Thanks for having me, Scott.

Scott:
Thank you for listening to this episode of The Briefing. We hope you enjoyed this episode. If you did, please remember to subscribe, leave us a review, and share this episode with your friends and colleagues. And if you have any questions about the topics we covered today, please leave us a comment.