The Briefing: Westlaw v. Ross AI – Is This The End of AI Training or The Future of AI Training

Westlaw v. Ross AI: a major AI copyright ruling. The Delaware District Court’s decision in Thomson Reuters v. Ross AI could have huge implications for AI training and copyright law. On this episode of The Briefing, Weintraub attorneys Scott Hervey and Andy Tan break down the case, its impact on the AI industry, and what it means for content creators.

Watch this episode on the Weintraub YouTube channel here.

Show Notes:

Scott:
This February, the Delaware District Court, in the case of Thomson Reuters versus Ross AI, issued a decision that will have, in all likelihood, profound ramifications for all pending AI copyright infringement cases. I’m Scott Hervey, a partner at the law firm of Weintraub Tobin, and I’m joined today by my colleague, Andy Tan. We’re going to walk through the court’s decision in Thomson Reuters versus Ross AI and discuss how this case will impact the other AI training copyright infringement cases currently pending. We’ll also talk about what this case could mean, both for the AI industry and for the creators of content, on this installment of “The Briefing.” Andy, welcome to “The Briefing.” This is your first time on “The Briefing,” so thanks for doing this.

Andy:

Thanks, Scott. It’s an honor to be part of it. I’ve been a long-time fan and watcher, so it’s great to be on now.

Scott

Well, we’re glad to have you. This case is right up your alley. You do a lot of deals in the AI space, so I thought this one would be appropriate for you to do with me.

Andy

Yeah, it’s definitely coming up, and AI is the hot topic in the legal world for the foreseeable future, I think.

Scott

Yeah, that’s for sure. Well, let’s start with the facts of the case, because this case has some interesting twists and turns. Andy, can you take us through the basic facts of the case?

Andy

Yeah, I would be happy to. I’ll run through the basic facts of the case. If you want a more in-depth discussion, you should check out the November 9, 2023, episode of The Briefing, where Scott and our colleague Tara go over it in detail. I’ll just go over the facts again. So, who are the players? Thomson Reuters owns Westlaw, one of the primary legal research tools. Ross was a legal research AI startup. I say “was” because Ross AI closed down as an operating company in 2020. They said it was due to the Thomson Reuters lawsuit, but its insurance coverage probably allowed it to continue to defend the lawsuit, so we’re not sure that’s the reason. Ross hired a subcontractor to create memos with legal questions and answers. Now, these questions were meant to be those that a lawyer would ask, and the answers were direct quotations from legal opinions. Ross used these memos to train its AI legal research tool so that when a user asks a legal question, the tool responds with relevant judicial opinions, which Thomson Reuters says is similar to what Westlaw’s headnotes do. Thomson Reuters, the provider of the Westlaw service, contended that these questions were essentially Westlaw headnotes, and the court found, as a matter of law, that Ross copied portions of the Westlaw headnotes.

Ross challenged Reuters’ copyright in the headnotes and raised a fair use defense.

Scott

That’s right. This case is particularly interesting because it features something rare in federal courts: a judge reversing his own prior summary judgment ruling. Let’s start with the procedural history because it’s unique. This case, as you said, began in 2020 when Thomson Reuters sued Ross in Delaware District Court. In 2023, Judge Bibas issued a summary judgment opinion that largely denied Thomson Reuters’ motions on copyright infringement and fair use. But then something unusual happened. As the case was heading towards a trial that was scheduled for August 2024, Judge Bibas took a closer look at the materials and had what you might call a judicial epiphany. The judge continued the trial date and invited the parties to renew their summary judgment briefing.

Andy

That’s pretty remarkable. From what we’ve seen, it’s rare for a judge to admit that they might have gotten something wrong.

Scott

That’s right. But if a judge is going to make a mistake or have second thoughts about something, there’s no better topic than the evolving world of AI. The judge actually said, “A smart man knows when he is right; a wise man knows when he is wrong. Wisdom does not always find me, so I try to embrace it when it does, even if it comes late, as it did here.” That’s pretty self-deprecating and funny for this judge. The judge does a complete reversal, so let’s dig into the legal analysis. First, there was the question of copyright validity. In its original decision, the court said that it was going to leave it to the jury to determine whether Westlaw’s headnotes and its key number system had enough originality to be protected by copyright. Initially, Judge Bibas thought that originality depended on how much the headnotes overlap with the underlying court opinions. Now, this analysis was relevant because in doing an infringement analysis, you need to separate the non-protectable elements from what is protectable, and then you analyze the protectable elements and their similarities. In the recent opinion, the judge said he didn’t think this was the right approach.

The key insight was that even if a headnote quotes from an opinion verbatim, the very act of selecting which portion to excerpt involves creative judgment. The judge drew from the seminal Supreme Court case of Feist. That’s the telephone book case that we all learn about in law school, which held that factual compilations are original works of authorship, protectable under copyright, if the compiler makes choices as to selection and arrangement using just a minimal degree of creativity. Based on that, the court found that the headnotes and key number system were original enough to be protected by copyright.

Andy

That’s right. As for the infringement aspect, the court handled that quickly. The court noted that it slogged through all 2,830 headnotes and granted summary judgment on actual copying for 2,243 of them. It made this determination only where the copying was so obvious that no reasonable jury could find otherwise.

Scott

That’s right. The court found infringement even after acknowledging that Westlaw had a higher burden of similarity to meet because the headnotes contain less protectable expression.

Andy

Let’s talk about fair use because this is the part that could have huge implications for other AI infringement cases. Let’s break down the four fair use factors. For factor one, the purpose and character of the use, the court found Ross’s use was commercial and not transformative, even though the headnotes didn’t appear in the final product.

Scott

Right. Let’s dig into the court’s finding that Ross’s use was not transformative because, essentially, transformative intermediate copying was the basis for the court’s fair use reasoning in 2023. Ross argued that its copying of the headnotes was part of building a search engine that avoids human-intermediated materials. Ross said that its AI studied the headnotes and opinion quotes only to find language patterns that would allow Ross to develop a search tool that would produce highly relevant quotations from judicial opinions in response to natural language questions, and not to replicate Westlaw’s expression.

Andy

The court’s 2023 finding relied heavily on cases like Google v. Oracle and Sony v. Connectix. In those cases, the courts found that copying computer code as an intermediate step was fair use. But now, Judge Bibas found those cases inapplicable for two reasons.

Scott

Right. That’s right, Andy. First, the judge said that those cases dealt specifically with computer code, which courts tend to treat differently from other copyrighted works because of its functional nature. Second, in those cases, the copying was necessary to innovate and achieve interoperability. You had to copy the code to make the programs work together. The court found that this wasn’t true here. The court said that Ross didn’t need to copy Westlaw’s headnotes to create a legal research tool. Ross could have created its own summaries of court opinions. The court said that Ross’s use was not transformative because it didn’t have a further purpose or different character from Thomson Reuters’ use. Further, the court said that the intermediate copying cases, the computer cases, were not applicable here. In those cases, the underlying unprotectable ideas of the computer code could only be reached by copying their expression. The court found that was just not the case here. The court found that the copying done by Ross was not reasonably necessary to achieve a new purpose.

Andy

With no transformative use, we move to the next two fair use factors, the nature of the copyrighted work and the amount used, and both of those factors favored Ross. However, the fourth factor, the effect that Ross’s use could have on the potential market for or value of the original work, seemed to be the primary factor for this court. The court found that Ross’s product would compete directly with Westlaw in the legal research market. Plus, Ross’s use could harm a potential market for AI training data that Thomson Reuters might want to develop.

Scott

Right. I mean, we saw market harm playing a much bigger role than we had seen before in the Supreme Court’s decision in the Warhol case. I think normally, in the past, when doing a fair use analysis, we didn’t really focus so much on market harm, but I think going forward, in light of this case and also just in light of the Supreme Court’s decision in Warhol, market harm is going to play a much bigger role. All right, so let’s talk about the big question: How might this case impact the other pending AI training copyright infringement cases? So, Judge Bibas specifically limited his ruling to non-generative AI. He went out of his way to say that this case is not about generative AI. But several aspects of his analysis could be highly relevant to cases involving generative AI, like the New York Times case or the ongoing cases against Anthropic and Meta.

Andy

Right. While the judge did seem to want to limit the applicability of this case, it could still have implications for other pending cases.

Scott

That’s right. First, there’s the court’s analysis of intermediate copying. The judge rejected Ross’s argument that using copyrighted material as training data should be protected because the material doesn’t appear in the final product. This analysis is crucial because many generative AI companies make the same argument: that their models transform the training data so completely that the original works aren’t recognizable in the output. The judge’s analysis in this case could be cited as persuasive authority in the pending generative AI cases.

Andy

Judge Bibas focused on whether the copying was necessary to achieve the defendant’s goals, drawing from cases like Google v. Oracle. He found that unlike in Google, where copying was technologically necessary for interoperability, here, Ross could have created its own legal summaries instead of using Westlaw’s headnotes.

Scott

Right. This could be problematic for the generative AI companies in the pending lawsuits because, theoretically, they could have created their own training data or they could have licensed training data. The fact that it would have been more expensive or time-consuming is not a legal justification for copying.

Andy

The market harm analysis could be potentially influential in other cases, though, to your point. The court here found that even though Ross’s final product was different from Westlaw’s, they were still competing in the same market, which is legal research. For example, in the New York Times case, the publishers might argue that even though ChatGPT produces different outputs than newspaper articles, it still competes in the market for information and news.

Scott

Right. However, there are some important distinctions. Generative AI creates new content, while Ross really just helps users find existing court opinions. That might affect both the transformative use and the market harm analysis.

Andy

It sounds like our conclusion is Judge Bibas explicitly limited his ruling to non-generative AI, but his analytical framework could significantly influence how courts approach generative AI cases.

Scott

Exactly. I think it will. The specifics might be different, but the fundamental questions about fair use, transformative purpose, and market impact are going to be similar.

Andy

Yeah, it’ll be fascinating to see how other courts handle these same issues. Will they adopt Judge Bibas’s focus on necessity and market competition, or will they create new frameworks specifically for generative AI?

Scott

Right. Let’s discuss the broader implications of this case, assuming that the framework Judge Bibas sets up is adopted by the other courts hearing the training data infringement cases. Let’s consider how this case could impact the market and both AI companies and content creators.

Andy

Yeah, for the AI companies, it probably means more licensing. They could also invest in the creation of original training data. Either way, this could mean potential increases in the cost of data training.

Scott

I agree. But I think a clean data set is going to be necessary if AI companies want to secure enterprise licenses, or licenses with companies, and not just consumer-facing products. No company is going to enter into a license with an AI company that doesn’t, at a minimum, provide indemnity from infringement claims.

Andy

Also, as a result, you could see AI companies shifting to models that either are broad and rely more on public domain content or have a narrower focus in terms of function. Personally, I think the narrower focus is going to result in a focus on business partnerships rather than general consumer applications.

Scott

I agree. In software licensing, the money is in business licenses, as we know. This could also be extremely beneficial to the content industry. The content industry has been dying a slow, painful death, and this actually could be a turnaround. You could see new opportunities to create AI training-specific content. Those opportunities do already exist. I already know of a couple of companies in this space that are AI training content brokers, matching content producers with AI companies that need specific types of video content to train AI. But it’s not as big or as mainstream as it could be. There can be a real marketplace for custom data set creation and cleaned, annotated data collection. A training-data-as-a-service model is what I see as possible.

Andy

Yeah, and this case could be the leverage that major content owners have been looking for. The larger the content owner’s library, the greater that potential leverage.

Scott

Interestingly, this could start the rise of content cartels or content collectives, or maybe more M&A in the content space, driven by traditional content companies seeking greater leverage in content licensing. You could also see acquisition and vertical integration of large content companies by AI companies, giving them some type of exclusivity in a particular content space.

Andy

Definitely. The business and legal ramifications of what we’ve been talking about are probably going to reshape how all of society thinks of data ownership and copyright.

Scott

Right. Maybe this could be the lifeline for traditional media as we know it today. Who knows? Because one thing that we don’t really think a lot about is that the need for training AI is not static; it’s constant. It constantly needs to be updated. This might result in a continuous revenue flow for traditional media content creators, like news organizations, that they just haven’t seen for quite some time.

Andy

Definitely agree.

Scott

Andy, thanks for joining me today. I hope you enjoyed it, and we look forward to having you on in the future.

Andy

Absolutely. Thanks a lot, Scott.

Scott:
That’s all for today’s episode of “The Briefing.” Thanks to Andy for joining me today. Thank you, the listener or viewer, for tuning in. We hope you found this episode informative and enjoyable. If you did, please remember to subscribe and leave us a review, and also share this episode with your friends and colleagues. As always, if you have any questions about the topics we covered today, please leave us a comment.