
Split Decision in the Anthropic Case

Just last month, a federal judge in California handed down a decision in a high-stakes lawsuit that could redefine how artificial intelligence companies train large language models (LLMs). The case, Bartz v. Anthropic, was closely watched across tech, publishing, and legal circles. And while some headlines framed it as a "win" for Anthropic, the truth is more nuanced.

Most significantly, the court found that Anthropic's training of its LLM Claude on lawfully acquired books qualifies as fair use. But it also ruled that Anthropic broke the law by hoarding millions of pirated books. The company now faces a jury trial, a possible class action, and potentially massive penalties.

This was no time for a victory lap. If AI companies want public trust and long-term viability, they need to build on an ethical and legal foundation. For that, clear rules are a necessity. This case is a big step in that direction.

What Was the Case About?

Three authors, Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, sued Anthropic, alleging the company infringed their copyrights by using their books to train its Claude AI models. The twist? Anthropic acquired the books in two ways. It lawfully purchased and scanned many books, but downloaded 7 million more from pirate sites like Books3 and Library Genesis.
Anthropic argued that both uses were transformative and protected under the doctrine of fair use. The authors argued that both were infringing. The court split the difference.

A Tale of Two Data Sets

Purchased Books = Fair Use

Judge William H. Alsup ruled that training an AI model on books legally acquired through purchase or license is transformative and protected by fair use. The court compared this to teaching a student how to write by reading literature: the training builds learning patterns rather than reproducing pages. One of the ruling's most memorable lines:

“Plaintiffs’ complaint is no different than if they said that teaching schoolchildren to write well would result in an explosion of competing works.”

The implication is clear: Learning by machines is no different than learning by humans. It is not infringement if the purpose is to generate something new and not to copy existing works verbatim.

Pirated Books = Infringement

The court showed no patience or leniency when it came to Anthropic’s massive stash of pirated books.

Anthropic had downloaded and stored millions of illicit files, many of which were never even used in training. The court said this wasn't fair use, wasn't transformative, and “was irredeemably infringing.”

A jury trial in December 2025 will determine how much Anthropic will pay. If the infringement is found to have been willful, it could trigger statutory damages of up to $150,000 per work.

Those penalties would be a nice payday for the three authors, and a payout Anthropic can likely afford. But if the court also certifies a class action for all authors with copyright-protected work in the Anthropic data trove, the financial risk could explode into the billions.
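To see how fast those numbers compound, here is a minimal back-of-envelope sketch. The per-work dollar amounts are the statutory damages ranges under U.S. copyright law (17 U.S.C. § 504(c)); the class sizes are hypothetical illustrations, not figures from the case.

```python
# Back-of-envelope statutory damages math (17 U.S.C. § 504(c)).
# Per-work amounts are the statutory ranges; the class sizes below
# are hypothetical illustrations, not numbers from the ruling.

PER_WORK = {
    "statutory minimum": 750,
    "ordinary maximum": 30_000,
    "willful maximum": 150_000,
}

for works in (10_000, 100_000, 1_000_000):  # hypothetical class sizes
    for label, dollars in PER_WORK.items():
        print(f"{works:>9,} works at ${dollars:>7,} ({label}): "
              f"${works * dollars:,}")
```

Even at the $750 statutory minimum, a hypothetical class of a million registered works would approach a billion dollars in exposure; at the willful maximum, the same class tops $150 billion. "Billions" is, if anything, conservative.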

A Legal Lifeline for the AI Industry

The part of the ruling that most AI companies are celebrating is the recognition that training models on lawfully obtained content can be fair use. That’s a huge deal.

This means developers don’t need to get individual licenses for every book, article, or website they use—as long as:
- The content was obtained legally,
- The model doesn't reproduce protected expression, and
- The use is transformative (i.e., it’s about learning, not copying).

For AI to evolve, this kind of legal breathing room is essential. Imagine if every student had to get permission to read a book before learning from it. That’s not how human education works, and the court ruled that it’s not how machine learning has to work either.

Fair Use ≠ Free-for-All

Anthropic’s big mistake was failing to respect that boundary. Instead of sourcing training data responsibly, it downloaded pirated files by the millions, kept them in a “central library,” and only began cleaning up after litigation began.

That kind of behavior makes it harder for other companies to argue in good faith for AI-friendly policies. It also fuels distrust among authors, artists, and publishers, many of whom already feel steamrolled by tech.

The lesson is painfully simple: Fair use protects innovation, not exploitation. And stealing content doesn’t become legal just because your end product is cutting-edge.

Entitlement, Then and Now

If this all feels familiar, that’s because we’ve seen it before.

In the early 2000s, Napster burst onto the scene and upended the music industry. Built by an 18-year-old college student named Shawn Fanning, Napster made it possible for millions to share and download MP3s without paying a cent. It was revolutionary and illegal.

While Judge Alsup rejected the analogy to the Napster case, the attitude at the root of Anthropic’s behavior is similar, and it’s an attitude that’s all too common in the AI space.

Fanning wasn’t just coding. He was also making a cultural argument: “If it’s on the internet, I’m entitled to use it.” It’s the same logic behind much of today’s AI data scraping: If I can access it, I can use it. If it’s transformative, it’s fair. If it’s innovative, the rules don’t apply.

That mindset, rooted in engineering bravado more than legal reasoning, has fueled everything from early torrent networks to modern machine learning pipelines built on scraped, copyrighted content.

But as the courts ruled then, and are ruling now, freely available doesn’t mean available for free. Just because something can be downloaded, copied, or parsed by code doesn’t mean it’s legally or ethically yours to use.

After the courts shut down Napster, the music industry pivoted to streaming. The turnaround happened when law, licensing, and technology realigned. Spotify, iTunes, and their many industry peers won by respecting intellectual property and building a business around it.

Napster was built by a teenager. Anthropic is a multibillion-dollar company that behaved like a teenager. The stakes are higher and the excuses weaker. Innovation doesn’t create entitlement.
