The explosion of generative artificial intelligence technologies, including such large language models as ChatGPT, caught many in the book business off guard when it began in earnest in late 2022. Once it became clear that those models had been trained on vast amounts of copyrighted material without permission or compensation, publishing found itself thrown without warning into the ring with Big Tech.
As the industry continues to grapple with the implications of AI, its leaders are advocating for a balanced approach to incorporating the tech in a way that protects copyright while still allowing for innovation. That showdown has now reached the presidential level: last week, industry organizations including the Association of American Publishers (AAP) and the Association of University Presses (AUP) delivered responses to the White House Office of Science and Technology Policy’s request for public comment regarding the development of the administration’s Artificial Intelligence Action Plan.
In its submission, the AAP emphasized the critical role of copyright protections in maintaining American leadership in AI markets. It noted that American publishers generate nearly $30 billion annually in the U.S. alone and are part of a broader coalition of industries built on intellectual property that add more than $2.09 trillion in annual value to gross domestic product. For its part, OpenAI and others lobbied the Trump administration to essentially deregulate the industry by eliminating all guardrails—including, potentially, any responsibility to respect copyright.
Meanwhile, publishing houses, industry trade groups, and authors have taken the issue of copyright violations to court, hoping that the rule of law will prevail. At present, there are more than three dozen copyright lawsuits pending. One such case, against OpenAI, was filed by the Authors Guild in 2023, less than a year after ChatGPT launched, and the guild has been supporting authors in a larger class action lawsuit against Meta for that company’s unlicensed use of what is reportedly more than 7.5 million books to train its LLM, Llama 3. Guild CEO Mary Rasenberger expressed confidence in the outcome of the ongoing legal battles.
“I think we’ll win, and the decisions will be so decisive that it’s going to send very strong messages that AI companies better go license,” Rasenberger said. She likened one standard defense employed by many tech companies accused of scraping copyrighted material without permission—that they originally intended to license content but couldn’t figure out who to talk to or how to do it—to a shoplifter “walking into a store and saying, ‘Well, I don’t know whose stuff this is, so I’m just gonna take it.’ ”
In February, the creative industries scored an early legal win when a federal judge in Delaware ruled in favor of Thomson Reuters, which sued Ross Intelligence for using Westlaw’s copyrighted material to train its AI legal research platform. The court’s flat rejection of Ross’s fair use defense, AAP president and CEO Maria Pallante told PW, set a helpful precedent on the question of harm to potential markets for copyrighted works.
Still, Pallante noted, “while favorable to publishers, the ruling doesn’t fully address generative AI issues, since it was more of a ‘direct copy’ case than one involving sophisticated AI training.” She added: “Some of the most important legal questions turn on infringement versus fair use. But there won’t be one decision that decides the future for all time—that’s not how courts work. Some decisions will be targeted.”
Copy, rights
When it comes to the current political landscape, Pallante expressed cautious optimism. During a panel on AI and copyright at the London Book Fair earlier this month, she noted that “the first Trump administration was generally pretty good on IP, and the U.S. Congress, which is predominantly Republican right now, has generally been pretty good on IP, historically.”
That said, the barons of Big Tech have all demonstrated a newfound allegiance to President Donald Trump, raising concerns in the industry that the administration might favor a deregulatory stance. But there are limits to executive power, Rasenberger said, even if Trump is intent on testing them.
“Copyright is in the Constitution—it’s a power given to Congress,” she said. “There’s a lot of case law that emphasizes that. If there’s a change in copyright law, it’s got to go through Congress.”
Industry leaders who spoke with PW about the topic this past month emphasized that they aren’t opposed to AI development. They simply want clarity, compensation, and control over what comes next.
“We believe that protecting intellectual property and nurturing technology are symbiotic, and that the U.S. government has an opportunity to model leadership by upholding long-standing copyright principles,” Pallante said. “We think that’s the most ethical and sustainable framework.”
Rasenberger agreed. “We’re not suing to get rid of AI. We’re suing for control and compensation,” she said. “They’re going to use books one way or another. Let’s get them licensed.”
The defensive crouch taken by many in the industry where AI is concerned is only natural: what comes next has the potential to impact the entire publishing ecosystem, especially where rights are concerned. “I believe that unless the rights have been particularly granted to an entity, then they belong to the author or the creator,” said Regina Brooks, president of the Association of American Literary Agents (AALA)—a belief Big Tech clearly does not share.
Brooks praised some publishers for their efforts to loop their authors into the decision-making, pointing to the partnership HarperCollins inked with Microsoft last year to train an AI model on “select nonfiction backlist titles” that requires authors to opt in. “The value of the content and the fact that anyone will be interested in licensing it comes from the creativity of the ideas and the work of the authors,” she said.
The AAP and aligned organizations are advocating that publishers update their copyright notices to specifically exclude AI model training by default. “Penguin Random House was ahead of the curve on this,” Pallante said. “They had a very strong notice on the physical and digital versions of their books.”
Peter Berkery, executive director of the AUP, noted that many of his organization’s 160 member presses worldwide see AI as an opportunity for efficiency, even as they remain cautious about copyright concerns. “We have to seek a balanced approach,” he said. Another concern Berkery cited was the penchant for inaccuracy shown by nearly all AI models to date, and he stressed that it was vital that any tool employed by university presses be able to “deliver accurate citations.”
The bookselling community also has a stake in what comes next. “We need to know what it is that we are selling, exactly,” said Allison Hill, CEO of the American Booksellers Association. “It’s inevitable that AI is going to be part of the publishers’ workflows, but we don’t want to be selling books created by AI. It’s not what our customers want from us. We need transparency. If there is AI-generated content in books, we need to know so we can decide how to handle it.”
Seeking solutions
To that end, several licensing platforms are now emerging that aim to facilitate the legal use of copyrighted materials for AI training and convey to readers when a book is written by humans and not AI-generated. The most prominent of these, Created by Humans, has already inked a partnership with the Authors Guild. “It was the first out there,” Rasenberger said, “and they were very willing to talk to us and hear our perspectives.”
Other potential solutions come from the Copyright Clearance Center (CCC), which has extended its business to AI licensing, and Calliope Networks, a new aggregator of content licensing to gen-AI companies for model training that initially focused on audiovisual and music rights before expanding into text. Roy Kaufman, managing director for business development and government relations at the CCC, offered publishers some simple, straightforward advice while speaking on a panel about AI licensing at the London Book Fair: “Do what you can to protect yourself and get paid.”
Larger publishers are increasingly making direct deals with tech companies in lieu of working through intermediaries. But they too, Pallante said, need to be careful not to license rights they don’t actually hold: “They can go through their catalogs and figure out what they can license and what they can’t, and then they decide how they’re going to split it with authors.”
The intersection of AI and copyright is a worldwide concern, and because rights to many properties now overlap across territories in the global economy, how other nations address the issue affects the U.S. When it comes to English-language rights, the U.K. government has proposed an “opt-out” system for AI training, which allows tech companies to presume permission to train their models on extant material until they are informed otherwise.
Creative industries in the U.K., led in particular by Dan Conway, CEO of the U.K. Publishers Association, are pushing back strongly against this proposal. And the global publishing business continues to coordinate internationally on the issue; the AAP, for instance, has filed comments on the U.K. negotiations, and is working with the International Publishers Association on establishing a framework that can be enforced across the globe.
“Opt-out is clearly a violation of international law—the treaties that we’re part of, and that most of the world are part of—the Berne Convention, the TRIPS Agreement, and others,” Pallante said. “They don’t allow formalities, and having somebody have to actively opt out of a use that somebody made without permission is clearly an obstacle to the exercise and enjoyment of your copyright.”
Despite legitimate concerns over AI-generated content flooding the market, industry leaders see human connection as publishing’s most enduring value. “Readers hate AI-generated anything. They get mad when they see an AI-generated cover,” Rasenberger said. “Human readers want that human connection to an author. It makes you see the world differently.”
Brooks observed that AI’s tendency to homogenize highlights the value of authentic voices. “It’s going to be much more difficult for AI to appropriate content from diverse authors,” she said, suggesting that such books “are less formulaic and have distinct points of view.”
The next two years will likely bring crucial court decisions that will shape how AI and publishing interact for decades to come—and determine whether tech companies will offer any remuneration for the IP they have already plundered. The stakes, industry leaders stressed, cannot be overstated: no company, Big Tech or otherwise, should get free access to the creative work that represents decades of investment in human creativity.
Or, as Rasenberger put it: “You shouldn’t have stolen our books. And now you’re going to have to pay.”