The advent of AI-enabled audiobook narration has been a hot topic of discussion in audiobook circles of late, and according to a number of audiobook narrators and other industry professionals, an October PW article raised the temperature of debate further. In that piece, consultant and PW columnist Thad McIlroy discussed the state of AI-enabled audiobook narration and its potential appeal to audiobook creators and consumers. He also profiled several players in this nascent nook of the still-booming digital audiobook segment. As the audiobook industry begins to more extensively weigh the pros and cons of another new technology on its doorstep, a growing number of players are joining the conversation.
“It’s practically all we talk about,” said audiobook narrator Hillary Huber. “I am a member of the SAG-AFTRA [Screen Actors Guild–American Federation of Television and Radio Artists] Audiobook Steering Committee, and I am a board member of our newly formed Professional Audiobook Narrators Association, or PANA, and believe me we are circling the wagons.” She noted that McIlroy’s piece created an uproar in narrator circles—“much of it fear driven rather than informed,” she admitted—and that it had “galvanized our community to get educated and get involved.”
Proponents of AI audiobook narration tout its much lower production costs (compared to a traditional recording of a human narrator) as a way to improve the profitability of audiobooks and to allow publishers to release more audiobooks that have limited audiences. But according to actor and narrator Emily Lawrence, cofounder of PANA and president of its board of directors, “It’s very easy to reduce this issue to dollars and cents, but it’s very complicated and nuanced.” If AI narration proliferates, “it’s not just narrators who will lose their jobs,” Lawrence said. “There’s an entire ecosystem of people who rely on audiobooks for their livelihood. People who direct audiobooks, people who edit audiobooks, people who check audiobook narration for word-for-word perfection against the manuscript.”
Lawrence believes there are many ethical issues surrounding AI technology. “For example,” she notes, “if I were to license my voice, and lose all control over how my voice is then used, my voice could potentially be used to voice content that I find morally repulsive.” She also points out that “as of now, a lot of AI licensing consists of non-union contracts,” and that narrators are vulnerable to entering agreements that exploit their voices and don’t offer fair compensation.
Similarly, in Huber’s view, the negatives of AI outweigh any positives. She places “loss of livelihoods, loss of integrity in storytelling, and loss of personal connection” high on her list of concerns. “The only pros I see are financial,” she said. “And it’s the other team that benefits, not the narrators nor the listeners. Do you really think [AI company] Speechki is going to pass their savings on to the listener? No. Listeners make choices about what to spend money on, and they have a right to demand clear labeling of robot voices, as do authors. And then there is the potential theft of our voices—our speech patterns, our acting choices—to create the AI. That’s a whole other can of worms.”
Publishers wade in
Some traditional audiobook publishers are eyeing new tech frontiers while seemingly treading lightly in terms of AI. “Our team continues to weigh the pros and cons and are very vocal about their thoughts and concerns on both sides,” said Anthony Goff, who until this month was senior v-p and publisher of Hachette Audio. “There are a lot of intricacies here that need to be considered.”
For Goff, “narrators are the heart and soul of our industry. You simply cannot have a beautifully crafted audiobook, a unique collaboration and work of art, without them. We believe that sensitive attention to casting the right voice actor is a crucial part of bringing our authors’ works to their fullest expression in audio.”
Echoing the position of other audio publishers, Goff sees a future in which human and computer-generated narration can happily coexist. “We do not anticipate AI narration replacing human readers,” he said, “but we do see opportunities to use it in specific ways—particularly as the quality of it has greatly improved.” He cited an example: “We have been using AI as an in-house tool for advance listening copies to get early ‘reads’ out to sales reps and even to some of their buyers.”
Pushkin Industries, the audio production company cofounded by Jacob Weisberg and author, narrator, and podcast host Malcolm Gladwell, has found AI helpful as a podcast tool. In a blog post on the website of Descript, a maker of audio editing software, producers of Gladwell’s Revisionist History podcast described employing Descript’s Overdub feature to create a cloned version of Gladwell’s voice for use in a “scratch mix,” or rough version of the narrative, to test ideas.
According to Nicole Morano, publicity director at Pushkin Industries, this kind of AI narration does not play a part in the company’s audiobook productions. “Pushkin is dedicated to producing high-quality, sound-rich audio, honoring the human voice in all of its nuance,” she said. “Our projects, both podcasts and audiobooks, rely heavily on archival audio—including music, scoring, radio clips, original recordings of speeches and interviews—for our audiences to appreciate the variety and power of audio. We are also interested in technology and are open to solutions that help us achieve our goal of producing audio in any format that challenges listeners, encourages their curiosity, and inspires joy in them.”
Amazon-owned audiobook behemoth Audible—which is both a publisher and a retailer—has long held an anti-synthetic-narration policy that is clearly stated in the requirements section of its ACX audiobook self-publishing platform. But in recent months Audible has been called out by narrators and others who have discovered and flagged several AI-narrated titles listed on its site. Audible took down the titles, but concern remains among narrators that other AI-narrated titles will slip through the cracks.
Expanding the market?
At Hachette, Goff noted that his team is looking at using AI for some titles that have never been produced in audio before—a move that would help ensure that “the largest possible number of Hachette’s titles are always accessible in audio format,” he said. “Interest in previously unrecorded content would help us make decisions about what would make sense to bring to market as fully produced audiobook editions moving forward, created by a professional narrator and our dedicated production staff.”
Goff’s experimentation lines up with the key point that those who champion AI narration raise: AI can provide publishers with a cost-effective way to produce more audiobooks to help meet burgeoning consumer demand. Industry statistics illustrate the gulf between the number of audiobooks that get to market and the number that could potentially be recorded. According to the most recent data from the Audio Publishers Association, more than 71,000 audiobooks were published in 2020. Though that number marks an industry high, it’s still only a fraction of the number of print books published in 2020.
Actor and audiobook narrator Steven Jay Cohen appreciates this mismatch argument, at least in theory. “Since most published content each year never makes it into audio, AI-narrated books could solve an accessibility issue on paper, but not in practice,” he said. “The reason that many of these titles are not made into audio is not the cost—it is that they were written in such a way as to make them unintelligible when read from beginning to end without the charts, graphs, etc. embedded within their pages. If all that was ever needed was a voice to read out loud, then narration would have died years ago.”
Cohen added, “Unless the voice talking to you understands the content that it is attempting to relay, the listener will have a harder time ‘owning’ the information shared. This would be evident to anyone who has studied learning modalities. An auditory learner is doing more than just assessing how human a voice sounds.”
Cohen also offered a look ahead at how AI might affect the industry. “Because AI will come in as a cheaper alternative to live narration, it will affect the market mostly from the bottom up,” he said. “I would expect the indie/royalty-share market to be affected first. And as the algorithms improve, they will slowly work their way up the chain to more traditional producers. The traditional publishers will likely test out the technology on their lowest-earning content before considering using it elsewhere.”
Despite AI narration’s potential to help grow the audiobook sector, its emergence is “creating an existential crisis for our narrator community,” Lawrence said. “It is not only threatening to take away our jobs and completely remove us from the equation but—and this is my main concern—it’s threatening the art that we love. And as a community, we fully believe that what we do is art. Whether I’m out of a job or not, I would be devastated that the art that I care so deeply about is so horribly compromised.”
Next steps
Certainly, industry debate over AI in audiobooks is picking up steam. The forthcoming virtual 2022 APA Conference offers an opportunity to hear from various players via a two-part panel session on March 2 at 1 p.m. ET. The first part offers an overview of how AI is being used in the audiobook field and commentary from three companies providing AI services and tools. During the second part, two audio publishers will share opposing views on AI: one will speak on the importance of humans in audiobook production, and the other will discuss how it uses AI in its publishing process.
“We’re always looking at all of the different things that are happening in the industry,” said APA executive director Michele Cobb. “Some of those are technology driven, some of those are human performance driven, and we’re trying to keep our members apprised of what’s happening. And because we do have such a large constituency of different types of members—publishers, narrators, producers, suppliers, retailers—everyone is involved in the conversation.”
Narrators are taking action by joining and raising their voices. According to Lawrence, narrators’ concerns about AI technology in the audiobook sphere were one of the drivers of her decision to cofound PANA last fall. The group grew to nearly 400 members in just three weeks after it began accepting them in October. “There was very much a sentiment among the narrator community that an organization that was run by and for narrators was going to be an important force for us in the industry on this issue as well as many others,” Lawrence said. “We’re a member-driven organization that is very focused on community activism and volunteer power.”
Huber is one of those member-narrators at the ready. “AI is coming,” she said. “We can’t stop it, but hopefully we can be proactive by creating protections and, more importantly, by raising awareness in general so that listeners and authors can make informed choices.”
Goff said publishers will continue to assess where AI may fit into the audiobook landscape. “The technology is good and getting better but can never fully replace the art of audiobook creation,” he noted. “With the amazing growth of audio over the past few years, it’s clear that consumers love narration by professionals, and we do not see AI replacing that—rather, we see it supplementing it.”