AI Comes For Audiobooks


By Tanya Eby

THERE CAN ONLY BE ONE

A couple of years ago, Publishers Weekly published an article touting the shiny potential of AI-generated audiobooks. This article sent shock WAVs (wink wink) through the audiobook industry. Narrators were so ‘shook’ that they quickly responded on social media IN ALL CAPS. This is equivalent to yelling, and audiobook narrators, as a general rule, don’t yell (our voice is our instrument and our livelihood), so this was a big deal.

  The article stated: “Wouldn’t it be great if publishers could get rid of ‘the talent’ and reduce the long production cycles, push a button, and, presto, an instant audiobook, ready for sale?” Followed by: “The value proposition for automated audiobook creation becomes a combination of cost and convenience, where convenience is a combination of both simplifying and vastly accelerating the production process.”

  Ah, ease of production! Lowering cost! Get rid of those pesky actors, directors, and engineers! 

  What the article didn’t acknowledge was the profound difference between AI text-to-speech and human narrators. AI creates a product. Narrators (and the production team) create an experience.

  So, narrators have a right to yell. The article was biased, written by someone with a vested interest, and possible financial gain, in AI technology.

Now, Apple has unleashed the AI Narrator through their Apple Books platform, making it even easier and quicker and cheaper (!) for writers to turn their books into listenable experiences. (If you can ignore the herky-jerky quality of a computer reading to you.)

The truth I’m starting to face: AI is coming for audiobooks, and AI is ruthless. It feels like a Highlander situation where thousands of narrators using their well-researched Scottish accents are fighting against a behemoth of technology, and eventually, there will only be one. 

  Of course, I have a vested interest too in writing this article. I’m fighting for my job, my life, my kids’ security. I’m not alone in this. And this fight against AI isn’t limited to audiobooks. AI’s slick tendrils are reaching out everywhere: to bank tellers, truck drivers, servers, hotel staff, artists and more, as automation threatens to take over. Automation has always been a threat. Way back in the 1700s, on the banks of some English village, weavers were replaced by machines in order to make more material more efficiently. I’m not saying automation is inherently evil, but it is bad for workers. As technology adds, it also subtracts. The truth of it is AI is coming not just for audiobooks…AI is coming for you.

  Ahem. 

  Perhaps it is clear that I am sometimes dramatic and possibly have read, and enjoyed, many sci-fi novels. Sci-fi, though, can become history if we’re not careful.

 

LET’S NOT FILTER THIS

 

Audiobooks are booming and there’s a lot of money to be had in the industry. AI developers want you to believe that computer-generated audiobooks will be cheaper, easier to produce, and faster. Maybe that’s even true, but, of course, it isn’t the whole truth. It’s the shiny spin. Audiobooks might be cheaper to produce, eventually, by cutting out all the human work that goes into creating an experience, but those savings won’t be passed on to the consumer. The cost for audiobooks will remain the same. There are already audiobooks out there ‘read’ by computers, and there’s no cost difference. Audiobooks will be cheaper to produce and sold at the same rate, and hence the profit will be larger for the rights holder.

So this really isn’t about the consumer. It’s about profit.

Can we maybe, for once, not gloss over things? Can we not Instagram reality with filters and fancy lighting and just say what this is about? It’s about money. More money for the rights holder. Not less cost for the consumer. And while there may be, in the near future, more audiobooks produced at less cost to the rights holder, that doesn’t necessarily mean greater accessibility or better experiences for listeners. It just means more.

An argument investors frequently make for producing audiobooks more cheaply with AI technology is accessibility. Audiobooks are expensive, and there are many people who read audiobooks not just for pleasure but because auditory reading helps. (Yes. Listening to an audiobook is reading. The brain engages in story the same way when hearing the words as it does when seeing the words.)

Elizabeth is a fan of audiobooks and she says, “My son has dyslexia. Audiobooks were essential in helping him learn to read. Text-to-speech just doesn't work when learning reading fluency and comprehension.” 

There’s something about listening to a human voice. Text-to-speech or AI can come close to mimicking the voice, but it can’t mimic the soul of what happens when one human talks to another. Books are more than just a collection of words. These words are connected to each other, over and over, to build an entire world created by an amazing author, and to be interpreted by the performer. In this collection of words and sentences and chapters, there is meaning, intention, desires, emotions, history, connection, and on and on. 

An audiobook narrator will also change their voice for different characters by using accents, altering the pitch or texture of their voice to denote age and personality differences, etc. And, as Lark points out, there is fluency and comprehension that happens through spoken language. With AI, the words are all there and spoken in the proper order, but something’s just a little off. What’s off is that there’s no soul in the words, because AI doesn’t have breath to breathe into the piece.

Laura Martin enjoys audiobooks too and says, “They were very helpful in making it through the isolation of recent times. This probably speaks more to my pathetic social life, but listening to the narrators provides a semblance of human connection.”

We are so isolated for many reasons: politics, the pandemic, modern living. This loss of connection is showing. Sometimes I listen to audiobooks to be transported to another world but also because there’s just something primally comforting about being read to. Additionally, many of the narrators are my friends, so I get to have them in my kitchen with me while I cook without worrying about masks and infection and all those scary things. Come on over, friends. While you battle orcs, I’ll battle making sure this chickpea curry has enough flavor.

WHAT ABOUT ACCESSIBILITY? 

Accessibility is important. We need audiobooks. For so many reasons. But to say that accessibility is only possible with the use of AI is also a bit of a spin. There are things companies can do right now to make audiobooks more accessible, but are they doing them?

Charity Schaffer is a small business owner and read 100 audiobooks last year. She says: “The only audiobook advertising I see is for Audible but there are many other valuable sources, often free. Why is this? It’s a gap that could bridge so many worlds.”

Right now, companies could allow greater access to audiobooks through libraries. This already exists, but there could be more. There could be programs developed where people with learning struggles could access audiobooks for free. And there could be a push in educating the community about how to access these resources. Maybe some of these things already exist, but they’re hard to find. Why? Because, again, we’re talking about money. 

I’m not idealistic enough to believe that all books should be free. Audiobooks are part of the publishing umbrella and are goods that can be sold, and should be. There are real jobs and livelihoods connected to the production of audiobooks. Authors benefit from royalties, which help support them in their lives and encourage them to keep writing.

To argue that AI is a better choice for audiobooks simply because of accessibility isn’t the full story. Even though a computer can translate written text to spoken word, is it actually more accessible? Is it better? Is good enough, enough?

   A computer can read words to you but it can’t help you synthesize the material. It can’t break down a complicated line of text and pull out the meaning, add pauses, slow it down or emphasize important information. It can’t whisper a line and have you feel the intensity of the silence. It can’t put the slightest spin on a sentence so that you understand the line is said with contempt instead of praise.

Author Evelyn Jeannie Hall says: “For me, listening to audiobooks is a unique auditory experience. It’s performance art. I love reading as well, but I often remember books I’ve listened to—the intonations, the emotions it evoked—more than the written word. Only audiobooks make me gasp in response and I love that.”

  Have you ever been to an MFA reading or open mic night? Think of the poet who reads and is lost in the beauty of their language. It sure sounds pretty but what does all that stuff about starlight and heaving bodies mean? Then think of the poet who reads and focuses on the meaning of the poem. The beauty of the language shines through, but at the end of the poem, you feel something. The hairs on your arms rise. Your throat constricts. You get tingly. Something. Because the meaning, along with the musicality, sings to you. Computers don’t sing. Computers recite.

AI AUDIOBOOKS ARE NOT AUDIOBOOKS

 

Let’s call it what it is. AI audiobooks are NOT audiobooks. They’re text-to-speech programs. An audiobook is performed by an actor (or many actors) who interprets the words of an author to create an experience that is layered with meaning and nuance. So if AI is the narrator of a book, sell it as a text-to-speech book. If the book is performed by a human, then call it an audiobook. This way, consumers know what they’re getting. 

There is room in the industry for advancing technology and supporting the beauty and importance of human artistic expression. Think back to sitting in kindergarten, cross-legged, listening with wide eyes while a volunteer parent read to the class. Think of attending a performance of Shakespeare where the actors interpreted the language in a way where you suddenly realized that Shakespeare is funny and tragic and sometimes downright sexy. 

There are many types of listeners, with many types of reasons for loving audiobooks. Let’s give them a wide range of voices to choose from that fit the diverse stories we are lucky enough to experience. AI shouldn’t replace narrators, emerging as the victor for all eternity. In this fight about AI and narrators, narrators–and the teams of humans that create an audiobook experience–want to be valued. We want to keep working. We want to be heard. 




TANYA EBY has had many roles in audiobooks: narrator, producer, publisher, and director. She recorded her 1,000th title in the spring of 2022. She is also a blogger, novelist, and writer of tiny love poems. Find her on IG, Twitter, and her website: tanyaeby.com
