Can Data Help Shoot Your Book to the Top?

Melissa Gouty
May 10, 2020

Popular books, big data, and the New York Times Bestseller List

Popularity equates to the New York Times’ Best Seller List

I recently got a response from a reader who was disappointed that “The Rest of the Story: Delia Owens and Where the Crawdads Sing” didn’t discuss what made the book so wildly popular. That comment made me wonder: What factors make a book skyrocket to success? What is it about certain books that makes them popular?

Is there research that might help writers understand the essential traits of a popular book? Can data help writers get to the elusive and much-desired bestseller list?

The term “bestseller” originated with a guy named Charles Reade, an English writer who was extremely popular and prolific in the mid-1800s, but who we’ve never even heard of today. John Sutherland, a literary critic, suggested that we don’t know about Charles Reade now because he’s the perfect example of the idea that a “Bestseller” is temporary. Bestsellers are simply “a snapshot of an age.” Here today and gone tomorrow.

First, you might want to know: what, exactly, is the definition of a “bestseller”?

A bestseller is defined as a book that sells 5,000 or more copies in a week through diverse sales. The term “diverse” means that the sales have to come from multiple sources. In other words, you can’t sell five thousand books to your loyal fans on your private website. The sales have to come from a number of vendors.

Lots of Bestseller Lists exist, but the most widely recognized (and the one most desired by authors) is the New York Times Bestseller List. This list is compiled from sales figures submitted by vendors throughout the United States. Data comes in from independent bookstores, national chains, newsstands, universities, gift stores and supermarkets, as well as online book retailers.

The New York Times Best Seller List has its controversies, but that’s a story for a different day. It is still considered to be the most powerful list in the publishing industry.

What are the odds?

One data analyst, Albert-Laszlo Barabasi, has evaluated thousands of fiction and non-fiction titles that hit the NYT Best Seller List, and his findings shed some light on general patterns.

100,000 titles are published traditionally each year. Over the past ten years, only 800 fiction titles made it to the bestseller list. A tiny percentage of the books released made it to the Bestseller list.
Of those 800 fiction titles, 67% are plot-driven books in the categories of romance, mystery, or thrillers.
The bestseller list is dominated by only a few authors. 85% of best-selling novelists had multiple books listed. In ten years, James Patterson, writer of thrillers, had 51 books listed.
For nonfiction releases, biographies and memoirs have the best chance of making it to the bestseller list.
It’s a lot harder for a nonfiction writer to have multiple titles on the bestseller list. Only 14% do.
The best chance of making the bestseller list is to do it immediately after your book is released. While there are some exceptions to this finding — most notably The Help and Where the Crawdads Sing — most book sales peak immediately after release and decrease over time.
Only 8% of the books on the NYT list make it to the number one spot.
25% of books that hit the list fall off after one week.
37% spend more than 4 weeks on the list.
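
To put the first two findings in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above, and it rests on one loud assumption: the 100,000 titles a year cover fiction and nonfiction together, while the 800 bestsellers are fiction only, so the result is a ballpark and not a precise probability. The names in the code are my own, purely for illustration.

```python
# Figures quoted in the findings above. ASSUMPTION: the yearly total mixes
# fiction and nonfiction, so this is only a rough estimate of the odds.
TITLES_PER_YEAR = 100_000
YEARS = 10
FICTION_BESTSELLERS = 800
PLOT_DRIVEN_SHARE = 0.67  # romance, mystery, and thrillers


def share_percent(part: float, whole: float) -> float:
    """Return part / whole expressed as a percentage."""
    return 100 * part / whole


total_titles = TITLES_PER_YEAR * YEARS
odds = share_percent(FICTION_BESTSELLERS, total_titles)
plot_driven = round(FICTION_BESTSELLERS * PLOT_DRIVEN_SHARE)

print(f"{odds:.2f}% of published titles became fiction bestsellers")
print(f"About {plot_driven} of those bestsellers were plot-driven")
```

Under those assumptions, the 800 fiction bestsellers work out to roughly 0.08% of a million published titles, and about 536 of them were plot-driven.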

Another team of researchers programmed computers to see if they could predict bestsellers. Jodie Archer and Matthew L. Jockers wrote a book titled The Bestseller Code, based on the data they accumulated after taking 20,000 books and sorting them into 500 categories. They created unique algorithms that tracked emotional language and word patterns. Their data suggested that:

Most successful novelists concentrate on one or two topics in their books while less successful writers might include 6 topics.
Some topics within the books may discourage sales: smoking, deep grief, and political revolutions to name a few.
Books set in the jungle, the desert, or the ocean don’t sell as well as books set in the home.
Dog storylines outweigh cat storylines.

Should data influence what writers write?

Does knowing these facts alter what you want to write about or how you want to structure your next book? Do you suddenly start putting a dog into every book, even though you don’t own one yourself? Do you change the setting of your novel from the jungle to the park because you think it might sell better?

Nope. Writing is still an occupation not mandated by computer-driven criteria. Whatever you write, it has to be authentic. Writing to some assigned formula takes the joy out of the writer and passion out of the book.

Can data predict bestsellers based on the intangible emotional appeal a book might have for a reader? Can algorithms understand beautiful language? Do scientists understand the impact of crafting a universal experience in words? Can computers detect perfect pacing or the charisma of a character?

Nope. The quality of writing may be subjective. Editors, agents, and critics never agree on those elements, so how could programmers agree on creating an algorithm that could detect that?

I appreciate these extrapolations. They may even subtly influence my subconscious. But I’m a writer who creates from the wellspring of her heart and data doesn’t drive me. Landing a book on the bestseller list would fulfill a fantasy, but the hope of that happening is NOT why I write.

Writing with the sole purpose of landing on the bestseller list is a fool’s errand. I would wager that most writers don’t even think about that when they’re pounding out their manuscript page by painful page.

We write because we “feel” a book — not because it’s prescribed in some formula somewhere.

Back to the question that spurred my interest in this topic…

What made “Where the Crawdads Sing” skyrocket to the bestseller list and stay there?

“The Long Tail of Where the Crawdads Sing” by Alexandra Alter explains it best:

The book was not clearly defined in a genre, a quality that made publishers hesitant to take the book on. But the fact that the novel didn’t fit into a prescribed “type” of book turned out to be an asset. Where the Crawdads Sing had multiple ways for a reader to relate to it. It included a mystery, a murder, a fight-for-survival, a coming-of-age story, and lush lyrical nature writing with accurate botanical sketches.
The story appealed to a wide demographic of readers from all political affiliations.
The novel was heavily promoted by independent bookstores.
Reese Witherspoon chose it as her Book Club selection.
Word-of-mouth hype increased sales.
Sales increased visibility.

Alter suggests that “Crawdads” succeeded in part because it spoke to the quiet and isolation that we often crave in our modern world: It had

“…an oddly unifying effect in a time of rapid technological advances and constant social media connectivity.”

The unexpected success of Where the Crawdads Sing proves that data, predictions, and prescribed expectations of publishers can’t account for the response of the human heart.

Data and research might help identify patterns of bestsellers, but the “unexpected” can still happen

While computers can analyze thousands of books and millions of words and compile findings, they can’t account for the unexpected appeal of a unique story or a “different” kind of book that touches a reader in surprising ways.

The computers would never have predicted the success of Where the Crawdads Sing. They wouldn’t have guessed that the novel would defy the typical pattern of sales peaking immediately after release and then faltering. Instead, it built up sales, hundreds of thousands of them, as people talked about it and bookstores promoted it.

A book’s emotional and psychological impact on each individual can’t be put into a statistical analysis. We laugh, cry, get angry, or feel pain with a book because it touches us on a soul-level, something we can’t measure or even anticipate.

As Frances Wilson so eloquently states in her essay, “When Harry Met Fifty Shades: What Makes a Book Popular”:

The point about bestsellers, from Grisham to Gone Girl, is that they echo stories we already know and which have sunk, like shipwrecks, deep into our psyche. The books that sell by the shedload are those that return us to the narrative shapes of our childhood, to the plots that shaped us.

So you want to write a bestseller?

Go for it. I wish you luck.

Knowing the patterns of the bestseller list might help in your quest to write a popular book. Understanding the data may give you an inside scoop on what to write and how to write it.

I’m fascinated by the ability of data scientists to create algorithms intended to predict popular success. I respect the intellect required to develop technology and methodology that might help me advance in my chosen profession of writing.

And yet…

I want to ignore the data and just write my story. I want it to rush out of the river of my being, roaring and wild, uncalculated and free. I want to caress it, shape it, nurture it, honor it, and put it out into the world, hoping that — despite the odds — the unexpected will happen and it will shoot miraculously to the top.

No data needed.