If you’re working in publishing in 2023 as a writer, agent, or journalist, there’s no doubt you’re thinking about AI and how ChatGPT is changing the industry. We at Authory.com certainly are. Authory is a self-updating platform that automatically aggregates all the content you publish—articles, videos, podcasts, social media posts, and more—into a complete, searchable online portfolio. It’s a great resource for writers of all kinds, from book authors to bloggers and journalists.
For us, the conversation keeps circling back to the role of humanness—particularly, how we can know whether a human wrote something and not a machine. That question has stumped even the creators of ChatGPT: OpenAI quietly shut down its AI classifier because of “low rates of accuracy.”
To train a large language model on how language is used, you feed it enormous amounts of text. OpenAI shares that ChatGPT was trained on publicly accessible information from the internet, information licensed from third parties, and information provided by human trainers. All of this training data is content that has already been created.
Here at Authory, we’re in a different situation. We don’t need to scour the internet for vast stores of information. Authory generates portfolios and automatically archives content for individual writers when they share their bylines or publications. As a byproduct, we have access to thousands of pieces of writing that, because they’ve been added by their authors, are effectively verified as human-created.
As such, we’ve rethought the classifier problem that stumped OpenAI by reframing it slightly. Instead of trying to distinguish human writing from AI output in general, we compare a specific human’s writing to what they’ve created recently. We base “humanness” on how closely a piece matches that author’s previous styles. We call someone’s style a “fingerprint”; each author can have multiple signature styles, just as their hands have multiple fingerprints. We trained a large language model to generate style embeddings, based on the styles gleaned from over a million articles. Each embedding is a vector, and vectors representing similar writing styles sit close together. Content is ignored; it’s how you say something that matters.
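To make the idea concrete, here is a minimal sketch of the vector-comparison step. Authory’s actual embedding model is proprietary, so the vectors below are random stand-ins; the `style_fingerprint`, `matches_author`, and threshold names are all hypothetical illustrations, not Authory’s API. The point is only the geometry: average an author’s past style vectors into a “fingerprint,” then check whether a new piece’s vector lands close to it.

```python
import numpy as np

def cosine_similarity(a, b):
    # Standard cosine similarity: 1.0 means identical direction.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def style_fingerprint(embeddings):
    # Collapse an author's past style embeddings into one centroid vector.
    return np.mean(embeddings, axis=0)

def matches_author(new_embedding, fingerprint, threshold=0.8):
    # A new piece "sounds like" the author if its style vector sits
    # near the fingerprint; the threshold here is purely illustrative.
    return cosine_similarity(new_embedding, fingerprint) >= threshold

# Stand-in data: in a real system these vectors would come from a model
# trained to embed *style* rather than content.
rng = np.random.default_rng(0)
author_past = rng.normal(size=(10, 64)) + 5.0   # tight cluster of style vectors
fingerprint = style_fingerprint(author_past)

same_style = rng.normal(size=64) + 5.0          # drawn from the same cluster
off_style = rng.normal(size=64) - 5.0           # a very different "voice"

print(matches_author(same_style, fingerprint))  # True
print(matches_author(off_style, fingerprint))   # False
```

Averaging into a single centroid is the simplest choice; since an author can have several fingerprints, a real system might instead keep one centroid per signature style and match against the nearest.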
In essence, we’re not comparing ChatGPT’s output to the information that trained it. We’re checking whether a piece of writing matches the style of a specific human with a distinctive voice, and then looking at how that differs from what a machine produces. The TL;DR: the machine read different books than you, and its essays sound different from yours.
Now, while this technology can classify which writing is AI-generated and which was created by a human, it doesn’t necessarily answer the larger question: what differentiates human-generated text from machine-generated text? Today, we’d wager that most people can tell human writing apart from ChatGPT output. ChatGPT can be great at summarizing content, giving instructions, and curating what’s out there, but not so great at telling a story with a style, a voice, that feels uniquely human (which is why Authory is able to detect the difference).
To this end, we’ve created a “Human Writer” Certificate to provide credible evidence that your articles were, with a high degree of certainty, written by a human, not by artificial intelligence.
And this matters. For me, and for many people, humanness is why we read or listen. We’re not just looking for summaries of information; we’re looking for stories. We like our pundits and our influencers. And we like specific authors for their style. We know Jane Austen, James Joyce, Clarice Lispector, and J.K. Rowling, whose narrative styles we’ve come to love like a friend’s mannerisms.
But of course, it’s not just how you say it; it’s also what’s said. And where would we be without their books, which let us peek into lives most of us could never know?
And that’s one way in which generative AI may always fall short of humans. Large language models work by digesting information and summarizing it; they become generative AI when they can predict what should be said next. But information that’s new to the world? Well, that needs to be seen and experienced by someone before it can ever be shared digitally. In essence, ChatGPT can’t write an article or story about something that isn’t yet publicly known. Humans are discoverers of what hasn’t yet been touched, shared, or made sense of in the natural world. It is through life that we bring new stories into the world.
Eric Hauch is the CEO of Authory.com.