From its launch in 2001, Wikipedia—the Internet’s hugely popular, user-created free encyclopedia—was viewed with suspicion by reference publishers and librarians. It was unreliable and unsustainable, its critics argued—how can you trust a sprawling free resource that can be added to and edited by anyone?
Some 16 years later, with more than 45 million entries in 300 languages, as well as some 45 million images and 100,000 citizen editors (known as Wikipedians), this nonprofit experiment in harnessing the power of collective knowledge is well established. Wikipedia today is a top-five global website. And many publishers and librarians today are collaborating with Wikipedia, seeing it not as a threat, but as an ally that can drive users to their local libraries, and to expert resources.
One of the most important of those collaborations is the Wikipedia Library, a program that makes a digital library of high-quality resources freely available to Wikipedia editors.
The Wikipedia Library (TWL) is the vision of Jake Orlowitz, who runs the project for the Wikimedia Foundation. A longtime, dedicated Wikipedian (he has made more than 35,000 Wikipedia edits himself), Orlowitz says it all began with a realization he had back in 2010: Wikipedia editors really needed access to a good library.
“One day I made a whimsical, and fortuitous, call to HighBeam Research, because my two-week free trial had just ended, and I was looking to finish a biography on an influential but niche alternative medicine practitioner who had recently made the news,” Orlowitz told PW. “I asked, ‘Do you have any free accounts I could use to edit Wikipedia? And, maybe a few for some of my editor friends?’ Their response was immediate: ‘How about 1,000?’ That’s when the Wikipedia Library was born.”
Today, the Wikipedia Library provides Wikipedia editors access to more than 80,000 subscription journals and a growing collection of books from more than 70 high-quality publishers. It also facilitates working relationships with the library community, including work with the major library associations and related partners like the Internet Archive and OCLC.
For librarians, working with Wikipedia makes sense for many reasons, not the least of which is that libraries and Wikipedia share a core mission to serve the information needs of users. Publishers, for their part, benefit by getting their works in front of Wikipedia’s vast audience of about 500 million readers a month.
“Publishers want to be found, they want to be read, they want to be shared, and they want to be counted,” Orlowitz says. “Being cited on Wikipedia is a way to do that.”
PW recently caught up with Orlowitz to get his take on Wikipedia today, and the state of reference in today’s ever-shifting digital information landscape.
For starters, what’s in the works for the Wikipedia Library—any new initiatives forthcoming?
We have so much going on, but there are three key pieces. The first is improvements to the Wikipedia Library Card Platform. Some 70 publishers have donated access to their resources for Wikipedia editors, but that access so far has been managed through a clunky, inefficient individual signup method. The improved platform will be a single place for Wikipedia editors to apply for and receive direct access to TWL resources through a proxy.
We are also expanding our library networks and outreach through things like the #1lib1ref campaign and the newly formed Wikipedia Library User Group, an independent, volunteer-run Wikimedia affiliate. The #1lib1ref campaign asks librarians to “give Wikipedia the gift of a citation” by finding a source for those infamous “citation needed” tags that pop up on Wikipedia articles. And the user group will solidify and expand library-Wikipedia collaborations far beyond what my small team can do.
Third, we’re looking to provide a better experience for our users. In the long run, we want all of Wikipedia’s citations to be free to read for users. In the meantime, we’re working with partners like the Internet Archive to make sure more than a million URLs are properly archived and functioning; with OCLC to make it possible to cite books automatically, via an ISBN; and with OAdoi and OAbot to make free versions of paywalled sources cited on Wikipedia accessible and easy to find.
It has been interesting to watch librarians’ opinions evolve with Wikipedia. How would you say the library community views Wikipedia today?
When I founded the Wikipedia Library in 2011, we mostly had secret supporters—librarians who would whisper to us at ALA conferences that, despite what people said, they still used Wikipedia. But that stigma was always there—that it’s not reliable, don’t use it, period. Some saw us as an upheaval of authority, of good reference practices, of academic rigor.
But, at the same time, librarians were also some of the earliest adopters of Wikipedia, and many of our standout community members are librarians. Today, I’d say familiarity, acceptance, and pragmatism are helping to remove the stigma from Wikipedia. A lot of work has gone into changing the Wikipedia narrative from ‘you just can’t trust it’ to a more nuanced one: that Wikipedia actually serves to introduce users to broader information literacy skills. People can, and definitely should, use Wikipedia in lots of settings, even academic or clinical ones. But they should use it as a starting point.
In a forthcoming essay, you write that your goal is to position Wikipedia as the “virtual front page of every library in the world.” Can you explain that vision?
For years, Lorcan Dempsey at OCLC Research has been saying that “discovery happens elsewhere,” meaning that whatever your website, collection, institution, or resource, it is seldom a person’s first stop in their quest for information. When it comes to finding information, users almost never start where the information originates. The entry points almost always live somewhere else, whether that’s the most accessible place, or the place with the most efficient search, most ubiquitous presence, or the lowest cost.
Discovery happens on Wikipedia. A billion unique devices access Wikipedia every month. It attracts more than 6,000 page views every second. And Wikipedia results are often found on the first page, if not in the first three spots, of a Google search. So if Wikipedia is used by almost everybody, well, what would you call that? We’d call it the virtual front page of every library in the world. You have to be where your users are. And our hope is that readers who engage with Wikipedia will go on to explore the full-text resources cited there, whether in books, repositories, publisher websites, or, of course, in their public or university libraries.
One of the fascinating aspects of Wikipedia to me is how, early on, before social media, it foresaw the power of collective knowledge online. Can you talk about the power of crowdsourcing to fulfill our information needs, and the challenge of vetting the deluge of information online?
Having an inspiring mission and an open platform can ignite a deep desire in people to contribute, collaborate, and share. But what I can say about the magic of crowdsourcing is that you don’t just throw any problem or question at any crowd on any platform. It takes a consciously designed, addictively compelling experience. It takes a large, diverse, and inclusive community with healthy behavioral norms. And it takes countless hours of consideration and effort.
As for vetting, Wikipedia uses both algorithmic and human filtering—a virtual gauntlet through which information must pass. Yes, Wikipedia is the encyclopedia ‘anyone can edit.’ But those edits must pass through machine learning bots running on increasingly sophisticated neural networks looking for common vandalism patterns, through hundreds of language-matching RegEx filters catching bad words, through thousands of human “recent change” patrollers, and through tens of thousands of people’s personal article watch lists. Then they go through the eyes of readers, any of whom can change an error or add a missing piece. We congratulate people when they say, “I edited Wikipedia!” But the real marker is being able to say, “I made an edit to Wikipedia—and it stuck.”
Information abundance is a challenge of the digital age, but we also now live in an age of “fake news,” where authority is under attack. Even flat-earthers are resurgent in 2017. Has this complicated Wikipedia’s mission?
Facts have been under threat for as long as there has been publishing, but I agree that threat has accelerated over the past decade with the flourishing of social media. Indeed, as the world increasingly moves online, citizenship is increasingly digital citizenship. At Wikipedia, we have tried to communicate this through our #factsmatter messaging, because digital literacy (the ability to distinguish fact from rumor, virality from authority, and evidence from misinformation) is essential to the healthy functioning of all people and governments. Digital literacy is an inoculation against propaganda.
I don’t know how to stop Macedonian fake news farms from pushing out false stories on social media. But what I do know is that those items won’t last on Wikipedia. As a tribe, Wikipedians are very critical consumers of information and rigorous about sourcing. What matters on Wikipedia is not your opinion or even your credentials as an editor, but the sources that back up your claim. We don’t care how viral the source, or how it flew through the Twitterverse. We care only that it verifies the specific content of an article, and that the author or publisher has a reputation for fact-checking and accuracy.
In the pages of PW we’ve had some lively discussions in recent years about the future of reference publishing in the digital age. What’s your take on what publishers face today, and where things are heading?
I think publishing is in an exciting but tumultuous in-between phase, especially academic publishing. There’s been tremendous evolution and flux around everything from peer review, to article-level and alternative metrics, open access and business models, Creative Commons licensing, social media, you name it. And it’s not just a function of consumer demand and alternative technologies. There are also compelling moral and political arguments that publicly or philanthropically funded research and information, including life-saving discoveries and life-empowering research, shouldn’t be locked up and available only to those with massive subscription budgets.
Personally, I believe and passionately advocate that open access is the future we should head toward, whether that means a proliferation of free-to-access preprint databases such as arXiv, a global flip to “gold” open access journals, or some other, more fully realized version of open research. But I’m also pragmatic about all of this. I grew up in the era when Napster basically overthrew the entire distribution and consumption model of record companies. I went from buying albums on CD to buying songs on iTunes to carelessly downloading songs from random online links and then back to paying for monthly access to Pandora and Spotify. So, I work on the assumption that systematic change is hard, and that we should work collaboratively on positive solutions.
The discussion has also been lively regarding the future of library reference services. Any trends or insights you see among the libraries and librarians you engage with?
I won’t speak for the library community, but I’ll share a personal observation. I worked on the last two New Media Consortium Horizon Reports on Libraries, and I see a lot of excitement around two opposing forces: digital collections, and physical spaces. These are undergoing a dynamic evolution as people look to meet and make things in library buildings, but also want to search and consume entirely online, virtual objects. In both cases, users don’t want static, read-only information. They want to engage, interrogate, adapt, and create. What I would emphasize is that, no matter what you do to physical collections or the reference desk, the role of a well-trained librarian has never been more vital.
I’m curious to know your personal thoughts on the future of print resources in libraries. Anything you still prefer in print?
I think content is, in an economic sense, fungible—it matters more that you have it than how you prefer to get it, or where you use it. So I think libraries need to be platform-neutral and user-centric. If users want print, give them print. If they want to stream e-books, and never come into your newly renovated building, well, give them e-books.
Personally, I operate my life from an Android phone and a business-class Chromebook. But I do keep a few books around—mostly because I want to physically possess them. Open Access by Peter Suber and The Atlas of New Librarianship by David Lankes are always on my desk. I held onto some classic tomes after college, Locke and Foucault. I own a hardcover of Kurzweil’s sprawling, exceedingly optimistic The Singularity Is Near. I’ll read the New Yorker in print, though I subscribe online.
What I do prefer in print, though, and keep ready by my bed, is poetry—Ladinsky’s translation of the Sufi poet Hafiz is ecstatic. And Ursula K. Le Guin’s essays are lovely, and count as something other than prose.
Much has changed over Wikipedia’s first 16 years. Can you give us a sense of what’s coming next?
Well, we just spent over a year working on a movement-wide strategy consultation envisioning the next 15 years. It involved thousands of conversations with volunteers, experts, researchers, and more.
The broad strokes are: We will be developed in more languages. Our interface, which is simple and uncluttered now, will become even more digestible and user-friendly. Where our content is free to read, it will become easy to share. Where our community is open to all, it will become more inclusive, especially to those with fewer resources. Where our subjects are well covered and free-flowing now, our knowledge about our knowledge will become increasingly structured and machine-readable. Where our community already has thousands of bots working overtime to keep Wikipedia clean, we will increasingly use artificial intelligence to perform other tasks that used to consume large amounts of mental and emotional human labor.
The finer details will be officially published later this fall. But the entire process has been already transparently documented—on a wiki.