Recently, experts working on ORCID and ISNI, two standard identifiers for authors and other contributors (ISNI is an ISO standard, and ORCID is designed to be interoperable with it), spoke with one of the founders of schema.org to facilitate their use as embedded persona references, particularly through extensions such as BibExtend. That probably sounds like gobbledygook to you. You may even think it irrelevant to your day job, but you would be quite wrong. What that string of acronyms represents is an ongoing effort to incorporate author identifiers into metadata about books and other publications so they can be found more easily through search engine queries at Google, Bing, etc. Does it sound more important now?
Indeed, there is a flurry of work underway in the Internet standards community that involves publishing. And yet, publishers are largely absent from most of these conversations. If you look at the membership of the most important Web standards organization, the World Wide Web Consortium (W3C), there are only two large publishers listed: Hachette and Pearson. None of the other Big Five (Random House, Macmillan, HarperCollins) is currently a member. And yet today, publishers commonly complain that users can’t find their books for sale when they search the Web.
If you perform a Google search for The Catcher in the Rye by J.D. Salinger, for example, you’ll get a listing at Amazon and a panel with metadata about the title, including direct links to Goodreads and Barnes & Noble. What enables that display is a set of metadata standards like schema.org. If HarperCollins wants an authoritative link to one of its published titles to appear alongside these entries, it would transmit the proper schema.org metadata as part of a manifest it provides to search engines.
And that’s not at all hard. Schema.org is relatively simple, and a heck of a lot less confusing than ONIX. It is a parsimonious standard focused on linking together disparate tags and identifiers so they can be presented in a manner that helps users (and other software) make associations between relevant data, like reviews and pricing. ONIX is a metadata behemoth that requires arcane content management databases and massaging. Few people outside publishing have even heard of it.
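To give a sense of how little is involved, here is a minimal sketch in Python of a schema.org description for a single title, with the author tied to ISNI and ORCID identifiers through the `sameAs` property. The title, ISBN, URLs, and identifier values are all made-up placeholders, and the helper names `book_jsonld` and `as_script_tag` are hypothetical. It uses JSON-LD embedded in a web page, which is one common way of publishing schema.org data, not necessarily the delivery route any particular search engine or publisher uses.

```python
import json


def book_jsonld(title, isbn, author_name, author_ids, publisher, url):
    """Build a schema.org Book description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": title,
        "isbn": isbn,
        "url": url,  # the publisher's own authoritative page for the title
        "author": {
            "@type": "Person",
            "name": author_name,
            # sameAs points at identifier records that describe the same person
            "sameAs": list(author_ids),
        },
        "publisher": {"@type": "Organization", "name": publisher},
    }


def as_script_tag(data):
    """Wrap JSON-LD in the script element used to embed it in an HTML page."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# All values below are placeholders, not real records.
example = book_jsonld(
    title="An Example Novel",
    isbn="9780000000000",
    author_name="A. N. Author",
    author_ids=[
        "https://isni.org/isni/0000000000000000",
        "https://orcid.org/0000-0000-0000-0000",
    ],
    publisher="Example House",
    url="https://publisher.example/books/an-example-novel",
)

print(as_script_tag(example))
```

The whole record is a handful of key-value pairs, and the `sameAs` links are what let a search engine connect the author named on this page with the same author named elsewhere.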
As the pace of change in Internet technologies has quickened, and our ability to work with networked data grows more sophisticated, publishers have continued to place overwhelming reliance on traditional “house” standards organizations like BISG and AAP that have to cope with the needs of existing workflows and supply chains. These organizations do tremendous work. Unfortunately, they are not a big part of the conversation around the technical standards that make information easily locatable on the Web, or media capable of fluid interaction within the browser. In contrast, organizations focused on publishing technologies, like IDPF and Readium, provide a very useful bridge to larger standards organizations like the W3C. The IDPF has worked to make the larger Web standards community aware of the needs of the publishing industry, and helped catalyze vital new W3C communities, like the Digital Publishing Interest Group.
Publishers need to continue to support these organizations, but they shouldn’t assume that the full 360-degree scope of their needs will necessarily be addressed. The focused efforts behind EPUB 3 and online reading environments will be built atop the broader standards of the Internet and Web ecosystem, rather than existing as separate silos. It’s likely that we’ll soon see PDF as the last standalone digital document format; from now on, document standards will be open to the network.
What does that mean for publishers? For one, publishers need to staff standards organizations with senior, technically informed individuals who can authoritatively represent their interests. As Bill McCoy, the executive director of IDPF, notes: “By all means, please participate in IDPF and help shape the next generation of portable documents based on Web standards, but that’s not enough to ensure that publisher requirements will get addressed across the wide spectrum of standards that collectively comprise the Open Web platform. The W3C’s Digital Publishing Activity will not achieve critical mass if only two publishers end up being W3C members.”
As you browse the technology at this year’s London Book Fair, consider being more active in broader open-source and open-standards efforts, including but not limited to joining and participating in the W3C. These organizations can often be expensive to join, and their processes sometimes time-consuming, but they are building the paper, pen, and ink of the next generation of storytelling. And it is critical for you as publishers to articulate your vision for how we will build the literature of the future in a networked age.