Last week I had the pleasure of being invited by the Association of Learned and Professional Society Publishers to speak about the future of publishing and what role data might play. The panel I was on was made up of folks from Nature, Wiley-Blackwell (think Dummies books and CliffsNotes) and the Royal Society of Chemistry, and was chaired by Geoff Bilder of CrossRef.
Data plays an important role in the process of getting an article into a journal. It is the raw material from which researchers extract meaning and analyze their findings. But once the article has been written and the sources cited, that is normally the end of the road for the data so far as consumers of the information can see.
What normally happens then is that readers who want to explore the issue further embark on a laborious text-mining exercise. They pick the numbers out of the prose so they can put them back together and have a look for themselves.
There are many obvious problems with this, not the least of which is that, in an attempt to keep control of the data, the authors essentially lose track of who might be doing what with it anyway. The integrity of the data is compromised by inevitable human error in the extraction. And the sapling of exploration, innovation and derivative works is pruned before it has a chance to thrive.
While it will be a long time before both technology and attitudes change to the point that all raw data will be open and available, it is high time the derived data referenced in articles be made available to the people reading them.
There are around 20-25 thousand scholarly journals active today, and this number is growing at a rate of 3 to 4% annually. Global journal readership is in the 10-15 million range, and about 6 million of those readers are also researchers and potential authors.
The number of scholars is trending similarly upward, fueled in large part by China, India and other developing countries making massive investments in education, research and development.
In such an environment, how can we add to the experience for all key stakeholders: authors, readers, publishers?
Authors are motivated by a myriad of reasons. Top among them are recognition among peers and the need to publish (the old maxim "publish or perish" is as true today as it was 100 years ago). Including the data cited in their work will help people engage quickly with their research and enable authors to reach a wider audience, including people in other fields. Data becomes another work with ownership/stewardship for which authors and researchers would receive credit.
Readers of journals are motivated by the need to be kept informed. Clearly, trust in article findings is key. And if their interest is piqued by the findings, they want to explore their own hypotheses. What better way to achieve trust, enable exploration and garner interest and goodwill than to allow the reader to get in on the fun of analysis with real data?
Meanwhile, publishing companies are actively looking for new and innovative ways to engage both readers and authors, build brand loyalty and community, and generate income. Including data and interfaces by which to analyze it will open up many possible revenue opportunities while at the same time helping to build trust, further the open data agenda and, importantly, build a strong community of passionate users of their products.
New tools for distributing and sharing data will only make journals more accessible, which in turn will make knowledge more accessible while continuing to enrich the experience of authors, publishers and readers. We are happy that publishers are thinking about these kinds of problems and exploring solutions. We're here to help!