Wikipedia:Wikipedia Signpost/2014-01-15/Op-ed
Licensed for reuse? Citing open-access sources in Wikipedia articles
- The views expressed in this op-ed are those of the author only; responses and critical commentary are invited in the comments section. The Signpost welcomes proposals for op-eds at our opinion desk.
It is heavily ironic that two decades after the World Wide Web was started—largely to make it easier to share scholarly research—most of our past and present research publications are still hidden behind paywalls for private profit. The bitter twist is that the vast majority of this research is publicly funded, to the tune of hundreds of billions of dollars worldwide each year.
This has placed Wikipedia in an awkward position with respect to its verifiability policy: "all material in Wikipedia mainspace, including everything in articles, lists and captions, must be verifiable [so that] people reading and editing the encyclopedia can check that the information comes from a reliable source." Combined with the policy on identifying reliable sources, the paywall dilemma faced by editors and readers becomes clearer: "many Wikipedia articles rely on scholarly material. When available, academic and peer-reviewed publications, scholarly monographs, and textbooks are usually the most reliable sources." Not only this, none of the academic journals most cited on the English Wikipedia are open access (PLOS ONE breaks the drought at No. 22 on that list).
WP:PAYWALL advises: "Do not reject sources just because they are hard or costly to access". Yet, commenting on a draft proposal that Wikipedia articles should preferentially cite open-access literature, one editor wrote that "verifiability isn't an option if people are expected to pay in excess of $20 to view a single article ... over closed- or toll-access resources of equivalent scholarly quality". That draft proposal, started in 2007 when the English Wikipedia was half its current age, died quietly, like so many others.
But what if we could just mark references as being open, rather than preferentially citing them over closed ones? WikiProject Open Access is currently exploring the options, and the Workgroup on Open Access Metadata and Indicators (OAMI) at the National Information Standards Organization has been working on a set of recommendations for how to provide information about the use and re-use rights of scholarly articles. A draft version was released last week, and public comments are invited until 4 February.
These recommendations boil down to two metadata tags:
- <free_to_read>, which signals whether and when a publication is available publicly without a requirement for payment or registration; and
- <license_ref>, which points to a stable place on the web containing the licensing terms applicable to that publication.
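To make the two tags more concrete, here is a minimal Python sketch of how a consumer might read them. It assumes a hypothetical XML serialisation: the wrapper element <article_meta>, the "yes"/"no" text values and the function name read_access_tags are illustrative assumptions, not taken from the NISO draft.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata record. The wrapper element and the "yes"/"no"
# value format are assumptions made for illustration only.
SAMPLE = """
<article_meta>
  <free_to_read>yes</free_to_read>
  <license_ref>http://creativecommons.org/licenses/by/4.0/</license_ref>
</article_meta>
"""


def read_access_tags(xml_text):
    """Return (is_free_to_read, license_uri) from a metadata fragment.

    Missing tags yield False and None respectively.
    """
    root = ET.fromstring(xml_text.strip())
    free = root.findtext("free_to_read")
    license_ref = root.findtext("license_ref")
    is_free = free is not None and free.strip().lower() == "yes"
    license_uri = license_ref.strip() if license_ref else None
    return is_free, license_uri


if __name__ == "__main__":
    print(read_access_tags(SAMPLE))
```

The sketch keeps the two signals separate on purpose: a publication can be free to read without carrying a licence that permits re-use, and the licence URI alone does not say whether the text is currently accessible.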
The recommendations don't include:
- a definition of the term open access;
- specifications as to which licensing terms would be acceptable, or whether and how they should be version-controlled; and
- suggestions for icons that may be suitable for signalling the content of the proposed tags.
Similar recommendations have been put forward in a more broadly scoped draft report from Jisc, the UK body that supports senior-high-school and higher education. The draft was released for public comment in September, and its final version is still being worked on. A related report from the Confederation of Open Access Repositories looked at components of license clauses in use by scholarly publishers.
One of the organisations involved in the NISO Workgroup is CrossRef, which is working on including the proposed tags into their metadata and making that information available through their API, in collaboration with the Directory of Open Access Journals. The Open Article Gauge, developed by Cottage Labs with support from the Public Library of Science (PLOS), already provides article-level information about licensing terms for a subset of the scholarly literature; PLOS has signalled an interest in implementing a system that would provide licensing information for references cited in articles published in its journals, which are among the most well-known open-access journals.
The NISO document contains a scenario quite similar to searching for illustrations for use in Wikipedia articles:
“A user wishes to use visual images from an article, either in a single case or in some automated re-use pipeline. Acting in good faith, the user seeks licensing information, e.g., at PubMed Central or a similar source, to ascertain his/her rights. However, in some cases the article licensing metadata is contradictory or incorrect. For example, an article might be properly licensed under CC BY, but the publisher (or whoever is adding metadata) is making conflicting licensing statements or identifies other restrictions not provided for in the license.[1]”
The reference 1 (broken in the NISO document) refers to the November 2012 open-access report (part of the Wikimedia GLAM newsletter), which lists examples of such conflicting licensing statements and served as the basis for a more detailed analysis published and presented last October.
It is the potential for these kinds of incongruities that motivated the NISO group to opt for signalling only the stable home (the URI) of the licensing terms and not individual use and re-use rights. Many publishers use licensing terms incompatible with Creative Commons licenses, and to understand their implications, Wikipedia users might need legal assistance; this makes it difficult to see how signalling those terms (other than perhaps by way of {{closed access}} or {{subscription required}}) would confer any benefit on those users.
The case is different for Creative Commons licenses: their URI (e.g. http://creativecommons.org/licenses/by/4.0/) already signals re-use rights, making it easy to implement the <license_ref> tag, while their corresponding <free_to_read> tag can always be set to "yes", and compatibility with the NISO recommendations would be ensured.
On Wikimedia sites, a number of external link icons are already in use that act on certain elements of a URI—for example, a lock icon for HTTPS, as in https://www.eff.org/copyrightweek (which is this week, a period of action around copyright, organised by the Electronic Frontier Foundation). So having the CC BY icon displayed right next to external links that contain the string "http://creativecommons.org/licenses/by/" would be straightforward. Once the licensing information is available via the CrossRef API, a link to the appropriate CC URI could be added automatically to template-based references (e.g. by way of Citation bot, which was migrated to Wikimedia Labs last weekend).
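The string matching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name cc_license_from_uri and the regular expression are hypothetical, and the pattern covers only URIs of the form creativecommons.org/licenses/<code>/<version>/ (so CC0, which lives under a different path, is deliberately not matched).

```python
import re

# Matches Creative Commons license URIs such as
# http://creativecommons.org/licenses/by/4.0/
# Group 1 is the license code (by, by-sa, ...), group 2 the version.
CC_LICENSE_RE = re.compile(
    r"^https?://creativecommons\.org/licenses/([a-z-]+)/(\d+\.\d+)/?"
)


def cc_license_from_uri(uri):
    """Return a label like 'CC BY 4.0' for a CC license URI, else None."""
    match = CC_LICENSE_RE.match(uri)
    if match is None:
        return None
    code, version = match.groups()
    return "CC " + code.upper() + " " + version


if __name__ == "__main__":
    for link in (
        "http://creativecommons.org/licenses/by/4.0/",
        "https://www.eff.org/copyrightweek",
    ):
        print(link, "->", cc_license_from_uri(link))
```

A link renderer could use the returned label to pick an icon, falling back to the default external-link icon when the function returns None.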
Since Wikidata enabled phase I support for Wikisource on Tuesday, it would even be possible to link to the full text available from Wikisource (see also the Wikisource vision) and to the corresponding Wikidata entry, as demonstrated in the reference. Of course, there is room to economise on space, such as by linking the icons directly rather than adjacent text bits, and if the article is covered on other Wikimedia platforms (e.g. Wikiquote, Wikinews, Wikispecies), the corresponding links could be included as well.
Currently, Wikidata items can be created for sources supporting statements on Wikidata, but the details of whether and how other sources (e.g. those supporting statements in a Wikipedia or Wikibooks page) are to be handled, or whether Citation bot should be ported to Wikidata, remain to be worked out. Two taskforces have been created to work on this: one for books and one for periodicals.
Irrespective of the details, I think that if Wikipedia articles were to signal the openness of scholarly references they cite, this would go a long way towards raising awareness of open licensing among users of Wikimedia content, amplifying similar efforts by open-access publishers and even Google, whose image search by re-use rights (available since 2009) was simplified this week.
References
- Williams, J. T.; Carpenter, K. E.; Van Tassell, J. L.; Hoetjes, P.; Toller, W.; Etnoyer, P.; Smith, M. (2010). Gratwicke, Brian (ed.). "Biodiversity Assessment of the Fishes of Saba Bank Atoll, Netherlands Antilles". PLOS ONE. 5 (5): e10676. Bibcode:2010PLoSO...510676W. doi:10.1371/journal.pone.0010676. PMC 2873961. PMID 20505760. CC0 full text media metadata
Discuss this story
- On that I disagree. Kinda. I think. It depends on what you mean by policy. Open-access sources should be encouraged for reasons of verifiability--which is a reason "other than the quality and reliability". Sources that one "can't instantly access" negatively affects verifiability, and as such, negatively affects the quality of Wikipedia. Otherwise, I think we're going to have to come to some common understanding on how many thousands of dollars readers/editors should have to spend, and/or how many thousands of miles they should have to travel, to give effect to WP:VERIFY. Keeping in mind readers/editors could be anywhere on Earth--and beyond. Int21h (talk) 03:32, 22 January 2014 (UTC)
- On sources that are completely equal wrt reliability and bias then freely-accessible is a bonus point that should be encouraged. But typically they are not. Encouraging freely accessible sources can actually introduce bias -- for example, when some newspapers are freely accessible but others aren't. Many of our high-quality articles on "serious" topics are sourced to books, and I wouldn't want people complaining they fail WP:V because they should use BBC Online instead. I should note that some readers have better libraries than others (my local library is useless) and not everyone is a student or academic with access to university libraries. So sourcing to pay-for media is a significant barrier for many editors. -- Colin°Talk 09:42, 22 January 2014 (UTC)
- Verification should never require travel or the expenditure of significant sums of money. In the vast majority of cases, all that is required is getting a library card, or filling out an interlibrary loan request, or making a post on Wikipedia:WikiProject Resource Exchange/Resource Request. We should encourage editors to take those steps instead of worrying about hypotheticals. Any verification that requires significant travel or spending is likely a matter for professional scientists and historians and not amateur encyclopedia authors. Gamaliel (talk) 18:48, 22 January 2014 (UTC)
- Let me guess... you're American. In the UK, I can pay anywhere from £4.50 to £15 to borrow a book through interlibrary loan and wait 6-8 weeks for it to arrive and then be asked to return it soon after. And if the item isn't available (perhaps a reference work not for loan) then I can still be charged for the search. Nobody is going to go through such a process unless they are serious about editing the article, not just verifying one fact. I'm afraid your views on verification not requiring significant travel, time or expenditure are not held by any policy and neither should they. -- Colin°Talk 20:21, 22 January 2014 (UTC)
- Okay, I deserved that. I really should know better, being a librarian and having recently read a novel in which the main character is a frequent user of interlibrary loan in the UK. But there are other avenues to pursue for verification. And not every editor is going to be able to verify every citation, and there's nothing wrong with that. The alternative is much more unpleasant, that, as you said, we substitute BBC Online for books as sources. No one would take Wikipedia seriously at that point. Gamaliel (talk) 22:19, 22 January 2014 (UTC)
I'm a bit late to the party, but the claim "Not only this, none of the academic journals most cited on the English Wikipedia are open access (PLOS ONE breaks the drought at No. 22 on that list)." is misleading at best. Going back to the 15 January 2014 version of the compilation [1], we see that the Journal of Biological Chemistry is at the top of the list, and is a delayed open access journal (12 months embargo). Likewise for #3 PNAS (6 months embargo), #4 Genome Research (6 months embargo), #6 Cell (12 months embargo). Headbomb {talk / contribs / physics / books} 19:09, 11 February 2016 (UTC)