User:Maunus/PeerReviewReform

From Wikipedia, the free encyclopedia

This is a proposal for a reform of the way that the English Wikipedia uses peer review as a quality assessment tool. The proposal is meant to replace all existing peer review processes (DYK, GAN, Peer review, and FAC). The problems this proposal seeks to solve are:

  1. A backlog of reviews due to lack of reviewers. This is solved by creating a more streamlined review process, with a single review process instead of four different ones. This should consolidate the pool of potential reviewers and encourage and facilitate the participation of all reviewers at all levels of the review process. By making the final attainment of the benchmark dependent on a !voting process modeled on RfA and RfCs, the process becomes quicker, and it becomes easier to attract participants because !voters are not required to be intimately familiar with the article.
  2. A tendency for reviews to become stale and be archived without comments or improvements. This is solved because the review does not begin until a Review Team assumes joint responsibility for the article's improvement.
  3. A tendency for first-time article writers to have their FA nominations fail, or to have other kinds of unpleasant experiences related to being individually responsible for meeting criteria that others are enforcing. This is solved because a nominator is joined by additional reviewers who commit to helping the article over a benchmark, which makes the nomination and review process collaborative rather than solitary.
  4. A tendency for reviews to focus on formal requirements such as MOS rather than on the quality of content. This is solved by adding a more content-focused and less formality-focused review process in which editors jointly improve the article, and by adding a level of external expert review which is aimed specifically at catching content problems.
  5. A tendency for GA reviews to be very superficial and based on a single person running over the GA criteria as a checklist without offering help in improving the article further. This is solved by making the Review process about improving the article and not about crossing off points on a checklist, and by making the final decision a matter of a general !vote rather than a decision by a single reviewer.
  6. A risk of improper use of "Quid pro Quo" or clique formation which may compromise the integrity of the review process. This is solved by making the final decision subject to a general !vote. The positive powers of Quid pro Quo principles and clique formation are used to make the review stage more collaborative, encouraging mutual help between users in improving articles, while the final decision to promote an article becomes more objective.

Why we need a peer review reform

The rate at which Wikipedia generates high-quality content has stagnated. This is particularly visible in the GA and peer review processes, where nominations can sit in queues for months without reviews beginning, but also in the FA process, where nominations are increasingly archived for lack of attention.

The slowing pace of quality improvement and review is likely due to several factors:

  1. the number of active editors is decreasing across Wikipedia which means that the number of reviewers also declines.
  2. review processes have gradually raised their standards, which increases quality but makes it harder for new editors to clear the bar either as writers or as reviewers. According to a recent survey of FA nominations, it is extremely rare for articles nominated by first-time nominators to pass the review.
  3. the review process is sometimes experienced as off-putting by newcomers which makes them reluctant to participate and hampers recruitment and retention of writers and reviewers.

Furthermore the current review process has a number of flaws that have been noticed in the past. For example:

  1. the review process encourages editors to form cliques in which they mutually review and promote each other's articles.
  2. the review process encourages editors to "collect stars" by writing articles about topics that require relatively little effort because the topics are small and obscure or can be mass produced by formula.
  3. the review process discourages editors from writing articles about large or complex topics and about topics that are likely to be of wide interest.
  4. the review process encourages reviewers to be confrontational rather than collaborative with nominators because their primary task is seen to be to enforce a standard rather than to improve articles to meet the standard.
  5. the review process encourages GA reviewers to be superficial and use the criteria as a simple checklist without substantially engaging with the article or suggesting improvements.

Some statistics

This table describes the FA nominations and promotions from 2008-2018. We see that the number of nominations has dropped, but the rate of promotion has increased. Since 2011 we have promoted roughly 300-400 articles to FA status in most years (2016, with 227, is the exception), and nominations have fallen to around 450 per year. This may be interpreted to mean that at this point the FA process is carried by a smallish group of FA regulars, who know the process and know what an article needs to be like for it to pass.

Year        FACs closed   Promoted      Archived
2008        1328          719 (54%)     609 (46%)
2009        991           522 (53%)     469 (47%)
2010        925           513 (55%)     412 (45%)
2011        665           355 (53%)     310 (47%)
2012        636           375 (59%)     261 (41%)
2013        651           390 (60%)     261 (40%)
2014        505           322 (64%)     183 (36%)
2015        485           303 (62%)     182 (38%)
2016        365           227 (62%)     138 (38%)
2017        461           338 (73%)     123 (27%)
2018 (JF)   61            52 (85%)      9 (15%)
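
The percentages in the table can be reproduced from the raw counts. The following is a minimal Python sketch that uses three rows of the table as sample data; the function name `promotion_rate` is only illustrative.

```python
def promotion_rate(promoted: int, closed: int) -> int:
    """Return the share of closed FACs that were promoted, as a whole percent."""
    return round(100 * promoted / closed)


# (year, FACs closed, promoted), copied from the table above
rows = [(2008, 1328, 719), (2012, 636, 375), (2017, 461, 338)]

for year, closed, promoted in rows:
    rate = promotion_rate(promoted, closed)
    print(year, f"promoted {rate}%", f"archived {100 - rate}%")
```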

A single rubric-based review process

The quality of an article is assessed based on the following rubric, which assesses nine different parameters grouped into three categories (Content, Style, Additional material). Each parameter receives a value of 1, 2 or 3 - 3 being the highest quality and 1 the lowest acceptable quality. The level of quality corresponding to each value is described in detail below. The Rubric is used by the Review Team when deciding which Benchmark status the article should be nominated for in a !vote, and when forming a consensus about whether the article is ready for !voting.

The rubric is basically a checklist, but to award values higher than 1 it does require the reviewer to engage critically with the article and assess its merits. Most importantly, the rubric is used by a team of editors while they work towards improving the article for the final !vote, and as a guide to the !voters when deciding to !vote support or oppose. This means that we do not get situations, such as sometimes happen with GA reviews, in which a single reviewer runs over the article with the checklist and promotes it; rather, the rubric becomes an instrument for orienting improvements.

The rubric is supposed to be simple to use, so that basically any user can assess any article with it. Ideally the same rubric will be used by wikiprojects to monitor internal article quality, and wikiprojects will gradually change to use the BA, GA, EA assessments instead of the current impressionistic evaluations of articles as "stub", "start class", "C class", "B class", GA, and FA. Any article that has not been reviewed and promoted will be classified as UA - Unreviewed Article.

To be a Basic Article (eligible for DYK) the article must score at least 1 in all 9 parameters; to be a Good Article it must score at least 2 in all parameters; and to be an Excellent Article (or FA if we want to keep that terminology) it must score 3 in every parameter.
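
The threshold rule amounts to taking the lowest score across all parameters. The following is a minimal Python sketch of that rule, not part of the proposal itself. The parameter keys are shorthand for the names used in the rubric sections below, and treating a missing parameter or a score of 0 as "below the lowest acceptable quality" is an assumption made for illustration.

```python
# Shorthand keys for the parameters named in the rubric sections.
PARAMETERS = [
    "comprehensiveness", "referencing", "neutrality",  # Content
    "layout", "citations", "length",                   # Style
    "media", "hook",                                   # Additional material
]

# Lowest score across all parameters -> benchmark status.
STATUS_BY_MIN_SCORE = {1: "BA", 2: "GA", 3: "EA"}


def benchmark(scores: dict[str, int]) -> str:
    """Return the highest benchmark that every parameter qualifies for.

    Assumption: a parameter that is missing or scored 0 has not reached
    the lowest acceptable quality, so the article stays "UA".
    """
    lowest = min(scores.get(name, 0) for name in PARAMETERS)
    return STATUS_BY_MIN_SCORE.get(lowest, "UA")


scores = {name: 3 for name in PARAMETERS}
scores["referencing"] = 2
print(benchmark(scores))  # a single 2 caps the article at GA
```

Because the weakest parameter decides the outcome, an article cannot compensate for thin referencing with excellent layout.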

Content

Comprehensiveness:
  1. The Basic Article adequately defines the subject and accurately provides the most basic and significant facts about it. It does not necessarily go into details or provide complex or extended description of any aspect of the topic.
  2. The Good Article adequately defines the subject, and accurately describes all of the main aspects of the subject and provides the most significant details. It does not necessarily describe every aspect of the topic in depth.
  3. The Excellent Article accurately defines the subject, accurately describes every relevant aspect of the subject, and assigns the due weight to each aspect of the topic without focusing unduly on specific aspects or neglecting others. Compliance with comprehensiveness requirements for EA articles may require soliciting and receiving an External Peer Review (see below).
Referencing:
  1. The Basic Article complies with WP:V; it has inline citations to reliable sources for all information in the article. It includes no plagiarism or close paraphrasing.
  2. The Good Article has inline citations to reliable sources including those for direct quotations, statistics, published opinion, counter-intuitive or controversial statements that are challenged or likely to be challenged, and contentious material relating to living persons; it has at least one inline citation in each paragraph. Citations are preferably to high quality sources. It shows an adequate engagement with the scholarly literature on the topic.
  3. The Excellent Article includes inline citations to the best available sources and represents a thorough and representative survey of the relevant literature. Compliance with referencing requirements for EA articles may require soliciting and receiving an External Peer Review (see below).
Neutrality:
  1. The Basic Article complies with WP:NPOV; it has no obvious bias or tendentious representation. It is not currently tagged for POV concerns or other forms of bias, and there are no on-going discussions about neutrality concerns or bias.
  2. The Good Article comes across as objective and neutral, without obvious or implicit tendentious representation; the selection of sources and information is not tendentious or biased.
  3. The Excellent Article is fully objective and neutral, and faithfully represents the state of knowledge in the fields of knowledge to which it pertains; it describes all notable and significant viewpoints according to the principle of due weight. Compliance with neutrality requirements for EA articles may require soliciting and receiving an External Peer Review (see below).

Style

Layout:
  1. The Basic Article has an adequate definition sentence, and presents information in an accessible and organized manner.
  2. The Good Article is well-organized; it has an adequate lead and presents information in a structured manner with an adequate use of sections and subsections. It complies with the MOS layout requirements and with the requirements for lead sections, words to watch and embedded lists.
  3. The Excellent Article is well-organized and complies fully with the MOS, has a well-written lead that is neither too long nor too short and which serves as a full stand-alone summary of the article, and has a graphically appealing layout without disturbing elements or layout glitches that detract from its visual presentation.
Citations:
  1. The Basic Article includes sources for the information, and presents them in a way so that it is possible to identify the source of each piece of information.
  2. The Good Article includes inline citations for all information and at least one citation in each paragraph; it contains a list of references which is presented in accordance with the layout style guideline.
  3. The Excellent Article includes inline citations for all information, consistently formatted in accordance with the MOS using either footnotes (<ref>Smith 2007, p. 1.</ref>) or Harvard referencing (Smith 2007, p. 1); see citing sources for suggestions on formatting references. The use of citation templates is not required.
Length:
  1. The Basic Article is at least one paragraph long, and contains more than 1,500 characters of prose.
  2. The Good Article contains at least a Lead paragraph and a main section with any relevant subsections. It follows the Size Guideline.
  3. The Excellent Article stays focused on the main topic without going into unnecessary detail and uses summary style.

Additional material

Media/Illustrations:
  1. The Basic Article does not require media.
  2. The Good Article is illustrated, if possible, by images that are relevant to the topic, have suitable captions, and are tagged with their copyright status; valid fair use rationales are provided for non-free content.
  3. The Excellent Article has images and other media, where appropriate, with succinct captions, and acceptable copyright status. Images included follow the image use policy. Non-free images or media must satisfy the criteria for inclusion of non-free content and be labeled accordingly.
Hook:
  1. The Basic Article has a "hook", which is a one-sentence description of the article's content designed to advertise the article and make readers interested in it. The Hook must conform to the DYK Hook guidelines.
  2. The Good Article does not require a hook unless it is to be featured at DYK.
  3. The Excellent Article must have a one-paragraph summary that can appear on the front page when the article is featured there.

A possible Fourth Level of Quality

If the community thinks it is a good idea, a fourth level could be added to the rubric for articles that have been vetted by external experts through the external review process, and vouched for by the copyeditors guild. Such an article would ideally be of fully publishable academic quality. It would however still have to pass through the community !vote.

The Peer Review Process

The Proposal Stage

The proposal stage begins when an editor decides that they want to begin the process of improving an article with the objective of having it reviewed for a particular quality benchmark. They add the article title to a list of nominations, add their own username as the proposer, and provide a short note on the status they hope to achieve. Other editors can then sign up to participate in the review process.

The proposer can push the article to the review stage as soon as one additional participant signs up. The proposer may invite other editors to participate through talkpages and wikiprojects, or they may simply wait and see if someone signs up. When the desired number of participants has signed up for the review, the proposal becomes active and the participants convene at the review page to establish what is required to make the article meet the benchmark. They are now a review team.

However, if the proposal does not gather any participants after a predetermined period of time, the article is automatically removed from the list and the proposal has failed. The proposer is free to propose the article again after a short waiting period (perhaps one month).

The purpose of the proposal stage is to connect article writers with people who are more experienced in the quality parameters and who also share an interest in the specific article and are willing to collaborate on it. It is also supposed to separate out the articles that are not ready for review yet or which no other editor is likely to want to put the effort into reviewing. This will hopefully make for a more economical use of reviewer effort and decrease the likelihood of reviews going stale in mid-process.
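
The life cycle of a proposal can be sketched as a small state model. The Python sketch below is only an illustration under stated assumptions: the class and method names are hypothetical, and both time periods are assumed values, since the proposal leaves the sign-up period unspecified and only suggests "perhaps one month" for the waiting period.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Assumed values; the proposal does not fix either period.
SIGNUP_PERIOD = timedelta(days=30)
WAITING_PERIOD = timedelta(days=30)


@dataclass
class Proposal:
    article: str
    proposer: str
    target: str  # "BA", "GA" or "EA"
    opened: date
    participants: list[str] = field(default_factory=list)

    def sign_up(self, user: str) -> None:
        """Add a participant; the proposer does not count as an additional one."""
        if user != self.proposer and user not in self.participants:
            self.participants.append(user)

    def state(self, today: date) -> str:
        """One additional participant is enough to move on to the review stage."""
        if self.participants:
            return "ready for review"
        if today - self.opened > SIGNUP_PERIOD:
            return "failed"
        return "open"

    def may_repropose_on(self) -> date:
        """Earliest date for a new proposal after this one has failed."""
        return self.opened + SIGNUP_PERIOD + WAITING_PERIOD


p = Proposal("Example article", "Alice", "GA", opened=date(2018, 1, 1))
print(p.state(date(2018, 1, 10)))  # still waiting for a participant
p.sign_up("Bob")
print(p.state(date(2018, 1, 11)))  # a second editor has joined
```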

The Review Stage

When a proposal passes from the proposal stage to the review stage the Review Team convenes at the article's review page. They begin a discussion aimed at assessing the strengths and weaknesses of the article and to see what improvements are needed for the article to pass a given benchmark. All of the editors assume responsibility for specific roles in the review (e.g. research, citations, copyediting, etc.) and all members of the team participate in editing the article with the shared goal of reaching the benchmark. Collectively the review team is responsible for the improvement of the article, and it cannot pass to the voting stage before they are all in agreement that it meets the benchmark requirements for a given status.

Reviewers are encouraged to gather statements from a copyeditor/MOS specialist (through the copyeditors guild), a copyright/image specialist, and an external expert reviewer. These specialists may give a statement on the review page stating whether or not they support the current state of the article with regard to their area of expertise. The statements are not required but will be taken into account by the !voters.

Of course any editor is able to join the review at the review page and give their opinion, but unless they sign up as part of the review team, the decision to push to a vote can be made by the team without their support. !Voters of course will be able to take into account any critiques and statements of opposition by non-members of the review team.

When the review team agrees that the benchmark has been met, they push the article to the !voting stage, nominating it for a specific status. A team of two is large enough for pushing to voting on BA and GA status, but a team of three is required to push the article to voting for EA status. The voting nomination includes a short statement about the article's quality and why the Review Team feels it meets the benchmark.
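
The two conditions for pushing to a !vote (a large enough team, and unanimous agreement within it) can be written as a single check. This is a minimal Python sketch; the function name is hypothetical, and it assumes that the proposer counts as a member of the team.

```python
# Minimum Review Team size per target status, as stated above.
MIN_TEAM_SIZE = {"BA": 2, "GA": 2, "EA": 3}


def can_push_to_vote(target: str, agreed: dict[str, bool]) -> bool:
    """Check whether the team may nominate the article for `target`.

    `agreed` maps each team member (assumed to include the proposer) to
    whether they consider the benchmark met.
    """
    return len(agreed) >= MIN_TEAM_SIZE[target] and all(agreed.values())


team = {"Alice": True, "Bob": True}
print(can_push_to_vote("GA", team))  # two members suffice for GA
print(can_push_to_vote("EA", team))  # EA needs a third member
```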

The purpose of the review process is to improve the article as much as possible, and to do so collaboratively in order to increase motivation and make the process more enjoyable. All of the reviewers will be interested in improving the article so that the !vote is most likely to be successful, and as all reviewers are jointly responsible for nominating the article for the !vote, they will be interested in actively improving the article and in building consensus. By having a strong Review Team to vouch for an article's quality, and evidence of a rigorous review process on the Review page, the !vote will be more likely to be favorable. This motivates the Review Team to be maximally efficient at improving the article.

[Questions to be solved: What happens in case of conflict within the review team? (the review is abandoned and never pushed to a vote, stale reviews are automatically archived and a new proposal is necessary)]

The !Voting Stage

The Review Team prepares a brief nomination outlining the evidence of quality (e.g. external review statements, statements by the copyeditors guild, and comments by any non-team members received during the review stage).

A page of votes is maintained on which editors can vote Support or Oppose, giving their own views about whether the Review Team's assessment is correct. They may also add their support for a different status from the one that the article has been nominated for, for example writing "Support GA" and "Oppose EA" if they believe that the article meets the GA benchmark but not the EA benchmark. Oppose !voters are encouraged to supply a brief overview of which of the 9 parameters they think need work and why - but this is not a requirement - especially not if other !voters have already given reasons for opposing (e.g. an "Oppose per User:X" !vote as often happens at RfA).

After a given voting period a review supervisor gauges whether there is consensus to promote the article. The review supervisor has discretion to disregard ill-considered !votes in assessing the consensus. There is a minimum participation requirement, which can be adjusted by the review supervisors as the process is implemented. Apart from closing !votes, review supervisors have the discretion to push an article to the DYK queue or the FA queue, to send it back to the review stage (if the review team so desires), or to fail it completely so that a new proposal stage is required. Anyone who has participated in at least 3 successful EA nominations can be a review supervisor.
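
The mechanical parts of closing a !vote can be sketched in Python: counting !votes per status, leaving out those the supervisor disregards, and checking the participation minimum. This is an illustration only. All names are hypothetical, the value of `MIN_PARTICIPATION` is an assumption (the proposal leaves it adjustable), and the sketch deliberately stops short of deciding the outcome, because gauging consensus is left to the supervisor's judgment.

```python
from collections import Counter

MIN_PARTICIPATION = 5              # assumed value, adjustable by supervisors
SUPERVISOR_MIN_EA_NOMINATIONS = 3  # stated in the proposal


def may_supervise(successful_ea_nominations: int) -> bool:
    """Eligibility rule for review supervisors."""
    return successful_ea_nominations >= SUPERVISOR_MIN_EA_NOMINATIONS


def tally(votes, disregarded=frozenset()):
    """Count !votes per (stance, status), skipping disregarded users.

    `votes` is a list of (user, stance, status) tuples, for example
    ("UserX", "support", "GA").
    """
    return Counter(
        (stance, status)
        for user, stance, status in votes
        if user not in disregarded
    )


def participation_met(votes, disregarded=frozenset()) -> bool:
    """Check the minimum number of distinct participating !voters."""
    voters = {user for user, _, _ in votes if user not in disregarded}
    return len(voters) >= MIN_PARTICIPATION


votes = [
    ("A", "support", "GA"), ("A", "oppose", "EA"),
    ("B", "support", "EA"), ("C", "support", "EA"),
]
print(tally(votes))
print(participation_met(votes))
```

Keeping the counts per status lets the supervisor see, for instance, that an EA nomination lacks support while a GA outcome has it.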

The purpose of the !voting stage is to make sure that for each article there is broad community support for the view that it passes a given benchmark, and to make participating in the review process less labor-intensive and more inviting. The !voting is also aimed at avoiding "clique" formation in the decision process to keep it more objective (although "clique" formation during the review stage is actually encouraged, because cliques work well together). Generally Wikipedians like to !vote, and we often see good participation in !vote-based processes such as RfA, RfC and ANI threads; this is why it may be a good way to widen participation in the quality assessment process.

External Reviews

The Peer Review Reform proposed here includes the option of institutionalizing External Peer Review as part of the process. This can be done in two basic ways: either an external peer review can be a part of the requirements for Excellent Article in the rubric system as described above, or it can be part of the requirements for an additional fourth benchmark level. If it is integrated into the process without adding a fourth level, it can be either a requirement or an option that a Review Team can use during the review process. If optional, it is a tool the team can use to bolster their own confidence in their assessment of the article's quality, and it can form part of the nomination for the !vote, where it will presumably help convince !voters of the quality of the article. In the following it will be assumed that External Peer Review is simply integrated into the criteria for Excellent Article, either as a requirement or as an option.

Having reached the stage of quality that they consider appropriate, the Review Team decides to undertake an external peer review. They start by identifying one or more academics who have relevant expertise, and they then decide to approach one or all of them to request a peer review. They send an email to the expert in question describing the topic and the process and politely requesting a review of the article. The expert will be offered named recognition as a reviewer of the article, and potentially some other form of recognition such as being mentioned on a Wikimedia thank-you page that they can refer to in their CV. If the expert agrees to review the article, a deadline is set and a PDF version of the article is mailed to the expert along with the rubric requirements for EA status. The expert is then asked to assess the article in relation to the criteria and to write down the assessment on a sub-page of the review page (noting of course that the review commentary will be publicly visible and licensed under Wikipedia's open content license).

Having received the Peer Review the Review Team may choose to implement further changes before pushing the article to a vote, to solicit further external peer reviews, or to push it directly to a vote.

How to Implement the Reform

One challenge will be implementing the reform: since it will replace all existing review processes, it poses logistical and organizational challenges.

What do we do with existing articles and their statuses?

The easiest way to handle this would be to automatically convert all articles that have been reviewed in the following way:

FA articles will be automatically re-classified as EA
GA articles will be automatically re-classified as GA
Articles that have been approved by DYK or given B-class status by a wikiproject will be re-classified as BA
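
The conversion rules above can be expressed as a lookup table. This is a minimal Python sketch; the legacy status labels used as keys and the function name are hypothetical, and picking the highest new status when an article holds several legacy ones is an assumption. Articles without any listed status fall back to UA, in line with the definition of Unreviewed Article.

```python
# Legacy status -> new status, following the three rules above.
LEGACY_TO_NEW = {"FA": "EA", "GA": "GA", "DYK": "BA", "B-class": "BA"}

# New statuses ordered from lowest to highest.
RANK = ["UA", "BA", "GA", "EA"]


def convert_status(legacy_statuses: set[str]) -> str:
    """Return the highest new status that any of the legacy statuses maps to."""
    converted = [LEGACY_TO_NEW.get(status, "UA") for status in legacy_statuses]
    return max(converted, key=RANK.index, default="UA")


print(convert_status({"DYK", "GA"}))  # the GA outranks the DYK approval
print(convert_status({"C-class"}))    # not covered by the rules, so UA
```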

Any article proposed for review can have its status changed through a vote. An article that has been promoted to GA and which is subsequently proposed as an EA under the new system would have to be sufficiently improved that the Review Team considers the article to reach the EA standard (and therefore also the GA standard). If an article fails the !vote for EA status, the Review supervisor may decide that there is a consensus that the article meets the GA benchmark, or a consensus that it does not, and may choose to set its quality status to GA or BA depending on the consensus at the !vote.