
Wikipedia:Wikipedia Signpost/2013-06-19/News and notes

News and notes

Swedish Wikipedia's millionth article leads to protests; WMF elections—where are all the voters?

Swedish Wikipedia reaches one million articles with a bot

With Erysichton elaborata, the Swedish Wikipedia crossed the one-million-article Rubicon this week, following closely on the heels of the Spanish Wikipedia last month. While this is a mostly symbolic achievement, serving as a convenient benchmark with which to gain publicity and attention in an increasingly statistical world, the particular method by which the Swedish site has passed the mark has garnered significant attention and controversy.

The Swedish Wikipedia, alongside the Dutch and some much smaller Wikipedias, is one of the few to allow bots (semi-automated or automated programs) to mass-create articles. This method has allowed it to leap from about 968,000 articles in May to about 1,044,000 now, about 454,000 of them bot-created. That makes it the fifth-largest Wikipedia, up from ninth just one month ago, and the same method has pushed the Dutch past the Germans, who had long held the title of second-largest Wikipedia. By comparison, the Polish Wikipedia, which had a similar total to the Swedish in May, is now at 973,000 articles.

The Dutch and Swedish totals come despite their far smaller userbases—for example, the Germans have an active userbase that is five times the size of the Dutch and eight times the size of the Swedish. By the same metric, the Polish are twice the size of the Swedish.

The bot-created articles themselves are basic enough: they are about four sentences long, with an infobox and sources from a common database. Each article is tagged with {{Robotskapad}}, a template that notes its origins. The version of Erysichton elaborata from before it received attention for the achievement it represents provides an excellent example.

The Signpost contacted the bot operator, Lsj, for his thoughts. He told us that the idea for bot-created articles came from the Dutch Wikipedia and an idea mentioned on the Swedish equivalent of the Village Pump in early 2012. While a "handful" of editors were "adamantly opposed", the great majority were in favor. Several smaller trials were conducted before the large-scale project that led to the millionth article, including on birds and sponges.

He told us that bot-created articles can offer significant benefits to Wikimedia communities: "human minds should not be wasted on mind-numbing tasks that a machine can do equally well. Let the machines do the grunt work, and let humans do what requires real intelligence." Bots are also better and far faster at repetitive tasks than humans, who can inadvertently introduce errors. Any bot errors, which in an ironic twist typically stem from human mistakes, can usually be fixed by a second bot run, similar to what Lsjbot will be doing to add images to the biological articles it has created.

The very concept of bot-created articles, though, has garnered significant opposition in the Wikimedia community as a whole, particularly from German Wikipedians. The prominent editor Achim Raschka authored a piece in the German-language news outlet Kurier. He lamented the Swedish Wikipedia's "bitter" milestone, which puts a spotlight on an article that has little more than "their existence and taxonomic pigeonholing" and omits key information like where the species lives or what it does. Raschka told the Signpost that these stub articles impart little useful information to readers—he asks, "who could be helped with [these] fragment[s] of data?" He also pointed at an entry Denis Diderot wrote for the Encyclopédie, titled "Aguaxima":


... the bot is always right, uses a neutral language, forms complete sentences, provides verifiable facts and makes no trouble, unlike us human authors. It knows ... correct formatting, rarely [vandalizes], addresses no other authors offensively, sought no barrier tests, never complains and is easily turned off without resistance. There are no bots with gender bias and of course no problems with the author leaving the site. If in any topic people are missing, there is no problem, as the programming of a few new bots by specially trained bots, perhaps with steward rights, proceeds rapidly. They are absolutely reliable even with a vote. ... We simply need to take note: Bots are better Wikipedians, our days are gone. We have only consumption, sex and drugs. But this does not have to be bad, right?

Schlesinger, "Die Zukunft heißt Botpedia," 16 June 2013.

A separate Kurier article by Schlesinger, which hyperbolically compared the bot-created articles to the famous novel Brave New World and claimed that bots can and will replace human editors, is a non sequitur. While bots can create article shells and—as can be seen on the Swedish Wikipedia—even short stubs, they can never be programmed to mass-create detailed articles capable of becoming featured or even good articles.

There was also extensive discussion on the Wikimedia-l mailing list and a Wikipedia blog post.

Lsj was unaware of the wider German-language attacks on bot-created articles, but after examining them, found that they were principally based in deeply held principles, which makes it difficult or impossible to provide an effective counter-argument.

In reply to Hubertl's sarcastic mailing list post, Lsj commented that the statistics, including view counts, editor numbers, and participation, contradict Hubertl's argument.

Still, a major problem could come from human error. Lsj acknowledges that errors in the source materials could creep into articles, but counters that a second bot run would fix the problem. The obvious reply is simple: what if an error crops up only every so often and is not fixable by bots? What if these errors are not caught until a significant number of articles have been created? A small base of active users may not be able to deal with the required cleanup.

Despite the risks, carefully planned bot-created articles could hold significant benefits for the Wikimedia movement. As Lsj told the Signpost:


While German-language Wikipedians lament the loss in quality in these programmatic articles, especially when compared to their stringent biology project guidelines, a short article may be better than none at all. This advantage is particularly apparent in smaller languages, whose Foundation projects have few editors and limited sources of information on the Internet, but far less so for wikis with larger userbases and article counts. It remains to be seen if more wikis will choose to bolster their content in this way.

This article was updated with comments from Achim Raschka.

Low voter numbers in WMF elections

Voter turnout by day, showing the onset and the effects of emailed reminder notifications halfway through the election period.

With little more than a day before voting closes for the WMF elections for three community seats on the ten-member Board of Trustees, fewer than 1700 Wikimedians out of a purported 90,000 active editors have turned out to vote—about one in every 50. This compares with a vote of almost 3500 in the last elections for these two-year seats, in June 2011.

Voter proportions by language
Arabic is spoken in 27 nation states by nearly half a billion speakers; but where are the voters?

The disappointing rate of participation comes despite a lengthy pre-election period and almost two weeks of voting, with banners on all WMF sites and reminder emails sent out. The graph shows the day-by-day vote until the time of publication. The typical spurt of interest followed by a rapid fall-off in numbers occurred twice: once at the opening of voting on 8 June, and once a week later on 15 June, corresponding to the distribution of email notifications.

Risker, a member of the volunteer election committee, commented: "It is lower than I would have expected ... It may be that the active community of 2013 is not as interested in the 'meta' aspects of the Wikimedia movement as in the past, as we have mostly followed the same processes as existed over the past several elections. Or it could be something entirely different. It's generally much harder to figure out why people don't do things than why they do them."

Of the 1659 votes cast at the time of writing, 592 (35.7%) are from English-language sites, 221 (13.3%) German, 157 (9.5%) Italian, 153 (9.2%) French, 82 (4.9%) Spanish, 55 (3.3%) Commons, 48 (2.9%) Polish, 41 (2.5%) Chinese, and 310 (18.7%) from all other languages.
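Readers who want to recheck these shares can do so with a few lines of Python. This is only a sketch: the counts are copied from the paragraph above, and the names `votes` and `vote_shares` are illustrative rather than part of any election tooling.

```python
# Vote counts by language or project, copied from the figures reported above.
votes = {
    "English": 592,
    "German": 221,
    "Italian": 157,
    "French": 153,
    "Spanish": 82,
    "Commons": 55,
    "Polish": 48,
    "Chinese": 41,
    "All other languages": 310,
}


def vote_shares(counts):
    """Return each group's share of the total vote, in percent, to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


shares = vote_shares(votes)

if __name__ == "__main__":
    print(f"Total votes: {sum(votes.values())}")
    for name, pct in shares.items():
        print(f"{name}: {votes[name]} ({pct}%)")
```

The listed counts add up to the 1659 votes reported, and the rounded shares match the percentages given in the text.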

Other languages on the radar are Japanese (27 voters) and Indonesian (12)—both welcome signs of the beginnings of a closer engagement with the worldwide movement—and Hebrew (10), Finnish (9), Danish (7), and Norwegian (7).

A notable disappointment is Hindi, with one voter out of some 200 million native speakers and a significant number of second-language speakers—the fourth-most-spoken language in the world—and an active and growing offline movement in the subcontinent.

Arabic, counting all dialects, has well over 400 million speakers, including 300 million native speakers, but managed to garner only four voters; this is despite a marked shift from the English and French Wikipedias to the Arabic Wikipedia in Arabic-speaking countries, and a successful start to a WMF education program in Egyptian universities.

Editors can vote until 23:59 UTC on Saturday 22 June through the SecurePoll interface. Instructions on voting and information about candidates are at Meta. The close of voting corresponds to Saturday afternoon to evening in the Americas, before sunrise on Sunday morning in the Subcontinent, and early to late Sunday morning in East Asia and Australia/New Zealand.

In brief

  • Wales portrait with an odd backstory: A portrait of Jimmy Wales that was painted with a person's penis was the subject of a Commons deletion discussion, alongside a video of how the image was created. The portrait was uploaded by and possibly requested by Russavia, who was unblocked just months ago via an arbitration appeal. (The Signpost has carefully worded this due to Russavia's persistent refusal to give a definitive positive or negative answer when asked in multiple locations if he inspired the image's creation.) The discussion is leaning towards keeping both. Russavia was indefinitely blocked from the English Wikipedia last week.
  • New hires: The Wikimedia Foundation has brought four temporary community liaisons on board: users Elitre, WhatamIdoing, The Interior, and Keegan. It has also hired a new director of analytics, Toby Negrin.
  • Privacy policy: The Foundation is asking for community input in formulating a new privacy policy on its projects. The move comes after the recent PRISM scandal in the United States, which drew a Foundation response.
  • Happy birthday: The Foundation is now ten years old.
  • South Africans want free access to Wikipedia: A Facebook campaign to allow free access on cellphones in South Africa so students can do their homework has inspired a WMF blog post. Related coverage is in this week's "In the media". The students state: