Wikipedia:Wikipedia Signpost/Single/2013-02-04
Examining the popularity of Wikipedia articles: catalysts, trends, and applications
On February 12, 2012, news of Whitney Houston's death brought 425 hits per second to her Wikipedia article, the highest peak traffic on any article since at least January 2010. It is broadly known that Wikipedia is the sixth most popular website on the Internet, and the English Wikipedia now has over 4 million articles and 29 million total pages. Much less attention, however, has been given to traffic patterns and trends in the content viewed. The Wikimedia Foundation makes aggregate raw article view data available for all of its projects.
This article attempts to convey some of the fascinating phenomena that underlie extremely popular articles, and perhaps more importantly to editors, discusses how this information can be used to improve the project moving forward. While some dismiss view spikes as the manifestation of shallow pop culture interests (e.g., Justin Bieber is the 6th most popular article over the past 3 years, see Tab. 2), these are valuable opportunities to study reader behavior and to shape the public perception of our projects.
Wikipedia's most popular articles
We have begun producing two weekly charts of the most popular articles on Wikipedia: the automated WP:5000 list and the moderated WP:5000/Top25Report.
WP:5000, an automated list of the 5,000 most popular pages on Wikipedia, is now being compiled weekly. It also identifies how many featured articles, good articles, and lists are included. For the current list covering January 27 to February 2, we find 239 featured articles and 468 good articles in the top 5000 pages. However, this report is based on raw data and includes non-article pages and popularly requested redlinks, like "Com/fluendo/plugin/KateDec.class" at No. 15 on the current list (a script used to stream media content; see Cortado (software)), as well as 18k Gold Watch at position 166, a recurring entry likely fueled by spambots. More information on how the WP:5000 results are computed is found below.
The WP:5000/Top25Report is a manually moderated weekly Top 25 list started in January 2013 of the most popular articles on English Wikipedia. Similar in format to best-selling book or music charts, it is a bit more user friendly in that it excludes non-article pages, likely DOS attack entries, and the Main page. It also tracks how long an article has remained in the Top 25. Throughout January 2013, certain American football-related pages have been popular (a yearly trend seen during the playoff season of that sport), as well as popular recently released movies such as Django Unchained and notable recent deaths such as Aaron Swartz.
The origins of heightened popularity
Articles which are "extremely popular" on Wikipedia fall into the category of either (1) occasional or isolated popularity, or (2) consistent popularity.
The prime sources of occasional or isolated popularity include:
Rank | Article | Date (UTC) | Views/hr | Views/sec | Notes |
---|---|---|---|---|---|
1 | Whitney Houston | 12 Feb 2012 | 1532302 | 425.6 | Death of subject |
2 | Amy Winehouse | 23 Jul 2011 | 1359091 | 377.5 | Death of subject |
3 | Steve Jobs | 6 Oct 2011 | 1063665 | 295.5 | Death of subject |
4 | Madonna (entertainer) | 6 Feb 2012 | 993062 | 275.9 | Super Bowl halftime |
5 | Osama bin Laden | 2 May 2011 | 862169 | 239.5 | Death of subject |
6 | The Who | 7 Feb 2010 | 567905 | 157.8 | Super Bowl halftime |
7 | Ryan Dunn | 20 Jun 2011 | 522301 | 145.1 | Death of subject |
8 | Jodie Foster | 14 Jan 2013 | 451270 | 125.4 | Golden Globes speech |
- Cultural events and deaths: The best way to reach the highest levels of Wikipedia popularity is to be a celebrity who (a) dies, or (b) plays the Super Bowl halftime show (see Tab. 1). This year's Super Bowl entertainment, Beyoncé Knowles, just missed the chart with 100–110 views/second. Generally, prominent deaths dominate the top-100 traffic events and beyond. However, less morbid events are occasionally on the same scale, such as Jodie Foster following her recent coming out at the 2013 Golden Globes, Bubba Watson upon winning the 2012 Masters Tournament, and Ice hockey at the 2010 Winter Olympics during the final match between the U.S. and Canada (all drew over 250,000 views in a single hour).
- Google Doodles: Google often replaces its logo to commemorate anniversaries and other events, and clicking on the logo will usually produce the search results for that topic. With Wikipedia appearing first for many search engine queries, this can be a tremendous source of traffic. When the 110th birthday of Dennis Gabor was celebrated in this fashion on June 5, 2010, his article peaked at over 55 views per second (this for an article that currently sees only about 140 views per day). There are many other examples, including Winsor McCay on October 15, 2012, Gideon Sundback on April 24, 2012, and the London Underground last month.
- Non-human views and DOS attacks: Page access data cannot distinguish between human and automated attackers. The most dramatic example occurred on March 9, 2010, when the Jyllands-Posten Muhammad cartoons controversy article saw 5.3 million views in a single hour (likely the densest view-hour at any point in Wikipedia's history). Due to the religious controversy/sensitivity surrounding the topic, this is believed to be an attack designed to prevent others from viewing the page and its associated imagery. Ironically, the Denial of Service article also appears to be a frequent target. Often, it can be hard to distinguish between malicious attacks, accidental misconfiguration (e.g. bot testing), and undiscovered catalysts of human traffic. In compiling the WP:5000/Top25Report, some discretion is applied to attempt to remove odd anomalies. For example, Cat anatomy has been a popular article in raw page views for a few months (and not only on Caturdays), after previously being much less popular.
- Second screen effect: Though not nearly on the scale of the above spikes, we find that television programs and their content are reflected in page view data. This can be as broad as spikes on the Big Bang Theory article when the program airs on popular networks, but is even seen in small traffic bumps when a quiz show like Jeopardy! or Who Wants to be a Millionaire? asks about a particular topic. This phenomenon has recently been more thoroughly investigated on the German Wikipedia.[1]
- Slashdot effect: When extremely popular aggregation sites like Slashdot or Reddit prominently link to Wikipedia, traffic follows. Internally, Wikipedia's Main page can have much the same effect.
- Temporal patterns: The Christmas article is popular in December, Easter peaks around that holiday, and Christianity-related articles tend to see unusual amounts of Sunday traffic. This is just the start of patterns which are reflected diurnally, annually, and at other pre-determined intervals.
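The Views/sec column in Tab. 1 is simply the hourly count averaged over the 3,600 seconds of that hour. A minimal sketch of the conversion follows; the dictionary and function names are illustrative, and only a few rows of Tab. 1 are copied in. Because the source data is hourly, this yields an average rate for the peak hour, so momentary rates within that hour may well have been higher.

```python
# Hourly peak counts copied from Tab. 1; names below are illustrative only.
PEAK_VIEWS_PER_HOUR = {
    "Whitney Houston": 1532302,
    "Amy Winehouse": 1359091,
    "Steve Jobs": 1063665,
    "Jodie Foster": 451270,
}

SECONDS_PER_HOUR = 3600


def views_per_second(views_per_hour: int) -> float:
    """Average per-second rate over one hour, rounded to one decimal place."""
    return round(views_per_hour / SECONDS_PER_HOUR, 1)


for article, hourly in PEAK_VIEWS_PER_HOUR.items():
    print(f"{article}: {views_per_second(hourly)} views/sec")
```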
Rank | Article |
---|---|
1 | Wiki |
2 | |
3 | United States |
4 | YouTube |
5 | |
6 | Justin Bieber |
7 | Glee (TV series) |
8 | Sex |
9 | Wikipedia |
10 | Lady Gaga |
11 | Eminem |
12 | How I Met Your Mother |
13 | United Kingdom |
14 | The Big Bang Theory |
15 | India |
16 | World War II |
Meanwhile, reasons for long-term popularity are somewhat more intuitive. Tab. 2 shows the most popular articles over the last ~3 years. In addition to the broad underlying cultural and academic interests of Wikipedia's audience, we encourage the reader to consider:
- English Wikipedia's readership is not representative of English speaking populations. Previous studies have shown that Wikipedia's readership tends to be somewhat young, male, and educated—and their interests are likely to vary accordingly. Anecdotal evidence suggests significant traffic is driven by primary/secondary/university students in academic contexts, and we find that related topics are frequent vandal targets as well[2] (e.g., classic English literature, trigonometry concepts, etc.).
- Notice that Google, YouTube, and Facebook are all consistently popular articles. We speculate this is due in part to people accidentally typing these site names/URLs into a Wikipedia search box (either in the MediaWiki interface or a web browser) when intending to visit the sites themselves; this is related to, but not a case of, typosquatting.
Applications and use-cases of the data
For anti-vandalism/damage
The impetus behind storing these statistics was to better understand damage response on Wikipedia (the dissertation topic of author User:West.andrew.g). By storing statistics for every article at the finest granularity possible (hourly), it becomes possible to accurately estimate the number of readers who saw any particular article version. While practical writings have often focused on the time to revert of damaging edits, we argue that the quantity of persons who view it is the more relevant metric. Vandalism that survives for days on an obscure article is effectively harmless if no one visits that article.
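One way such an exposure estimate could be computed from hourly counts is sketched below. This is not the authors' code: the function name and data layout are hypothetical, and partial hours are prorated on the assumption that views are spread evenly within each hour, which real traffic will only approximate.

```python
from datetime import datetime, timedelta


def estimate_exposure(hourly_views, start, end):
    """Estimate views of a revision that was live from `start` to `end`.

    `hourly_views` maps the start of each hour (a datetime) to that hour's
    view count. Partial hours are prorated, which assumes views are spread
    evenly within an hour; that is a simplification, not a measured fact.
    """
    if end <= start:
        return 0.0
    total = 0.0
    hour = start.replace(minute=0, second=0, microsecond=0)
    while hour < end:
        next_hour = hour + timedelta(hours=1)
        # Seconds of this hour during which the revision was visible.
        overlap = (min(end, next_hour) - max(start, hour)).total_seconds()
        total += hourly_views.get(hour, 0) * overlap / 3600
        hour = next_hour
    return total


# Made-up counts: a damaging edit live from 10:30 to 11:15.
views = {datetime(2013, 1, 1, 10): 3600, datetime(2013, 1, 1, 11): 7200}
exposure = estimate_exposure(
    views, datetime(2013, 1, 1, 10, 30), datetime(2013, 1, 1, 11, 15)
)
```

Under this scheme an edit that survives for days on a page with no recorded views contributes an exposure of zero, which is the point made above about obscure articles.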
Fig. 1 plots the CDF of both the lifespan and view count of about 500,000 recent damaging edits. As the graph shows, at median just 1 person will be exposed to a damaging edit. Such an impressive figure is a testament to the automated (e.g. ClueBot NG) and semi-automated (e.g., Huggle and STiki) mechanisms that have recently been brought to bear on the task. While these tools produce probabilistic measures of damage, only STiki will soon integrate an article's popularity into its prioritization schema.
Fig. 1 also shows that ~10% of damaging edits are viewed by 100+ persons. Deeper analysis shows that many of the associated survival times are quite short, and these are often the result of damage to extremely popular articles. With the human latency already quite minimal (and a certain amount of latency being inherent), new solutions are needed. Consider that spammers could opportunistically target very popular pages to exploit these brief windows of opportunity. [3] Dynamically and autonomously moving articles in and out of "page protection" or "pending changes" based on their traffic patterns is another possible use-case for this data. As Fig. 2 demonstrates, the power-law distribution of views over articles would suggest relatively few articles need to be protected to have significant impact.
Spam and vandalism are surface-level issues. Recent analysis of deleted revisions on English Wikipedia showed copyright violations, being much harder to detect in casual patrolling work, to have significant lifespans and end-user exposures. [4] This finding has motivated research into autonomous means of copyright violation discovery (see WP:Turnitin).
Improving article quality
Article popularity can also be a measure for deciding which articles to improve, a concept already familiar to WikiProjects that keep tabs on the popularity of articles within their project (e.g., Wikipedia:WikiProject Songs has a watchlist for the 1,500 most popular song articles). At the aggregate level, the distribution of page views follows a "power law distribution". Fig. 2 represents one month's views on Wikipedia graphed against a Zipf distribution (a distribution where the most frequent item will occur approximately twice as often as the next item, three times as often as the third item, and so forth).
The top 25 most viewed pages represent 4% of all total views, and the top 5000 represent 19% of all views. Though the distribution has an extremely long tail, the top 5000 data provides an opportunity to locate popular but poorly written articles that need attention, as opposed to randomly selecting one of the 4.15 million remaining articles on the project. That is not to say that articles deep in the long tail are less important, but for editors interested in prioritizing article improvement based on popularity and effect on public perception, the WP:5000 data is an important tool.
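The share captured by the top pages, and the number of pages needed to reach a given share, can be computed from any list of per-page view counts. The sketch below uses hypothetical function names and also generates an idealized Zipf curve purely as a reference for comparison. The 4% and 19% figures above come from the measured data and should not be expected to fall out of the idealized curve.

```python
def top_share(view_counts, k):
    """Fraction of all views captured by the k most-viewed pages."""
    ranked = sorted(view_counts, reverse=True)
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


def zipf_counts(n_pages, top_views, exponent=1.0):
    """Idealized Zipf curve: the page at rank r gets top_views / r**exponent."""
    return [top_views / rank ** exponent for rank in range(1, n_pages + 1)]


def pages_needed(view_counts, target_share):
    """Smallest number of top pages whose views reach target_share of the total."""
    ranked = sorted(view_counts, reverse=True)
    total = sum(ranked)
    running = 0
    for pages_so_far, views in enumerate(ranked, start=1):
        running += views
        if running >= target_share * total:
            return pages_so_far
    return len(ranked)


# Toy data only; real input would be one month of per-article totals.
toy_counts = [4, 3, 2, 1]
print(top_share(toy_counts, 2))       # share held by the two busiest pages
print(pages_needed(toy_counts, 0.5))  # pages needed to cover half of all views
```

The same `pages_needed` calculation is one way to gauge how few articles would have to be watched or protected to cover a large fraction of reader traffic.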
Insights into popular culture
These statistics also provide an opportunity to study what is popular in contemporary culture. Before the growth of the Internet, the primary quantitative measures of contemporary popularity included bestselling book and music charts, box office sales, and television and radio ratings. The digital age now gathers vast quantities of data on consumption not previously available, but some observations from the past still hold true. The fact that Justin Bieber was the sixth most popular article from 2010–12, far ahead of more critically appreciated talent, is consistent with what James D. Hart (author of The Oxford Companion to American Literature) observed in 1950 in writing about the most popular books of the mid-19th century:
“ | If a student of taste wants to know the thoughts and feelings of the majority who lived during Franklin Pierce's administration [1853–57], he will find more positive value in Maria Cummins' The Lamplighter or T.S. Arthur's Ten Nights in a Bar-Room than he will in Thoreau's Walden [the former being far more popular] – all books published in 1854.... Usually the book that is popular pleases the reader because it is shaped by the same forces that mold his non-reading hours, so that its dispositions and convictions, its language and subject, re-create the sense of the present, to die away as soon as that present becomes the past.[5] | ” |
Thus, in the same way, page view statistics permit us to consider that Justin Bieber and One Direction—as maligned as they may be critically—are more popular and likely influential on culture, than say, Kendrick Lamar, chosen by Pitchfork Media as releasing the best album of 2012.[6]
Data details and alternative perspectives
All the statistics in this article were produced by aggregating raw data made available by the WMF. This data contains hourly hit data on a per article basis for all WMF language/project combinations. Since Jan. 1, 2010 User:West.andrew.g has been parsing these files nightly and storing the English Wikipedia (article namespace) portions to a database hosted at the University of Pennsylvania. This is a non-trivial undertaking, consuming 1TB+ yearly. In addition to being the basis for several academic results [3][4] (and motivated by earlier third-party work[7]), he has more recently begun publishing the aforementioned weekly reports of the top 5000 articles, made available monthly reports for 2012, and released the source code behind these computations.
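As a rough illustration of this kind of aggregation, the sketch below sums hourly lines into per-title totals and returns the most-viewed titles. It is not the released source code: the function names are hypothetical, and the four-field line layout (project code, page title, view count, byte count, separated by whitespace) is an assumption about the raw files that should be checked against the actual data.

```python
from collections import Counter


def parse_line(line):
    """Parse one 'project title views bytes' line; return None if malformed.

    The four-field layout is an assumption about the raw files.
    """
    parts = line.split()
    if len(parts) != 4 or not parts[2].isdigit():
        return None
    project, title, views, _bytes = parts
    return project, title, int(views)


def top_pages(hourly_lines, project="en", n=5000):
    """Sum views per title for one project and return the n most-viewed titles."""
    totals = Counter()
    for line in hourly_lines:
        parsed = parse_line(line)
        if parsed and parsed[0] == project:
            totals[parsed[1]] += parsed[2]
    return totals.most_common(n)


# Made-up sample lines spanning two hours and two projects.
sample = [
    "en Main_Page 10 100",
    "en Sex 5 50",
    "de Main_Page 7 70",
    "en Main_Page 3 30",
    "malformed line",
]
print(top_pages(sample, n=2))
```

Note that this sketch does no filtering of non-article pages, redlinks, or suspected automated traffic, which is consistent with the raw WP:5000 list containing such entries.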
Others have used the same data for alternative purposes: User:Henrik has developed a tool for looking up the traffic history of specific articles. The Wikitrends site concentrates on dramatic popularity increases/decreases. WMF analyst Erik Zachte produces WikiStats, which provides a higher-level perspective on all WMF projects in numerous statistical dimensions. Mr. Zachte also has a fascinating portfolio of his WMF statistical work. These Wikipedia/WMF-specific resources complement other Internet-scale observations regarding search and popularity; most famously the Google Zeitgeist.
There are some caveats in interpreting this data. First, this is a raw presentation of traffic and popularity. It is known that English Wikipedia traffic has generally been increasing over time (per [1]). This fact, and the growing Internet connectivity that likely underlies it, lends some bias to more recent events. Second, it should be mentioned that logs may have underreported page view data in early 2010.
References
- ^ (17 November 2012). Wikipedia-Zugriffszahlen bestätigen Second-Screen-Trend, martinrycak.de (in German, article investigates how Wikipedia traffic matches German television shows during broadcast times) (English translation)
- ^ West, Andrew G., Sampath Kannan, and Insup Lee. Detecting Wikipedia Vandalism via Spatio-Temporal Analysis of Revision Metadata. In EUROSEC '10: Proceedings of the Third European Workshop on System Security, pp. 22–28. Paris, France. April 2010. – (@ACM) – (Author's version available for download)
- ^ a b West, Andrew G., Jian Chang, Krishna Venkatasubramanian, Oleg Sokolsky, and Insup Lee. Link Spamming Wikipedia for Profit. In CEAS '11: Proceedings of the Eighth Annual Collaboration, Electronic Messaging, Anti-Abuse, and Spam Conference, pp. 152–161, Perth, Australia. September 2011. – (@ACM) – (Author's version available for download)
- ^ a b West, Andrew G. and Insup Lee. What Wikipedia Deletes: Characterizing Dangerous Collaborative Content. In WikiSym '11: Proceedings of the Seventh International Symposium on Wikis and Open Collaboration, pp. 25–28, Mountain View, CA, USA. October 2011. – (@ACM) – (Author's version available for download)
- ^ Hart, James D. The Popular Book: A History of America's Literary Taste (1950), p. 281
- ^ (20 December 2012). The Top 50 Albums of 2012, Pitchfork
- ^ Priedhorsky, Reid, Jilin Chen, Shyong (Tony) K. Lam, Katherine Panciera, Loren Terveen, and John Riedl. Creating, Destroying, and Restoring Value in Wikipedia. In GROUP '07: Proceedings of the International ACM Conference on Supporting Group Work, pp. 259–268, Sanibel Island, Florida, USA. November 2007. – (@ACM)
Article Feedback tool faces community resistance
Article feedback, at least through talk pages, has been a part of Wikipedia since its inception in 2001. The use of these pages by new editors, though, has typically been limited at best.
As part of the Wikimedia Foundation's (WMF) Public Policy Initiative, a specialized form of article feedback was developed and added to a selected set of English Wikipedia articles in September 2010 (see Signpost coverage). Over the next several months, the tool went through several iterations until version four was deployed to every English Wikipedia article in July 2011. This iteration allowed readers to rate articles from one to five stars in four categories: trustworthiness, objectivity, completeness, and how well it was written.
In December 2011, the WMF began the transition to adding version five (also known as AFTv5) to 10% of the articles, aiming for a full deployment in the first quarter of this year. Version five added an area where readers could give written feedback on articles. The feedback is directed to a centralized page, with various options, including seeing feedback from specific articles or only articles on your watchlist. Editors have the ability to feature a post, mark the concern as resolved, hide the post or request oversight, up- or down-vote a post, or flag the comment as abuse.
A request for comment (RfC) on the project was opened by MZMcBride in mid-January 2013. His impetus, as described in a Signpost op-ed in August 2012, was that the extension was "deployed without anti-abuse mechanisms", leading to the feedback area becoming a "safe haven for spam and other useless noise." In addition, he told the Signpost:
“ | I've been seeing a dichotomy between tools that help editors reduce backlogs and tools that create new backlogs. For tools that create new backlogs, I think there must be a clear demonstration from the community (or whoever is expected to work on these backlogs) that they're interested in creating these piles of work. In the case of AFTv5, there hasn't been such a demonstration, as far as I'm aware. Compare to [AFTv5 to] tools such as Page Curation that help editors reduce the backlog of unpatrolled new pages. In that case, the tool is helping, not hurting, and there's sufficient community consensus that we want new articles. | ” |
In his view the feedback tool should be used only on an opt-in basis, where editors who are interested in the article (for example, someone who wrote the piece and wishes to solicit feedback on how to improve it further) will actually respond to the feedback. He believes the new backlogs are a burden, and that deploying the tool to the entire site would make the issue worse. This has a strong basis in fact: according to the WMF, readers submitted an average of 4100 posts per day, of which fewer than 10% were moderated. These figures have fallen since that blog post. When scaled to the full site, the WMF expected that over 900,000 posts per month, or over 30,000 a day, would come in via the feedback tool, a monthly figure greater than all of the feedback received so far put together.
A more radical viewpoint was put forward by GregJackP, who simply stated that "the tool is useless" and that the community "should eliminate the feature." He expanded on his views to the Signpost, saying that he believes most feedback is blank, "a general statement of dissatisfaction or satisfaction, or just garbage/spam." In GregJackP's assessment, the value of the positive feedback is far outweighed by the time it takes editors to sift through the "garbage", and the positive feedback is typically "just a question that you have to research and determine if it [is] something that can even be found."
Contrary to this view, editors like Mike Cline believe that the tool is becoming a major source of data for how the public views Wikipedia. Tom Morris eloquently stated:
“ | Shutting down the article feedback tool rather than improving it is a bad strategy. We do need better tools for churning through AFT5 responses and patrolling them. We need something like Huggle or STiki to do basic triage on the feedback we get, to remove libel and the "OMG I LOVE JUSTIN BIEBER" type things. The rest, though, those are telling us about potentially fixable issues with Wikipedia. If a reader, in good faith, wishes to give us feedback about an article, we should listen. We might set the feedback to one side because we aren't the sort of editors who can necessarily do anything about it. But if we stop listening to readers who have information that can improve the article, what's the damn point? | ” |
Oliver Keyes, the community liaison for the article feedback program, acknowledged the low level of moderation to the Signpost: "Are there sufficient resources to moderate and respond to all of the feedback? The honest answer is 'probably not'." However, he then related the issue to Wikipedia as a whole: "I don't see this as a problem: we're a wiki. Always have been, always will be. Edits will need oversighting or deleting, bad edits will slip through the cracks, and we accept that because it's necessary to produce the good things that an open system gives us. I see no reason not to take the same attitude with feedback."
Keyes told the Signpost that between 30 and 60 percent of all feedback was rated by editors as 'useful', a finding backed up by the fourth quarter report from the article feedback team, which reported that 40% of a random sampling in February through April was found to be helpful by at least two editors. In addition, he says that the WMF communicated its goals for the program through 17 different office hours on IRC (held at different times to target different regions of the world), mailing lists, and the village pump, in addition to the project talk page and a regular newsletter. The latter two alone reached at least 220 people, and probably more, far more than any typical Wikipedia discussion.
Still, the current request for comment has a large majority in favor of GregJackP's comment, more than double the second-most supported view (MZMcBride's) at the time of writing. The RfC will remain open until February 21.
In brief
- Wikimania scholarships: Applications for scholarships to Wikimania 2013 in Hong Kong are now being accepted. Both full and partial scholarships are available—covering airfare, lodging, and registration; and up to half of the estimated airfare, respectively. Applicants will be rated on their Wikimedia activity (both on- and off-wiki), their open-source activity more broadly, their interest in both Wikimania and the Wikimedia movement, and their grasp of English. Applications will be accepted until 23:59 UTC on 22 February.
- Chapters association: The Signpost reported last week on the problems with the proposed name of the planned association ("Wikimedia Chapters Association"), since the use of the name Wikimedia was inconsistent with the Wikimedia Foundation's trademark policy. On February 5, the WMF's Board of Trustees published a letter setting out its position towards the organization. It states, in part, that "Our reservations about the Chapters Association are serious, and we have difficulty envisioning circumstances in which the Wikimedia Foundation would be able to recognize it."
- Ann Arbor edit-a-thon: The newest Wikipedian-in-Residence, Michael Barera (see the Signpost's coverage last week), along with the Gerald R. Ford Presidential Library and the Michigan Wikipedians, will be hosting an edit-a-thon at the presidential library, with the goals of assisting new editors and creating or improving Wikipedia's coverage of Gerald Ford, the 38th President of the United States.
- Guided tours: As announced on the Wikimedia Blog, the Editor Engagement Experiments team has built and launched a new guided tour system for new users.
- Individual Engagement Grants: Applications for IEGs, the new WMF grant scheme, are due by February 15 and can be reviewed on Meta.
- Ombudsman Commission: The appointments to the Ombudsman Commission, the body dealing with WMF privacy policy complaints, have been announced. Three editors (FloNight, Sir48, and Thogo) will return to the commission, while four editors (Deskana, Erzbischof, Huji, and Levg) will be joining the commission for the first time.
- Steward election: The annual election of stewards, who have complete access on all WMF wikis to deal with transproject vandalism, among other matters, will open for voting on February 8.
- English Wikipedia
- Administrator proposals: The Signpost welcomes the newest administrator, Jason Quinn, who passed with 138 in support to 29 opposed. Three requests for adminship remain open, all with over 90% support as of publishing time.
- Adminship reform: The second round of the 2013 request for comments on the request for adminship process has started.
- Star Trek: The rather contentious debate over the capitalization of Star Trek (I/i)nto Darkness has ended in favor of a capital "I". See this week's "In the media."
Land of the Midnight Sun – WikiProject Norway
This week, we took a trip to WikiProject Norway. Started in February 2005, WikiProject Norway has become the home for almost 34,000 articles about the world's best place to live, including 16 Featured Articles, 19 Featured Lists, and nearly 250 Good Articles. The project works on a to do list, maintains a categorization system, watches article alerts, and serves as a discussion forum. Interested editors should sign their name to the project's membership list and add the project's page to their watchlist. We interviewed Mentoz86, Hordaland, and Arsenikk.
What motivated you to join WikiProject Norway? Do you live in Norway? Have you contributed to any of the project's Good or Featured Articles?
- Mentoz86: I am Norwegian, and have used the English Wikipedia as a "primary point of reference" for a while. I started to edit Wikipedia in English when I found out that articles about Norwegian football were not as good as I wanted them to be as a reader, and I joined WikiProject Norway to help out on updating and expanding articles about Norwegian football.
- Hordaland: I've lived in Norway for decades, though I'm originally from the USA.
- Arsenikk: I have lived in Norway for most of my life. I contribute mostly about Norwegian topics largely because I have good access to sources, for instance library books, making it logistically easier. I have written 156 GAs and 18 FLs within the project scope.
Do you speak Norwegian? Have you contributed to either of the Norwegian Wikipedias? Why do two versions of Wikipedia exist for the Norwegian language?
- Hordaland: Yes, I speak, read, write and teach Norwegian. Yes, I've contributed to the Norwegian Wikipedias, tho not for a long time. Why there are two written versions of the Norwegian language is a long story. In the 1500s or so, Norwegian like other languages at the time was developing a written language. But then Norway was dominated by Denmark for a few hundred years, and the Danish language took over as the written language. Starting in the first half of the 1800s, people started wanting a Norwegian language, naturally enough. Bokmål is norwegianized Danish. Nynorsk is based upon the dialects of Norway, which differ quite a bit from Danish. However, any Norwegian can read any Norwegian text as well as Danish without too much trouble. (Spoken Danish is, however, another story.)
- Arsenikk: I am a native speaker of both English and Norwegian. My contributions to the Norwegian Wikipedia are sporadic; my largest work was translating an article I had written in English to Norwegian during the SOPA blackout and getting it to GA in both languages.
Are there any significant gaps in the coverage of Norway on the English Wikipedia? Are some regions or time periods better represented than others? What can be done to fill the gaps?
- Arsenikk: I have not noticed a significant regional bias in coverage. I often write about locations elsewhere to where I live, have lived and come from. Regarding periods, I often find that the Norse and Viking period is well-covered, as is World War II. There is of course a recent bias from ca. 2000 till present. An area I find has been overseen—and which I recently have tried to counter somewhat—is the period from the 1950s to the 1990s. This is an exciting period in which the nation first had to rebuild after the war, started off as one of the poorest countries in Europe, to find oil and go through an economic boom and bust in the 1980s. To counteract the bias requires access to written sources and an awareness of what areas need work. The skewed topical scope is created largely because there are a limited number of active contributors and they all prefer writing about their own favorite topics.
The project has several Featured and Good Topics related to public transportation in Oslo. Was there a concerted effort to get these articles to their current status? Are some topics easier for the average Wikipedian to contribute to than others?
- Mentoz86: I believe this is the effort of User:Arsenikk's fantastic work. He has written a whole lot of GA's in his field of interest, so most of the projects GA's are about public transportation, football stadiums, airports.
- Arsenikk: I wrote most of the Oslo transport articles, despite never having lived there. Rail transport is my favorite topic and writing about the numerous metro and tram lines in Oslo is easy because it is well documented in books and periodicals. I believe easy topics are those which one has an interest in oneself. Running for a good or featured topic can make it easier to get that one last article in place.
Does WikiProject Norway collaborate with the projects of any neighboring countries? Are there any regional projects that could become a space for collaborative work on Scandinavian topics?
- Hordaland: I don't know, but it ought to be possible. I've worked on a big translation project where the Danish group and the Norwegian group cooperated very well. On Wikipedia I translated a Swedish article to English, Sleep (non-human), with the help of someone who knows Swedish.
- Arsenikk: Norway spent the era from 1397 to 1905 in unions with Sweden and/or Denmark. The countries share a common Norse heritage and have close cultural and linguistic bonds. Norwegian, Swedish and Danish are mutually intelligible. The potential for common articles is large, although Norwegians would not be dependent on people from other countries to comprehend sources. I have for instance relied largely on a Swedish book to write a subarticle about Scandinavian Airlines, the common flag carrier of the three countries.
What are the project's most urgent needs? How can a new contributor help today?
- Mentoz86: We have a couple of editors who are very good at creating new stubs, and like I mentioned above, some editors who are very good at improving articles to GA status. But we have some outdated articles, and a lot of stubs that can be expanded, and new contributors are welcome to help out with this.
- Arsenikk: There are a number of high-viewership and important articles which are in rather poor condition. Examples include major cities and towns, counties, major companies and institutions. Often it is easier to work with less extensive articles, but with regard to our readers it would be better if we focused more on central issues.
Anything else you'd like to add?
- Arsenikk: Language is a major barrier when writing about a non-English country and it is often impossible to reach beyond the most basic facts if relying on English sources. I would presume I use ninety percent Norwegian sources simply because no equivalent English versions exist. On the other hand this means that the English Wikipedia is the ultimate and most comprehensive guide available to many non-major Norwegian topics. A local resident may simply borrow a book about an important Norwegian topic; for the rest of the world the book is not available or comprehensible. As English is establishing itself as a world language, Wikipedia is helping spread knowledge not only in a free way but also in a linguistically barrier-free way to a global audience. This is one of my main motivations for participating in the project.
Next week, we'll summarize all the most important information in an infobox. Until then, wade through our antiquated sentences and paragraphs in the archive.
Portal people on potent potables and portable potholes
This week, the Signpost's featured content section continues its recap of 2012 by looking at featured portals, a small yet active part of the project. We interviewed FPOC directors Cirt and OhanaUnited.
Cirt
We've had a bunch of portals promoted to Featured Portal quality in 2012 among a diverse group of subjects, including: Animation, Arts, Conservatism, Indonesia, History, Maryland Roads, and New England. So far in 2013, we've promoted portals Bollywood, Cheshire, Massachusetts, and Society. I've personally worked on two of these: Arts and Society, as part of the Main Page Featured Portal drive. This is an effort to improve all portals linked from the top-right navigation of the Main Page to Featured quality. We've only got two more portals left to improve all the way up to Featured status in this quality improvement drive, Geography and Technology; the former is almost there and the latter is coming along nicely. It's been fun helping out with the quality improvement process of portals in the past year. Hopefully it won't take too much longer to complete the Main Page Featured Portal drive, and that will serve as a good model for future contributors to portals.
OhanaUnited
Being a small project is a double-edged sword. On one hand, we rarely (if ever) have to deal with vandals and other wiki-drama. On the other hand, our audience is not as large as that of some article pages, and therefore portals are less frequently maintained even though the articles showcased in the portal may be more up to date. Often there is a lot of inertia from the community, and we have less capital to work with when the total number of votes matters a lot in a discussion. Just over a year ago, there were a number of good ideas presented to increase the visibility of portals. Including Cirt's idea of improving the portals listed on the main page to featured status, most of the ideas presented have received strong support and been implemented. The one idea, championed by myself, would involve changing the standard bullet points on the main page to diagrams that reflect the portal. Even though it has already been implemented on German Wikipedia's home page (where the idea of portals originated), the effort involved in getting this implemented in English Wikipedia would be far too much. At this point, we're aiming for more participants in the featured portals candidate process.
Featured articles
Seven featured articles were promoted this week:
- "The Sixth Extinction II: Amor Fati" (nom) by Gen. Quon. "The Sixth Extinction II: Amor Fati" is a 1999 episode of the American science fiction television series The X-Files. The second part in a trilogy, the episode follows FBI agents Mulder and Scully after the former falls into a coma from contact with an alien spaceship. Inspired by The Last Temptation of Christ, it has been considered the series' best episode.
- Journey (2012 video game) (nom) by PresN. Journey is a PlayStation 3 game developed by Thatgamecompany in which the player controls an unnamed robed figure in a journey to a mountain. For the indie game, which began development in 2009, the developers sought to evoke a sense of smallness and wonder. Journey was a critical success, winning numerous video game awards and a Grammy Award for Best Score Soundtrack for Visual Media.
- Elephant (nom) by LittleJerry. Elephants are large herbivorous mammals found in Africa and Asia. The largest extant land animals, they can be as tall as 4 m (13 ft) and weigh up to 7,000 kg (15,000 lb). Two species are generally recognised: the larger African elephant species is classified as "vulnerable", while the Asian elephant is smaller and "endangered". Their relationship with humans dates back thousands of years.
- Fortress of Mimoyecques (nom) by Prioryman. The Fortress of Mimoyecques was a Second World War military complex built underground by Nazi Germany between 1943 and 1944. Intended to house a battery of V-3 cannons aimed at London, the complex was never completed: the Allies considered it a major threat and began bombing it in late 1943, destroying part of the complex. After the war it was further demolished and served as a mushroom farm. It is now a museum.
- 2005 Qeshm earthquake (nom) by Ceranthor and Mikenorton. The earthquake on Qeshm Island off Southern Iran (November 27, 2005) was the second major earthquake to strike the country that year. Measuring 5.8 on the moment magnitude scale, it was followed by 400 minor aftershocks. It killed 13 people and devastated 13 villages.
- John Le Mesurier (nom) by SchroCat and Cassianto. Le Mesurier (1912–1983) was an English actor best known for his comedic role as Sergeant Arthur Wilson in the sitcom Dad's Army. He became an actor while still a youth and studied at the Fay Compton Studio of Dramatic Art. During his fifty-year career he appeared in numerous theatrical and film roles, often taking smaller roles. A self-described "jobbing actor", Le Mesurier took a relaxed approach to his profession.
- Mycena aurantiomarginata (nom) by Sasata. M. aurantiomarginata is a species of fungus widely distributed, but most common in North America and Europe. The mushrooms, which can measure up to 6 cm (2.4 in) in height, are named after the bright orange gill edges on the underside of their bell-shaped or conical caps. They can be distinguished from other mushrooms in the genus by differences in size, color, and substrate.
Featured lists
Four featured lists were promoted this week:
- List of international cricket five-wicket hauls by Sydney Barnes (nom) by Harrias. English cricketer Sydney Barnes claimed 24 five-wicket hauls during his fourteen-year career. He made most of them towards the end of his career.
- Latin Grammy Award for Best Urban Music Album (nom) by Status and Hahc21. The Latin Grammy Award for Best Urban Music Album has been presented annually since 2001 for "vocal or instrumental merengue house, R&B, reggaeton, rap and/or hip hop music albums containing at least 51 percent playing time of newly recorded material."
- List of international cricket centuries by Allan Border (nom) by Vensatry. Australian cricketer Allan Border scored centuries in 27 Test and 3 ODI matches during his fourteen-year career. His highest score was 205, against New Zealand.
- List of Billboard Social 50 number-one artists (nom) by Status. The Billboard Social 50 is an American popularity chart which ranks musicians based on their activity on social networking sites. Since its inception in December 2010, Justin Bieber has spent the most weeks at number one.
Featured pictures
Eleven featured pictures were promoted this week:
- Map of the Battle of Jutland, 1916 (nom; related article), by Grandiose. The Battle of Jutland, between the Royal Navy of the British Commonwealth and the Imperial German Navy, was the largest naval battle of World War I. It lasted for two days; 14 British and 11 German ships were sunk.
- The Lady of the Camellias poster (nom; related article), created by Alfons Mucha, restored and nominated by Adam Cuerden. The Lady of the Camellias, a novel by Alexandre Dumas, Jr., has been adapted for the stage and film numerous times. This poster is for a late 19th century production starring Sarah Bernhardt.
- Buff-banded Rail (nom; related article), by Toby Hudson. The Buff-banded Rail (Gallirallus philippensis) is a bird found in much of Australasia and the south-west Pacific. It is highly territorial.
- Barrow Offshore Wind Farm (nom; related article), created by Andy Dingley and nominated by Elekhh. The Barrow Offshore Wind Farm, located in the East Irish Sea, consists of 30 turbines which provide a total of 90 megawatts.
- Young Woman Drawing (nom; related article), created by Marie-Denise Villers and nominated by Crisco 1492. Villers (1774–1821) was a French painter best known for her portraits. Young Woman Drawing is her best known work and may be a self-portrait.
- Schloss Johannisburg (nom; related article), created by Rainer Lippert and nominated by Tomer T. Schloss Johannisburg is a castle in Aschaffenburg, Germany, which was constructed in the early 17th century. It is considered a landmark of the city.
- Cupha erymanthis (nom; related article), created by Jkadavoor and nominated by Ceranthor. Cupha erymanthis is a species of butterfly found in forests of tropical South and Southeast Asia. It sometimes drinks liquids from carrion.
- Harlequin beetle (nom; related article), created by Archaeodontosaurus and nominated by Alborzagros. The harlequin beetle (Acrocinus longimanus) is a tropical beetle native to the Americas and most common in Central and South America. It is quite large, measuring up to 76 millimetres (3.0 in) in length.
- Space Shuttle Enterprise (nom; related article), created by NASA and nominated by Hahc21. Enterprise was the first Space Shuttle, designed by the American space agency NASA in the 1970s to perform test flights before Columbia became the first Shuttle in space. Here it is pictured while gliding.
- Malaysian Plover (nom; related article), by JJ Harrison. The Malaysian Plover (Charadrius peronii) is a small bird which is found at beaches and salt flats in Southeast Asia. It is classified as near-threatened.
- White-necked Laughingthrush (nom; related article), by JJ Harrison. The White-necked Laughingthrush (Garrulax strepitans) is a species of bird found in forests in China, Laos, Myanmar, and Thailand.
Featured portals
Four featured portals were promoted this week:
- Society (nom), by Cirt, with 20 SAs, 20 SBs, 20 SPs, 20 DYK sets, 20 selected quotes, 20 selected sounds, an ITN section, and an in this month section. Society relates to interpersonal relations in large social groupings.
- Bollywood (nom), by Bill william compton, with 15 SBs, 16 SAs, 15 SPs, and 7 DYK sets. The informal term Bollywood is popularly used for the Hindi-language film industry based in Mumbai.
- Cheshire (nom), by Espresso Addict, with 32 SAs, 23 SBs, 34 SPs, 14 FLs, 45 sets of DYK hooks, 16 selected quotes, 20 selected sounds, and an in this month section. Cheshire is a ceremonial county in the North West of England.
- Massachusetts (nom), by Sven Manguard, with 20 SAs, 20 SBs, 11 SPs, 10 sets of DYK hooks, 15 selected locations, and an in this month section. Massachusetts is a state of the United States.
Star Trek Into Pedantry
Star Trek Into Darkness capitalization controversy ridiculed
External image: The xkcd cartoon on the Star Trek Into Darkness dispute
“When it comes to world class pedantry, few groups can challenge the prowess of Wikipedians and Star Trek fans. So when the two come together it's little surprise they create a swirling maelstrom of anal retention from which no common sense can escape.”[1]
The comic xkcd drew attention to one of Wikipedia's bitter title debates on 30 January 2013 with this cartoon. The topic: the capitalisation or non-capitalisation of the word "into" in the title of the upcoming Star Trek film, Star Trek Into Darkness. The question had generated tens of thousands of words of discussion on the article's talk page, as well as various subpages.
Given that Wikipedia's Manual of Style directs that prepositions with four letters or fewer should not normally be capitalised in titles, the discussion hinged on whether "Star Trek into darkness" should be understood as a single phrase, like "the journey into space", or whether the word "into" marked the beginning of a subtitle whose first word should be capitalised. Another factor was that the film's makers and most media reports capitalised the word "Into" in the film's title. A minority view also advocated that it should be seen as a subtitle (like other Star Trek movies) and therefore needed a colon, i.e. "Star Trek: Into Darkness".
In his Daily Dot article, titled "Wikipedians wage war over a capital 'I' in a 'Star Trek' film", Morris summarized the entire affair in the quote above, and cited User:Frungi to give his readers a brief summary of arguments in favour of an upper-case or lower-case spelling, saying he did not want his readers to experience in "excruciating detail the main arguments from both sides. They are exhaustive and pedantic to such an extent that 'pedantic' no longer seems a suitable adjective."
Frungi's summary, compiled on 11 January, read:
- Arguments for the lowercase I
  - “Into Darkness” may not be a subtitle, and “Star Trek into Darkness” may have been intended to be read as a sentence.
  - Assuming it’s not a subtitle, the MOS dictates a lowercase preposition.
  - Treating “into Darkness” as a subtitle without punctuation would be original research.
  - Allowing it to be interpreted as a subtitle would play into the studio's marketing.
  - The creator said that the title would not have a subtitle with a colon.
- Arguments for the uppercase I
  - “Into Darkness” may be a subtitle, in line with the precedence of every Star Trek movie title longer than two words.
  - Assuming it is a subtitle, the MOS dictates the first word be capitalized.
  - Treating “Into Darkness” as part of a sentence would be original research.
  - Capitalizing the possible subtitle would allow it to be interpreted either way.
  - Every official, and the vast majority of secondary, sources capitalize it, and Wikipedia should follow this real-world use.
  - The sentence “Star trek into darkness” makes no grammatical sense.
  - The creator said that the title would have a subtitle rather than a number, and that the subtitle would not have a colon.
Morris chose to use a capital I throughout his article, saying he agreed with the passionate sentiments of an anonymous vandal who told Wikipedians to read the official website.
The Independent weighed in on the controversy a day later, on 31 January 2013 ("Trekkies take on Wikis in a grammatical tizzy over Star Trek Into Darkness"), asking its resident grammarian Guy Keleny to adjudicate.
Keleny acknowledged the ambiguity introduced by the missing colon, which allowed an interpretation of the title along the lines of "This is the story of the Star Trek into Darkness", but concluded:
“There’s only one thing to do. Follow the preference of the film-makers. It is their title, after all. They call it Star Trek Into Darkness – so that is what it is. In the same way, for instance, everybody accepts that the singer is called k d lang. Her typographical peculiarity may be pretentious and irritating, but her name belongs to her.”
Science fiction news site Blastr took much the same view. The title of the Wikipedia entry was changed from lower-case to upper-case spelling on 31 January.
What if the Wikipedia "revolution" was actually a reversion?
On 30 January 2013, Rebecca J. Rosen, senior associate editor of The Atlantic, reported on a paper by Jeff Loveland and Joseph Reagle which argues that rather than being a break with the past, Wikipedia and Wikipedians are actually part of a long tradition of "obsessive compilers" that created "not just encyclopedias, but dictionaries, medical texts, histories, and even object collections, such as herbaria". Loveland and Reagle note a commonality between the methods used to build Wikipedia and various "encyclopedias of old":
“... the process of building upon existing work bit by bit, or what the authors call "stigmergic accumulation." Now, that's a mouthful, but it's also a great metaphor. "Stigmergy," they write, describes "how wasps and termites collectively build complex structures by adding to the product of previous work rather than by communicating directly among themselves."”
Piracy and other types of "borrowing" in such endeavours were common. Ephraim Chambers' 1728 Cyclopaedia "borrowed heavily from the Dictionnaire de Trevoux," and in turn was reprinted in full by Scottish "pirates". Chambers himself confessed that the Cyclopaedia contained "little ... new, and of my own growth."
“Men like Chambers were always a bit author, a bit compiler, a bit borrower, a bit editor. If Wikipedia complicates the notion of "authorship," it's not as though that notion were ever simple to begin with.
Even Wikipedia's open ideology has antecedents during this period. Diderot announced that people were free to reuse the art from his Encyclopedie – "a stance," the authors note, "probably meant to justify his and his colleagues' appropriation of illustrations from the 'Description des arts et metiers.'"”
Wikipedia's collaborative approach, too, is really a function of the size of the task, and has its precedents in previous projects of comparable magnitude. Something very much like a crowdsourcing approach was used to compile the Oxford English Dictionary, for example: thousands of people contributed to the effort, sending in slips of paper noting words in their context. Diderot's and d'Alembert's encyclopedia had over 140 different contributors.
Jeff Loveland, a historian of encyclopedias, had previously reviewed Reagle's book Good Faith Collaboration: The Culture of Wikipedia and criticized it for having "one major weakness, namely in historical contextualization" (see report in the 28 November 2011 issue of the Signpost). The ensuing discussions between Loveland and Reagle led to this collaboration.
In brief
- Wild East: A 28 January 2013 press release, hosted for example on the Wall Street Journal "Market Watch" website and the website of the Sacramento Bee, warns that "Russian business conflicts spill over into Wikipedia". The press release says the story originally appeared in Wild East, "a business conflicts blog published by Russia! Magazine". It essentially reports on the actions of a single editor in the Russian Wikipedia who appears to be editing in favour of one side in a high-profile dispute involving a Russian cement company.
- Wikipedia "may be collapsing under its own weight": In late January 2013, the Minnesota Daily and Popular Science were among publications picking up on a 23 January University of Minnesota press release announcing publication of the Wikipedia study by Halfaker, Geiger, Morgan and Riedl, "The Rise and Decline of an Open Collaboration Community". The study was published at the end of December in American Behavioral Scientist. The content of the study has been publicly available for a while, and was reviewed in the 24 September 2012 issue of the Signpost.
- Interview with Sue Gardner: Kai Ryssdal interviewed Wikimedia executive director Sue Gardner for American Public Media on 31 January 2013. The conversation touched on similarities between public radio and Wikipedia, the success of the most recent Wikimedia fundraiser, the lack of minority representation in Wikipedia, the gender gap, the unlikelihood of the site's ever featuring advertising, and Wikipedia's accuracy.
- Wikipedia reaches 3 billion monthly mobile views amid concerns about contributor content: An article in the International Business Times on 2 February 2013 reported that January 2013 was the first month in which Wikipedia had more than three billion mobile page views. "14.5 percent of Wikipedia page views now are to the mobile site, up from 9.9 percent a year ago," Amit Kapoor, Wikimedia’s senior manager of mobile partnerships, explained. "Mobile page views rose over 75 percent in 2012, while desktop traffic grew at just under 20 percent. It is clear that much of Wikipedia’s growth is happening on mobile." The article contrasted the growth of Wikipedia's readership with the ongoing erosion of Wikipedia's editor base, and its lack of diversity.
- Wikipedia aims for billion users with mobile spread: In a related story, multiple news outlets are reporting that Wikipedia hopes to double its reach to around one billion users on the back of mobile phone expansion in the developing world. But according to Wikimedia press spokesman Jay Walsh, mobile phones as a tool to access Wikipedia have a downside—"the constraints of the mobile phone as a tool for editing remain a big hurdle."
Wikidata team targets English Wikipedia deployment
English Wikipedia to get centralised interwiki links on 11 February
Following the deployment of the Wikidata client to the Hungarian Wikipedia last month, the client was also deployed to the Italian and Hebrew Wikipedias on Wednesday. The next target for the client, which automatically provides phase 1 functionality (surfacing interwikis stored on the central wikidata.org repository), is the English Wikipedia, with a deployment date of 11 February already set. Barring any unforeseen problems, all other Wikipedias will get the client by the end of the month (non-Wikipedia projects not being the focus of phase 1).
Perhaps more importantly, the much more adventurous "data repository" phase of the project remains firmly on course to be completed (and deployed) before the original project completion date of 31 March despite the significant delays to phase 1. With that deployment, users will "be able to create a property 'child'... [and] add a statement to the item for Marie Curie using this property to say that she is the mother of Irène Joliot-Curie and Ève Curie. ... [In addition,] you can support all of these statements by adding references to them." Communities will be left to decide whether and how they wish to use these statements onwiki, but the expectation is that they will be used to turn wikidata.org into what amounts to a central repository for infoboxes.
Some preliminary work from phase 2 went live on Wikidata.org on Monday (example; accompanying blog post). As of time of writing, the eventual fate of the planned third phase (dynamic lists) remains more uncertain.
In brief
Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for several weeks.
- Pick of the blogs: Wikimedia technology bloggers (both WMF and community) had a busy week. Highlights include a commendably thorough discussion of the professionalisation of the WMF Operations Team over the past two years and how it fits in relation to the data centre migration (Wikimedia blog; follow-up about ongoing monitoring and management) and a commentary by board member SJ Klein on the trouble into which ArticleFeedback version 5 has run (personal blog; see also this week's "News and Notes" for full coverage). There was also a useful roundup of the challenges and opportunities posed by the increasing number of visitors accessing Wikipedia through their phones on the Wikimedia blog following the news that the mobile site recorded 3 billion page impressions for the first time last month.