Wikipedia:WikiProject Infoboxes/2013 RfC

Scope

Problem

Quality editors are adding and removing infoboxes across a number of articles including, but not limited to, those on classical music composers. At times this reaches the point of edit warring, which can lead to discontent, a lack of harmony, and even blocks and discretionary sanctions. A secondary problem may be disagreement over community consensus vs. local consensus (an agreement developed at a project level).

Proposed solution

To have a discussion to determine the best way forward, hopefully just an informal agreement and consensus, but if a formalized WP:RFC needs to be presented to the global community, then so be it. I'm not a big fan of instruction creep myself, and would rather not get into even more policy setting that reduces any flexibility, but the constant bickering isn't an option either. IF a "formal" RfC with "options" absolutely needs to be created, then I suppose that can be worked out here and we'll draft one.

Viewpoints Discussion

Pro-infobox

(placeholder: may expand on this later) I like to see an infobox as a quick reference guide to a topic when I don't have the time or desire to read the entire article. — Ched : ? 16:25, 5 April 2013 (UTC)
Infoboxes are a way, but not the only nor necessarily the best way, to provide metadata.
Expectations: The norm in Wikipedia articles is to provide a very brief summary of key items in an infobox in the top right of an article. For the casual reader, an infobox has the same relationship to a well-written lead as that lead has to the rest of the article: if a lead provides a 2-minute summary of the article, then an infobox provides a 20-second overview of the lead. Redundancy is necessarily built in to an infobox, just as it is in the lead. --RexxS (talk) 22:05, 5 April 2013 (UTC)
Recognisable element: An Infobox provides a consistent framework element for re-users like Google to automatically extract information - see Intelligence in Wikipedia. In short, Google uses infoboxes' label-data pairs to dramatically improve the accuracy of its natural language reading algorithms when extracting information from Wikipedia. An infobox is an "intelligent structure" for them. --RexxS (talk) 22:05, 5 April 2013 (UTC)
Metadata: It also marks up many items with standard classes that can be recognised by others who scan our articles to collect information in microformats such as vCard. --RexxS (talk) 22:05, 5 April 2013 (UTC)
Standardisation: Infoboxes help to create standardised structures across a diverse set of articles, which helps fulfil the reader's expectations as well as provide both microformats and intelligent elements, even though those working on those articles often have no ability to assess the value of those.
They're just plain attractive. (sorry - couldn't resist. Obviously this is a completely subjective opinion that's not going to be resolved here) — Ched : ? 00:15, 6 April 2013 (UTC)
Infoboxes are a way to standardize the appearance of Wikipedia articles, providing basic information at a glance to the casual reader, allowing for rapid comparison of basic information between articles in the same general subject area (for example, gems and minerals) and yes, when properly done, adding to the attractiveness of the article layout. Montanabw^(talk) 16:48, 9 April 2013 (UTC)
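The microformat point above can be illustrated with a rough sketch. It assumes an infobox rendered as an HTML table carrying hCard-style class names (`vcard`, `fn`, `role`); the sample markup, the `HCardParser` class and the `extract_hcard` helper are hypothetical and written for illustration only, and the sketch ignores void elements such as `<br>` and nested properties.

```python
from html.parser import HTMLParser

# Hypothetical rendered infobox; the markup is an assumption for illustration.
SAMPLE_HTML = """
<table class="infobox vcard">
<tr><th>Name</th><td class="fn">Brian Cox</td></tr>
<tr><th>Occupation</th><td class="role">physicist, rock musician, TV personality</td></tr>
</table>
"""


class HCardParser(HTMLParser):
    """Collect the text of elements whose class names a known hCard property."""

    PROPERTIES = {"fn", "bday", "role"}

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._stack = []  # one entry per open element: property name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        prop = next((c for c in classes if c in self.PROPERTIES), None)
        self._stack.append(prop)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Attribute the text to the innermost enclosing property, if any.
        prop = next((p for p in reversed(self._stack) if p), None)
        text = data.strip()
        if prop and text:
            self.fields[prop] = self.fields.get(prop, "") + text


def extract_hcard(html):
    """Return a {property: text} dict for the hCard properties found in html."""
    parser = HCardParser()
    parser.feed(html)
    return parser.fields


if __name__ == "__main__":
    print(extract_hcard(SAMPLE_HTML))
```

A re-user scanning pages this way does not need to understand the prose at all; it only needs the class names to be applied consistently.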

Con-infobox

If an article is just a stub, an infobox can be overwhelming or a distraction as far as formatting. Even if an article is further developed, an infobox can disrupt the formatting of a page.
Can make the article difficult to view on small devices such as phones and tablets.

However, I personally think that as Wikipedia is primarily a computer presentation, then the solution here is the development of various apps to provide some sort of screen reader rather than to remove function from the full platform. — Ched : ? 16:25, 5 April 2013 (UTC)

Infoboxes oversimplify information and/or mislead readers
Infoboxes focus on quantifiable details rather than the most significant facts about a subject
There is an inherent tension between the desire for a short-and-quick reference (=short box) and the desire for more metadata (=long box)
They're just plain ugly
Infoboxes present a barrier to editing for newbies
They are redundant and inferior to a well-written lead, which is also meant to be a quick overview of the topic
They lack flexibility
There are, of course, many reasons why either an infobox or some of its contents may not be appropriate in a particular article, but each needs to be examined on an individual basis: sometimes the précis will oversimplify and be misleading; sometimes the amount of information in the infobox overwhelms a short article; but the case needs to be made. --RexxS (talk) 22:05, 5 April 2013 (UTC)
Infoboxes tend to spawn narrower versions of more generic infoboxes that increase the task of maintenance and provide difficulties for intelligent extraction of data by Google, et al. --RexxS (talk) 22:05, 5 April 2013 (UTC)
Infoboxes prioritize the creation of a database over the actual mission of the Wikipedia project: creating an encyclopedia that anyone can edit
Infoboxes attempt to impose a single structure across a diverse set of articles, some of which are not suited to that type of structure, against the judgement of those working on those articles with knowledge of the topic. This is contrary to both the principles of our project and common sense.
Conflict over boxes (both their general use and the specific implementation where they are used) is destructive
Where a box is used, the lead image can only be increased in size by increasing the size of the box, further disrupting formatting and wasting space

Infoboxes involve repeating information that is already in the article, and may be repeated in the lead. This causes extra work for editors and leads to contradictory articles when people, especially newbies, correct or update the text but leave the coded stuff alone. Ϣere SpielChequers 09:41, 8 April 2013 (UTC)
Infoboxes require a lot of code on the page, and this changes Wikipedia editing from something any internet user feels comfortable with to something that only programmers are comfortable with. Ϣere SpielChequers 09:41, 8 April 2013 (UTC)

Comment

The same reasons for an infobox exist in every article; while the reasons against will vary and often do not exist. It is true that the weight of argument will be against an infobox in many cases, but the onus is on the person wanting to remove an infobox to make that case. --RexxS (talk) 22:05, 5 April 2013 (UTC)

That appears to be a question Ched wants this discussion to answer; your assertion certainly isn't an accepted principle as it stands. Nikkimaria (talk) 22:09, 5 April 2013 (UTC)

Then feel free to raise the counter-argument: Explain why the same reasons for an infobox don't exist in every article (expectations, microformats, intelligent structures - what article doesn't benefit from these?); Tell us what reasons against will exist in every article. --RexxS (talk) 23:49, 5 April 2013 (UTC)

The reasons for an infobox certainly do not "exist in every article". It's not hard to give examples of situations where infoboxes wouldn't be appropriate; infoboxes were intended to be used when an article formed part of a series and an infobox presupposes that the subject can be summarized into infobox format, which isn't always the case. The example I generally use is Charles Domery, where neither the dates nor places of birth and death are known ("Benche" is given in sources as the birthplace but there's no town of that name and he almost certainly made it up to hide his real identity in the event of recapture), and where a standard infobox-style summary of his undistinguished military service would give a seriously misleading picture of why Dickens et al considered him noteworthy. The same goes for pretty much any biography of someone with significant accomplishments in more than one field (see the doomed efforts on Brian Cox to cram his triple careers as pop singer, TV presenter and particle physicist into a single infobox, for instance), for a building which has been substantially rebuilt and thus doesn't have a defining style or a single construction date, a song that's been recorded on multiple occasions (see the stack of four infoboxes on Hallelujah or Long Tall Sally—the effect is particularly impressive on a mobile phone) and so on and so on... "It's needed for the metadata" is a very weak argument for demanding something be included on all articles even at the cost of misleading readers. (Mabbett and co generally set up a straw man that those opposed to infoboxes on particular articles want to get rid of them completely, but I've never seen anyone seriously argue this; the current situation in which they should only be used where appropriate, and the onus is on someone adding/removing one to demonstrate why it is/isn't appropriate, seems to work fine.) – iridescent 01:01, 6 April 2013 (UTC)
My comment was actually in response to your second sentence, as we've previously discussed, but Iri also makes a strong argument. Nikkimaria (talk) 02:46, 6 April 2013 (UTC)

Yet the three reasons I give (expectations, microformats, intelligent structures) still seem to be present in all the articles you point to, Iri. Google would like to know that Brian Cox is a physicist, rock musician and TV personality by reading it in our article. The algorithms they use to extract that sort of information from the article are rather weak; but they are made much stronger when reading label-data pairs like "occupation: physicist, rock musician, TV personality" that they can find in infoboxes because it emphatically confirms what they think they read in the text. (By "they", I mean the AI system used to extract data from our articles - Google regards us as the largest source of data on the web). I want Google to be able to make associations for Charles Domery like "nationality: Polish"; "occupation: soldier"; "years active: c. 1778 – after 1800". Their reading algorithms can gather that from an infobox-less article, but they would be much more certain of getting it right if they find the same data in an infobox. So I return to my assertion: having an infobox will always have at least those three advantages, and I don't agree these are weak advantages. Now what is important is to see that those reasons are not overriding - indeed in the cases you cite (and many others), an infobox would be difficult to construct, and quite probably would yield misleading data. I am completely amenable to the argument that certain articles are particularly unsuitable for infoboxes - so much so that the argument against outweighs the three reasons I have proposed, and then we are surely better off without the infobox; Google and other re-users will have to make do with what information they can glean from the article. And if Giano tells me that he spent hours fixing the image sizes and placement in Buckingham Palace, so that a sprawling infobox would destroy that aesthetic, I'm not going to argue - his judgement in those issues makes a compelling argument for me. 
Yet I maintain that if Buck House had an infobox it would meet expectations for a very brief summary; it would create microformats; it would provide strong assistance to Google's data-mining tools. Those reasons would still be present, but stronger reasons tip the balance against having an infobox. What I don't accept is that there exists a whole class of article where the reasons against having an infobox automatically prevail. For sure, there are groups of articles where many of them will require too much nuance in a particular parameter for an infobox to be a good choice, but each article needs to be examined on its merits. What I want to see is that examination taking place in a civilised atmosphere: no more reversions without explanation; no more uninformed, unconstructive commentary like "infoboxes are useless" (a demonstrably untrue assertion). I want to see give-and-take on both sides of the debate with the intention of finding a mutually acceptable agreement - or at the very least an acceptance that each side has a valid point-of-view that should be engaged with respect. --RexxS (talk) 22:04, 6 April 2013 (UTC)

General question: Why does the Classical Music group (at a project level), and it seems especially at the "Composers" level, feel so strongly anti-box? — Ched : ? 03:12, 6 April 2013 (UTC)

To answer your question, the Classical Music and Composers projects feel that it is "counter-productive" to use infoboxes in articles without first discussing it on the talk page. To quote the Composers WikiProject's stance on biographical infoboxes, "We think it is normally best, therefore, to avoid infoboxes altogether for classical musicians, and we prefer to add an infobox to an article only following consensus for that inclusion on the article's talk page. Particular care should be taken with Featured Articles as these have been carefully crafted according to clear consensus on their talkpages. (See the Request for Comment about composers' infoboxes and earlier infobox debates.)" There were numerous discussions about these matters at these projects. The infobox debates date back to 2007. The following diffs point to a set of discussions on the use of infoboxes from some of the WikiProjects in question: [1], [2], [3], [4], [5] (scroll down), [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27]. Lord Sjones23 (talk - contributions) 04:01, 6 April 2013 (UTC)

I've read multiple infobox discussions and most seem to be circular in the sense that one group (Wikipedia:WikiProject Composers) claim WP:IDONTLIKEIT and others claim WP:ILIKEIT, and it goes on endlessly. I'm hoping we can avoid that to some extent here. My question, however, is WHY the "I don't like it". What is the why? If it's simply because a group of people got together and decided that for their own little group, I'm sorry - that is against policy (per WP:LOCALCONSENSUS). Tell me WHY you/they don't like it. — Ched : ? 04:19, 6 April 2013 (UTC)

I don't really think that infoboxes should be used in those particular articles because not only is it counterproductive, but it could make the articles hard for readers to understand. Lord Sjones23 (talk - contributions) 17:15, 9 April 2013 (UTC)

Question 2: @Iridescent. You say that the metadata is not that important; and as far as a human reader goes I would agree. However, one of the things that helped Wiki projects back in the early to mid 2000s was the interlinking of articles to each other. At that time Google factored in the "What links here" quite prominently in their search algorithms, which is one factor in the rise of popularity. Now, if <li> tags and such are feeding metadata to search engines, doesn't it benefit us to stay ahead of the curve and keep up with how the data is being used to rank relevance when a person does use search engines? — Ched : ? 03:23, 6 April 2013 (UTC)

That's a fundamental (albeit widely shared) misunderstanding of how Google works. The old PageRank system had nothing to do with metadata but was based on Markov chains and scored the probability that someone randomly clicking links will land on the page, and hence pages with a lot of "true" incoming links (that is, from elsewhere on the web) score highly while walled-gardens with a lot of incoming links but which ultimately circle back on themselves are penalized. Google now operates on the Google Panda system, in which PageRank score is far less important and human input ("Did you find this site useful?" questions, and using Chrome and the Google toolbar to track repeat-visits) is of primary importance; again, metadata has nothing to do with it. Wikipedia has a top score on the Panda system, and thus even pages with few incoming links score near the top of Google searches. Whether or not a page contains metadata has no bearing at all on how Google (or Bing, or Yahoo...) ranks it; this is snake-oil sold by SEO companies. (If you really want to push a page up the ranks, get it mentioned on high-trust-high-repeat-readership sites like the New York Times or BBC News.) If anything, a page emitting a lot of metadata in comparison to its size is likely to trigger the Penguin automated filter which blocks attempts to manipulate search rankings, and make the page less visible to search engines. – iridescent 18:53, 6 April 2013 (UTC)
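For readers unfamiliar with the random-clicker idea described above, here is a minimal sketch of the textbook PageRank power iteration. It is not Google's production system: the damping factor of 0.85, the iteration count and the three-page link graph are illustrative assumptions, and the `pagerank` function name is hypothetical.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Estimate the chance that a random link-clicker is on each page.

    links maps a page name to the list of pages it links to.
    With probability `damping` the surfer follows a link on the current
    page; otherwise the surfer jumps to a page chosen uniformly at random.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives the random-jump share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Toy graph: "a" and "b" both link to "c"; "c" links back to "a".
    toy_links = {"a": ["c"], "b": ["c"], "c": ["a"]}
    for page, score in sorted(pagerank(toy_links).items()):
        print(page, round(score, 3))
```

Note that nothing in the calculation looks at the content or markup of a page; only the link structure enters, which is the point being made about metadata.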

Hehe, I can remember a time when it was essential for a website to fill in the metadata in the header (keywords and description) if it wanted one of the numerous search engines to notice it, but those days are long gone. Google now dominates searches and, as Iri says, it has moved on a very long way from those primitive algorithms. Nobody on Wikipedia should be thinking of trying to artificially influence page ranking for our articles. In fact, just making the content as comprehensive and clearly expressed as we can is exactly what Google rates highly: we don't have to do anything different from what we would do anyway. The metadata that pro-infoboxers might talk about is not what used to be the sine qua non of search engine optimisation in the old days. Metadata really means "data about data" and isn't really what we're talking about. What we're trying to do is provide machine-readable data as unambiguously as possible so that re-users can extract it easily and accurately. One of the ways is by marking up microformats as we do in infoboxes and persondata, etc. It also happens that the very inflexibility that is complained of in infoboxes is actually advantageous to machine-reading when the infobox uses a common label like "date of birth" because it can extract the date following with near 100% certainty that it represents a date-of-birth (that's not to be confused with whether the value is accurate, but that problem is not confined to infoboxes). The more infoboxes that contain a given label, the more certainty a data-mining algorithm can give to its accompanying data. Incidentally, that is part of the argument for using the same generic infobox wherever possible, rather than spawning specific sub-classes of box. From that point of view {{Infobox person}} is to be preferred to {{Infobox artist}} for example. Hope that helps, --RexxS (talk) 22:34, 6 April 2013 (UTC)
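To make the label-data idea concrete, here is a rough sketch of how a re-user might pull pairs out of infobox wikitext and count how often each label recurs. The parameter names (`nationality`, `occupation`), the sample wikitext and the helper names `parse_infobox` and `label_counts` are assumptions for illustration; a real parser would also have to cope with nested templates and piped links, which this one does not.

```python
import re
from collections import Counter

# Hypothetical wikitext; the parameter names are illustrative assumptions.
SAMPLE_WIKITEXT = """{{Infobox person
| name = Charles Domery
| nationality = Polish
| occupation = soldier
}}"""


def parse_infobox(wikitext):
    """Return {label: value} for a simple infobox with no nested templates."""
    match = re.search(
        r"\{\{\s*Infobox[^|}]*\|(.*)\}\}", wikitext, re.DOTALL | re.IGNORECASE
    )
    if not match:
        return {}
    pairs = {}
    for field in match.group(1).split("|"):
        if "=" in field:
            label, value = field.split("=", 1)
            pairs[label.strip()] = value.strip()
    return pairs


def label_counts(infoboxes):
    """Count how many parsed infoboxes use each label."""
    counts = Counter()
    for pairs in infoboxes:
        counts.update(pairs.keys())
    return counts


if __name__ == "__main__":
    parsed = parse_infobox(SAMPLE_WIKITEXT)
    print(parsed)
    print(label_counts([parsed, {"name": "Brian Cox", "occupation": "physicist"}]))
```

The counting step hints at why a shared generic template helps: a label that appears in the same form across many boxes gives an extraction algorithm more examples to learn from than a label that exists only in one specialised sub-class of box.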

Possible items to be addressed

the infobox in general - good or bad (should an infobox be considered default or status-quo?)
infoboxes at the Composers and/or Classical music projects
local consensus vs. community consensus
input of major contributor

is the term "infobox" correct, or is the term ??? ...

Possible solutions

1RR on infobox addition/removal: where an infobox has been established in the article and its removal is reverted, it stays in the article pending new consensus on talk; where an infobox is newly added to the article and its addition is reverted, it stays out pending consensus on talk
Talk-first policy regarding infoboxes: no infobox may be added or removed until consensus is reached on talk
Defer to author's choice regarding inclusion of infobox
Establish a set of articles where infobox is default and another where no box is default (allowing deviation either way by per-article consensus)
Infobox inclusion or exclusion is determined on per-article basis (which is basically the current policy)
Better enforce current policies and guidelines regarding infoboxes, particularly the "purpose" section of MOS:INFOBOX
Where box is particularly long or article is particularly short, encourage partial collapse
Develop an alternative means of providing structured data that does not involve a visible infobox
Develop an alternative means of providing structured data that does not involve an in-article template (this might eventually be fulfilled by Wikidata?)
Impose a moratorium on addition/removal of boxes when article is or is about to be on the Main Page
Use metadata-emitting templates at the bottom of the article

Discussion

From Ched

How do you define "established": a fixed length of time that the article has existed? A certain number of editors having edited it?
How does that work? It goes back to how you define "established".
Original author? Most prolific author? Many articles were started by editors that are no longer here.
OK .. I can see that. Do we use category or project or something else? How do we resolve the WP:LOCAL issues in policy?
agree
MOS is a guideline, or rather a set of guidelines - not policy. I know it's nit-picking, but I think it's a valid point.
agree
That may resolve the metadata questions, but it doesn't solve the "what the viewer sees" issues.
same as above - and it seems to be a solution that isn't available yet.
agree — Ched : ? 03:41, 6 April 2013 (UTC)