Wikipedia talk:No original research/Archive 58

This is an archive of past discussions on Wikipedia:No original research. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Content in templates

I have started a discussion of how to provide citations for content that is contained in templates at WT:CITE#Content in templates. I believe this is relevant to this policy because I have found templates frequently contain uncited content that is not at all obvious and could be original research. Jc3s5h (talk) 14:17, 5 September 2012 (UTC)

Propose change to sentence regarding primary sources

I propose to change one sentence about the use of primary sources as follows.

Existing text: A primary source may only be used on Wikipedia to make straightforward, descriptive statements of facts that can be verified by any educated person with access to the source but without further, specialized knowledge.

Proposed replacement: A primary source may only be used on Wikipedia to support straightforward, descriptive statements that any educated person, who has access to the source and understands the language and notation of the source, will be able to verify are supported by the source without the need for interpretation or additional specialized knowledge. (Modified, see below)

Justification. Technical articles in Wikipedia, such as articles on mathematics and physics, cannot be accurate and up-to-date unless they are able to cite research papers from the professional literature. Indeed, such papers are routinely cited to the great benefit of Wikipedia. However, since research papers are primary sources, and a typical "educated person" has no chance of understanding them enough to verify that they support a claim, this practice is a technical breach of the rule. The same could perhaps be said of sources written in an unusual language; we shouldn't demand that they be verifiable by someone who can't read the language. Of course we can fall back on WP:COMMON, but it isn't healthy when a large number of articles (I'm talking thousands) owe their excellence to the fact that they ignore the letter of a policy. My understanding of the intention of the rule is that what gets into a Wikipedia article must come directly from the source without interpretation, not that it can be understood by an "educated person" no matter what subjects they are educated in. Hence my attempted rewrite.

A similar proposal was briefly discussed last year. Zero^talk 13:11, 8 August 2012 (UTC)

From my experience in cold fusion, this opens the doors to lots of obscure research papers that only a handful of people understand. I predict that a lot of physics professors will start adding their favourite fringe theories using primary sources that only they understand. If you try to remove it, you will be told that you don't have enough specialized knowledge to make a decision about the source. It also encourages experts to place more weight on obscure primary sources, when they should be using secondary sources. --Enric Naval (talk) 15:52, 8 August 2012 (UTC)

There is a potential danger, however favoured fringe theories usually don't get published in reputable/prestigious journals, i. e. in most cases no acceptable "primary" sources will be available anyway (despite the cold fusion case).--Kmhkmh (talk) 16:31, 8 August 2012 (UTC)

The bigger danger to me is that this opens the door to the use of editorial opinions rather than factual statements from primary sources. After all, an opinion can also be a "straightforward descriptive statement". I don't think that's a good idea. —David Eppstein (talk) 18:46, 8 August 2012 (UTC)

It's my understanding that peer-reviewed journal articles are considered secondary sources. Notes and manuscripts aimed at developing such articles are primary sources. This may be a slightly controversial point in itself, but if this understanding is used, then there is no need for the proposed change. --Trovatore (talk) 19:03, 8 August 2012 (UTC)

They may be primary or secondary depending on the topic they're used in conjunction with; it depends on context. They don't always get classified as one or the other. --MASEM (t) 19:07, 8 August 2012 (UTC)

Imho that is an unsolved problem within WP, where the notion of primary and secondary changes from field to field and editor to editor. I don't think there is any fix to it other than various projects/portals coming up with guidelines determining what is considered primary and secondary in their particular domain.--Kmhkmh (talk) 21:55, 8 August 2012 (UTC)

Whether peer-review papers are primary or secondary sources strongly depends on the type of claim that is being supported by the source. For original results presented in the paper, the paper is obviously a primary source. This of course, is especially true for the interpretation of those results.

However, peer-review papers will often also discuss the results from others. For such statements the peer-review paper can often be used as a secondary source.T R 13:25, 9 August 2012 (UTC)

It seems to me that this is about subject-specific literacies. For example, I have no understanding at all of FORTRAN. Therefore, if I were to examine an article like GROMOS, there's absolutely no way I could verify or check its contents in any meaningful way. I would have to rely on (and trust) editors familiar with FORTRAN to keep GROMOS free of original research. But, I read German fluently, so if I examine an article like Antje Boetius, I can check every cited fact in the article and know that it's well-sourced and free of original research, but someone else who doesn't read German wouldn't have that capacity because none of the sources are in English.
I think that one of the things Wikipedia can do well is keep up to date with the bleeding edge of research. Our coverage of, for example, extrasolar planets, or dinosaurs, is very up-to-date because editors are prepared to bend the rules when the research is good and reputable but also brand new. (I've personally used primary sources in HIP 56948 for exactly this reason.)
What I think we need to do is document this practice, which is actually already quite widespread, but to make it clear that it only applies where there's a subject-specific literacy that at least a few other editors share—so you could do it with foreign-language sources or formal logical notation or whatever—but only with new research that's yet to appear in secondary sources where there's an expectation that secondary sources will get around to covering it eventually, and (importantly) that it's policy-compliant for good faith editors with good faith concerns to remove such content until a secondary source does publish it.—S Marshall T/C 19:28, 8 August 2012 (UTC)

I would prefer a slight rewording - replace "language and notation" with "subject matter". The reason is that the word "language" has many different meanings, one of which is the "English language" (which I do not think is what the original author meant). Another area of concern is that in many cases, a primary source is the authoritative text because it is the primary source - this is particularly the case with standards and legislation including ISO standards, the Laws of Cricket and so on. Martinvl (talk) 19:49, 8 August 2012 (UTC)

I think there should be some kind of sliding scale of how much knowledge we should expect from a reader or editor who seeks to verify that a primary source "make[s] straightforward, descriptive statements of facts" and that the statements in the Wikipedia article express the same facts as the primary source. However, I don't think the sliding scale should be based on how difficult it is to understand the source; it should be based on how hard it is to understand the Wikipedia article. If an article is advanced and considerable knowledge is needed to have any hope of understanding the article, we can expect the reader or editor to apply that same knowledge to understanding the sources. But if it is an easy article, and a purported fact is put in from a highly advanced source, most readers and editors of that article will be unable to verify the purported fact. Jc3s5h (talk) 23:18, 8 August 2012 (UTC)

An editor who is able to examine an advanced source and summarize it in a Wikipedia article in a way that anyone can understand is an excellent editor. We should encourage it. So I don't agree. Zero^talk 01:23, 9 August 2012 (UTC)

Zero's comment is not without merit. My concern is when someone sticks a statement in a relatively easy article, and justifies it with a citation to a highly advanced source that seems only slightly related to the Wikipedia article. A hypothetical example: the eNotary article mentions a new Virginia law that allows that state's notaries to notarize over the web. Suppose someone comes along and claims the law is unworkable, and cites a number theory journal that can only be understood by people at least working on their PhD in number theory, if not already a professor. Now, number theory is related to public key cryptography, which could be related to electronic notarization, but how can a normal mortal tell the difference between a genius and a crank in this sort of situation? Jc3s5h (talk) 04:11, 9 August 2012 (UTC)

The comment of David Eppstein (who does excellent work in citing mathematical research papers) makes me realise that it was an error to remove the word "fact". It wasn't my intention to open the door to opinions, but only to normalize existing good practice. So:

Proposed replacement: A primary source may only be used on Wikipedia to support straightforward, descriptive statements of fact that any educated person, who has access to the source and understands the language and notation of the source, will be able to verify are supported by the source without the need for interpretation or additional specialized knowledge. Zero^talk 01:00, 9 August 2012 (UTC)

Jc3s5h correctly identifies the words "language and notation" as the part of the sentence which needs the most thought. I don't think that "subject matter" is right though, as it can be interpreted as specific to the particular source and fringe theorists will leverage it. I think the source should be verifiable by anyone who has sufficient training in the field of study to which the source belongs, not just by someone claiming expertise on that particular source. Zero^talk 01:23, 9 August 2012 (UTC)

Your expression here is totally munged, too many clauses, the subject and object are lost. Fifelfoo (talk) 01:35, 9 August 2012 (UTC)

It reads as a perfectly well-structured sentence to me, but you are welcome to suggest an alternative. Zero^talk

"A primary source may only be used on Wikipedia to support straightforward, descriptive statements of fact. These statements of fact would have to be able to be verified by any educated person with access to the source, and who understands the language and notation of the source. Such statements of fact need to be verified as supported by the source without the need for interpretation or additional specialized knowledge." Don't mung three sentences into one when you're using multiple subjects and multiple objects. Fifelfoo (talk) 04:36, 9 August 2012 (UTC)

Ok, but two sentences ought to be enough:
"A primary source may only be used on Wikipedia to support straightforward, descriptive statements of fact that are provided by the source without the need for interpretation or additional specialized knowledge. These statements of fact must be verifiable by any educated person with access to the source, and who understands the language and notation of the source." Zero^talk 05:49, 9 August 2012 (UTC)

I think that "without the need for interpretation" is an unworkable requirement. It's hard to read anything in English (or most other languages) with absolutely no interpretation. Jowa fan (talk) 08:03, 9 August 2012 (UTC)

Do you see what that clause is getting at? If not, we'll explain, but if so did you have clearer wording in mind?—S Marshall T/C 08:16, 9 August 2012 (UTC)

Comment - the existing text is far more concise and is clearer than the replacement. I understand what is being got at but frankly the original is better.
Furthermore there is an original research issue with wikipedia being the first to include information on primary sources/novel ideas/foreign-language primary research papers. Foreign language secondary research is fine and is covered by WP:NOENG and that should be sufficient re the foreign language concerns. But if en-WP is the first with something we're moving from relying on secondary sources to being a secondary source--Cailil ^talk 13:40, 9 August 2012 (UTC)

Concern Both the current and the proposed statements are very vague and open to interpretation. Let me try to explain by means of an example. Compare the following two statements (related to the recent high-profile announcement about the possible discovery of the Higgs boson):

"On July 4 2012, the CMS and ATLAS experiments announced that they had discovered a new particle"
"The CMS and ATLAS experiments have discovered a new particle"

Both these statements can be described as "straightforward, descriptive statements of fact that are provided by the source without the need for interpretation or additional specialized knowledge". However, the first statement can quite safely be sourced by simply linking to the CMS and ATLAS press releases (which clearly are primary sources). In fact, those would be the preferred sources for this statement. The second statement, on the other hand, is much stronger in that it also asserts that CMS and ATLAS are correct and have not made a mistake. This statement really needs a secondary source as a reference. I think we need a sharper wording of the policy to distinguish between these two situations. T R 13:45, 9 August 2012 (UTC)

Actually, the current wording captures this perfectly. The primary source (i.e. the press release) is the "announcement" - and the fact that this announcement factually exists can easily be checked by looking at the press release.

The second sentence does indeed require a secondary source, as based on the press release by CMS and ATLAS we can only factually confirm that they CLAIM they have found evidence, but nothing more. Arnoutf (talk) 14:18, 9 August 2012 (UTC)

Apparently, it does not capture that perfectly, because that is not how people are interpreting it.T R 14:46, 9 August 2012 (UTC)

Personally I think none of these policies works without giving some leeway where it is appropriate. I doubt we can come up with a formulation that will cover every conceivable appropriate case and so that it can always be applied in a strict literal sense.--Kmhkmh (talk) 15:01, 9 August 2012 (UTC)

Who published these two statements? Why did the second statement not include the word "claim"? Was it because:

They believe that CMS and ATLAS have actually discovered the Higgs boson and that CMS and ATLAS are being over-cautious.
They have additional information that CMS and ATLAS did not take into account.
They are sloppy journalists.

All too often it is the last of these - in such cases the primary source is the better source. Martinvl (talk) 15:09, 9 August 2012 (UTC)

The two statements are hypothetical statements that could be included in Wikipedia. Based on the press release both statements are true. (And I don't think anybody actually doubts that they have actually found a new particle, although it is not 100% certain that it is the Higgs.) My point was that WP policy should prevent editors from including the second statement based purely on a primary source; they should instead be required to use a reliable secondary source (e.g. a PDG report). (Obviously, items in the popular news outlets do not qualify as reliable sources in this context, and should never be used for the subject. That, however, is a different policy, WP:RS.) T R 15:53, 9 August 2012 (UTC)

This example is off-topic since it does not capture the problem I am trying to solve. It can be perfectly well handled by the policy as it stands already. The situation I am considering is like this: before long the CMS and ATLAS teams will publish their findings in a physics journal. Someone who is able to understand the paper will summarize parts of it at Higgs boson. This will be a clear breach of the policy that primary sources have to be verifiable by "any educated person". The paper will for sure not be verifiable by "any educated person" because it will be written in a technical jargon that only physicists understand. Alas, press reports will not go to the depth that Higgs boson aspires to and will inevitably contain errors, while reliable secondary sources like advanced textbooks will not appear for years. The best thing is to allow editors who can understand the jargon to read the paper and report it on Wikipedia. That's what they already do, to our immense gain. All I'm asking is that we make it legal. Zero^talk 02:30, 10 August 2012 (UTC)

The example is certainly on-topic since it relates to possible detrimental side-effects of the change you propose. As an editor of technical articles that understands and has access to many primary sources, I agree that some additional leeway should be given to editors to translate technical statements from primary sources to common English. However, we need to be careful that the wording does not encourage editors to source claims with primary sources. I think, if we can be clearer about the type of statement that could be sourced from a primary source, then we can also be more relaxed about the requirement that anybody should be able to verify the statement.

I think the core of the issue I am raising is in the use of the term "fact". A primary source often will present many "facts". However, statements of these "facts" on Wikipedia should typically not be sourced from a primary source. Statements that the source reports these facts, on the other hand, can be. I suggest something along the following lines.

proposal: A primary source may only be used on Wikipedia to make straightforward, descriptive statements about the source that can be verified by any person with access to the source and who understands the jargon in which it is written.

(Emphasis given to indicate the core of the change.)

That makes it much clearer what type of statement could be sourced from a primary source. Consequently, we can be less restrictive on the people that should be able to verify that the description is accurate. There can probably be made some improvements though.T R 07:48, 10 August 2012 (UTC)

off-topic:Note that the PDG already has a review out [1] covering the CMS and ATLAS discoveries. So, no, the appearance of reliable secondary sources will not take "years".T R 07:59, 10 August 2012 (UTC)

What this illustrates is that the primary/secondary division doesn't work all that well in the case of research papers, since I can't see why the inability of most people to verify that something is in this review is a different problem from that inability in the case of a paper by the Higgs discoverers. Alternatively, that research papers satisfying some high standard of WP:RS should not be treated by rules that were designed for primary sources like intelligence reports and interviews. Simply excluding them from the "primary sources" rules would be another plausible approach to this problem. Zero^talk 09:20, 10 August 2012 (UTC)

This is getting more off-topic, but I disagree. The division between primary/secondary sources works very well for research papers, but has nothing to do with how easily they can be verified by a general reader. Primary research papers presenting novel results will often exaggerate the impact of their results while missing some nuances to their statement. Typically, these will be pointed out by other research papers, and the entire body of work will get summarized in review papers. In practice, Wikipedia should not take a primary research paper's word for it when it claims something. T R 11:33, 10 August 2012 (UTC)

"About the source" won't be understood the way you intend. The content of a technical paper is about some topic (e.g., about Higgs bosons), not about the paper itself. I know that's not what you mean, but lots of people won't know. Zero^talk 09:20, 10 August 2012 (UTC)

I am not sure that I understand what you mean (or that you understood what I meant). I indeed mean that a technical paper about some topic should not be used as a source to reference statements about that topic (e.g. the primary articles about the Higgs boson should not be used as sources for statements about the Higgs boson). They can, however, be used to reference descriptive statements about the source (e.g. "CMS claims to have found a new boson" or "ATLAS used method X to analyse their data"). T R 11:33, 10 August 2012 (UTC)

Did you know that most newspaper articles are primary sources? If we adopted your approach, most of our political articles and many BLPs would be reduced to a series of sentences like "The Washington Post published a news article about the President's speech". WhatamIdoing (talk) 04:35, 22 August 2012 (UTC)

No they wouldn't.T R 06:46, 22 August 2012 (UTC)

Sure it would: "CMS claims to have found a new boson", "The Washington Post claims the US President gave a speech last night"... There's no difference there. WhatamIdoing (talk) 00:55, 30 August 2012 (UTC)

Oppose Understanding language and notation of the source (the jargon) means, in many cases, subject expertise. WP is edited by non-experts, and we have no way to check expertise even of those editors who claim some. 14:56, 13 September 2012 (UTC)

Importance of tertiary sources

Why doesn't wikipedia say somewhere clearly and definitively in its guidelines that it relies on secondary sources for its details, but on tertiary sources (other encyclopedias, specialist dictionaries (such as Oxford Dictionary of Sociology), and widely used textbooks) for the emphasis and balance within an article?

I have been looking at the article Caste (see this version), where some editors had managed to create a long-winded essay on the phenomenon of caste (or more correctly "caste like behavior") as practiced across the world. They have thereby relegated India, especially Hindu India, which happens to be the classic and most frequently cited ethnographic example, to just another subsection. I have noticed such defensive "universalism" displayed in other articles on the social ills of India, which too emphasize that these ills are found world-wide (and India is just one example). Fowler&fowler «Talk» 12:36, 4 September 2012 (UTC)

I think part of the problem is that the term "tertiary source" is simply too broad, as it covers academic textbooks and specialized reference works on one hand (which are among our most common (desired) sources) but on the other hand general audience (sometimes "low quality") reference works as well (such as Encarta, Britannica, etc., which we generally try to avoid as sources).--Kmhkmh (talk) 12:59, 4 September 2012 (UTC)

Well, does Wikipedia say somewhere clearly that academic textbooks and specialized reference works are our most common or desired sources for assessing emphasis and balance in a topic? A link would be very helpful. Fowler&fowler «Talk» 13:45, 4 September 2012 (UTC)

This strikes me as a good idea. I think that WP:TERTIARY is really the place that it should be. Right now the language tends to emphasize that such sources (think: the much-maligned paper encyclopedias!) can be inaccurate or overly simplistic, and that's valid too, but adding a sentence or so, relating specifically to assessing due weight, strikes me as desirable. --Tryptofish (talk) 22:49, 4 September 2012 (UTC)

Yeah, nah. It is too field specific. Some fields, ie scholarly fields, provide their own meta analysis of their conduct and conduct in specific topical areas. Review article and historiography for example. Scholarly tertiaries can be of some use, depending on the nature of the article and whether it conducts a field review or not. Fifelfoo (talk) 22:58, 4 September 2012 (UTC)

The problem with a topic like caste is that except during the 1950s and 60s, when there was a school in the US writing about caste (or caste-like) societies across the world, there are no long review articles in journals on the abstract topic of caste. There are articles in specialist dictionaries and encyclopedias, some of which, such as Veena Das's article Caste in the International Encyclopedia of the Social and Behavioral Sciences (2008) can be about 3 or 4 pages long, but no long review articles, because most anthropologists and sociologists today write about Caste in specific contexts, especially caste in India (and in societies affected by the Indian caste system, such as those in Bali or Fiji or Pakistan) and the review articles tend to be about these specific contexts. Even Veena Das's article, though titled "Caste," is entirely about caste in India. Fowler&fowler «Talk» 23:43, 4 September 2012 (UTC)

Seems to me then we should have an article on the "Indian caste system" since that is how secondary sources treat it. One can add the tertiary source argument to show there is no WP:UNDUE issue with that. Churn and change (talk) 15:06, 13 September 2012 (UTC)

WP:SOURCES says something similar, but not specifically about DUE. WhatamIdoing (talk) 00:37, 5 September 2012 (UTC)

Yes, I think DUE is really what we might want to focus on, with respect to something to add here. I accept that tertiary sources vary in reliability for specific information from field to field, but I think we can say that they are sometimes useful specifically for evaluating due weight. --Tryptofish (talk) 00:45, 6 September 2012 (UTC)

This discussion has gotten a little quiet. I just made an edit that I think/hope reflects what we have discussed here. --Tryptofish (talk) 23:47, 10 September 2012 (UTC)

Thank you very much. This is very helpful. I'm delighted. Fowler&fowler «Talk» 02:18, 11 September 2012 (UTC)

Journal articles = primary sources for technical topics, right?

Since I have nagged various contributors on the topic, I want to restate my understanding that a journal article is ordinarily a primary source. If my understanding is correct, then I recommend that we sharpen (Wikipedia:No original research#Using sources) the definition of journal articles as being primary sources (and hence less useful here). --Smokefoot (talk) 17:57, 5 September 2012 (UTC)

There are two issues. First, the extent to which editors accept the definitions set forth at "Primary, secondary and tertiary sources" and whether those definitions correspond to usage outside Wikipedia. I understand that in some fields, journal articles are always considered secondary sources; only unpublished material would be considered primary.

The second issue is that some journal articles are review articles; everyone agrees these are at least secondary sources, if not tertiary. And even a journal article that reports new results will have sections that review the relevant literature, and these sections could be considered secondary even if you use a definition of "primary" that would include the new-results article. Jc3s5h (talk) 18:18, 5 September 2012 (UTC)

I agree with Jc3s5h. I think the existing definitions are sufficiently clear. While many journal articles are primary:

"a scientific paper documenting a new experiment is a primary source on the outcome of that experiment"

many other journal articles qualify as secondary:

"review article that analyzes research papers in a field is a secondary source for the research"

Finally, it should be noted that even primary sources published in peer-reviewed journals undergo a review process that increases but of course does not guarantee the reliability of these sources. In the medical field, I believe somewhat higher standards apply because of the impact of such sources on individual and societal decisions regarding health. Furthermore, the results of different clinical trials frequently contradict each other and in these cases it is especially important to use only articles that review the results of previous clinical trials. In the case of non-clinical research, I think it is less critical to insist on secondary sources for areas of research for which review articles have not yet appeared (compare WP:MEDRS with WP:SCIRS). Boghog (talk) 19:12, 5 September 2012 (UTC)

Thank you all for clarifying. It seems that we are in consensus that most ordinary journal articles are primary. I have to tell you that some editors, with whom I happen to disagree on this point, find the wording WP:SECONDARY ambiguous and insist that ordinary journal articles are secondary. It would help if WP:SECONDARY were clearer on this point, so I propose to edit the guidelines slightly. --Smokefoot (talk) 23:16, 5 September 2012 (UTC)

You might want to post your proposed revision on the talk page before you edit the guideline. The concepts of "journal" and "ordinary journal article" might be hard to define. For example, if you look at the table of contents of Nature, there is a mixture of many kinds of news items, features, and articles. It isn't at all clear that the bulk of the issue would be primary. Jc3s5h (talk) 23:24, 5 September 2012 (UTC)

Oh never mind, when I re-read the rules, it looks pretty clear to me. I guess I am just paranoid because I get so much grief when I tell editors that journal articles (describing new results) are primary and less desirable here than reviews and books. As a tangent, in the WP-Chemistry project we are swamped with junior scientist types who love to cite journal articles, I guess because it makes them look smart. The practice really undermines the project in the eyes of a serious critic, but oh well ... Thanks again, --Smokefoot (talk) 01:06, 6 September 2012 (UTC)

It depends on what the primary source is documenting. According to the policy:

"A primary source may only be used on Wikipedia to make straightforward, descriptive statements of facts that can be verified by any educated person with access to the source but without further, specialized knowledge."

If I use a primary source to document the chemical reaction A + B → C, that is acceptable. It would not be acceptable to cite two different primary sources, one that documents A + B → C, and a second that documents D + E → C, and then conclude the second reaction is the best way to synthesize C based on the opinion of the authors in the latter primary source. To support this claim, one would need to cite a review article that compares and contrasts the two reactions and documents an expert's opinion of the superiority of the second reaction over the first. Boghog (talk) 03:47, 6 September 2012 (UTC)

In general I support the sentiment expressed by OP that we should rely more on tertiary sources for balance issues. However, for such a change to be implemented, we should have a RfC for the issue. I think the way forward is to word some specific proposals, and then call for a RfC on the issue. LK (talk) 04:43, 6 September 2012 (UTC)

The review article he is talking about is however a secondary not tertiary source in that scheme. Personally I think people waste way too much time on focusing on formal differences between primary/secondary/tertiary rather than focusing on the reputation/quality of the publication. Moreover it has been pointed out repeatedly that many articles are a mix of primary and secondary, so the assessment cannot be made on an article level but just on a specific content level.--Kmhkmh (talk) 05:40, 6 September 2012 (UTC)

The citation of primary journal articles usually starts innocently enough, to quote from above "If I use a primary source to document the chemical reaction A + B → C, that is acceptable." I maintain "that" is not acceptable. It is a form of OR or capriciousness (not intended that way, I know), at least in the chemical world where according to http://www.cas.org/about-cas/cas-fact-sheets 36 million patent and journal articles are citable. The editor who selects one of these is making an OR decision. Furthermore as a matter of practice, if someone (me?) were to replace that primary citation with a more general one, I would hear howls of reproach for second guessing someone's handiwork. So the motivation for avoiding primary journals is not only to lend greater credibility to the source, it is to suppress the tendency of (well-intentioned) editors to select one from thousands of plausible references, an action I maintain is a form of OR. --Smokefoot (talk) 13:12, 6 September 2012 (UTC)

Minor point: Patents are self-published and should therefore almost never be used. See WP:RSEX on patents. WhatamIdoing (talk) 19:50, 6 September 2012 (UTC)

I strongly disagree with this. Deciding what should be included in Wikipedia and what shouldn't is a process that goes on all the time. It's an editorial decision, something without which an encyclopedia wouldn't be possible. The reason for using secondary sources is to establish notability; it's nothing to do with original research. You still have to decide which source to use, whether it's primary or secondary. Jowa fan (talk) 13:57, 6 September 2012 (UTC)

Agree, all such decisions are editorial. Adding a reference to a Guardian article, or an article on the same topic published in the Sun can also result in dramatically different outcomes. Arnoutf (talk) 15:24, 6 September 2012 (UTC)

I agree but that is only one part of the story. Secondary/tertiary sources are not only used for ensuring notability, but they also serve as a layer of error correction/ensuring correctness. Primary sources (as in original research publications) often contain errors and other imperfections, that will be corrected in later publications.--Kmhkmh (talk) 19:11, 6 September 2012 (UTC)

Your advice/views are helpful.

I guess that we all agree that the less a contribution is "editorial" in nature, the better for the sake of objectivity (extrapolating: eventually all articles would be written by robots guided by a set of operating principles defined by us).
What about your views on the practice of replacing primary citations with secondary ones? This process is one component of upgrading articles to FA status. Do you endorse this replacement process as generally desirable, even if a specific reference is supplanted by a more general source?

Thanks for your time, --Smokefoot (talk) 17:24, 6 September 2012 (UTC)

No we can't because we are talking about different meanings of the word "editorial", the required and desired editorial discretion/editorial decision has nothing to do with newspaper editorials (i.e. an editor voicing his personal opinion and analysis), but it is about picking the best sources and providing the best readable content representation.
It is always possible to replace an "inferior" source by a "superior" one. But you cannot simply determine the inferiority/superiority by a formal primary/secondary argument. For instance, replacing a research article by some famous academic in a highly reputable academic journal with a review article by a not particularly distinguished (or even qualified) author in a non-peer-reviewed publication (some second-rate journal or even a general magazine/newspaper) is not an improvement. Another thing that should be considered is that augmenting sources rather than replacing them is often a better option. So for instance if you have a rather reputable primary source (famous research article), then instead of replacing it by a good secondary source (some reputable review article) you could simply offer both sources. Offering original primary sources in addition to the standard secondary source is often interesting to readers as well.--Kmhkmh (talk) 19:05, 6 September 2012 (UTC)

I agree with Kmhkmh here. "Secondary" is not just a fancy way of spelling "good". WhatamIdoing (talk) 19:54, 6 September 2012 (UTC)

Journal articles are quite often wrong, as they are, by their nature, at the frontiers of knowledge. See Publish and be wrong. We should therefore treat journal articles with suspicion and prefer more settled and established sources. Warden (talk) 19:18, 6 September 2012 (UTC)
WP:RECENTISM is a significant problem with these kinds of primary sources. WhatamIdoing (talk) 19:54, 6 September 2012 (UTC)
I agree with Kmhkmh. A secondary source is not automatically better than a primary source. In the field of chemistry, review articles are often little more than just a description of the articles that have been published on a certain topic within a certain time span with little or no critical commentary on the reliability of the primary sources that are reviewed. These types of review articles perhaps increase the notability of the primary source because one reviewer thought that the primary source was interesting enough to cite but do nothing to increase the reliability of the primary source. Also I think the complexity of the system that is reported has a major impact on the reliability of the source. In chemistry, published reactions can usually but not always be repeated. In biology, it is more common that results cannot be repeated. Finally clinical trials quite often give conflicting results. Hence it is quite reasonable that WP:MEDRS is more stringent than WP:SCIRS. Boghog (talk) 19:55, 6 September 2012 (UTC)

Yes, (scientific) journal articles often don't tell the whole story and are indeed original research (that is after all their aim); yet as primary sources they are probably the best we have and much much better than many secondary sources by less qualified authors. At least a scientist should conform to rigorous standards of argumentation to get his/her primary research published, something many authors of secondary reports in the media do not need. In short: ~~Bad~~ Good primary sources are not worse than bad secondary sources, and scientific papers are among the best (primary) sources we have. Arnoutf (talk) 20:19, 6 September 2012 (UTC)

Did you mean to write "good primary sources are not worse than bad secondary sources"?--Kmhkmh (talk) 20:29, 6 September 2012 (UTC)

Oops, corrected above. Arnoutf (talk) 20:31, 6 September 2012 (UTC)

As a general rule of thumb, a good primary source may be better than a bad secondary source. But it depends on what you're doing. If you're trying to figure out whether something is DUE, then the bad secondary source is more useful than the good primary source. A primary source can't really tell you that it's important (or, if the authors do, you shouldn't believe them). WhatamIdoing (talk) 20:30, 10 September 2012 (UTC)

very well said, WhatamIdoing! (see below) Maximilian (talk) 14:36, 13 September 2012 (UTC)

experiences with NOR in the german wiki

i've had the discussion like the one above (about primary sources, journals, original research etc.) in the german wikipedia many times and would like to add this POV:

i spend a lot of time in archives and libraries, on- and offline, and feed some of the facts into the respective articles - as long as i consider them reliable sources and read other sources pointing into the same direction. always with care. in many cases i find the original sources better for shedding light on certain aspects of articles than many secondary sources. for example i found a sourceless statement in a wiki article about early automobiles where, in 1650, a swedish king purchased an "automobile" from germany. while visiting stockholm, i consulted the national library and found no indications whatsoever about this automobile. i removed the statement from the article as unsourced. later i received an email from the national archives in finland who hosts several paintings with the swedish king and the german automobile in its collection. now, the fact is back in the article. primary source, original research?

let's look at a secondary source: there's a history professor in england who denies the holocaust. to use his bestselling books as secondary sources is definitely a no-go in the german wiki. we have many articles which rely heavily on news and articles published in major german newspapers and magazines. to some of my wiki-colleagues, these articles serve as good secondary sources. in many cases, however, i think they actually don't, because they rely on unknown facts, they contain personal views (which are, so to say, primary sources) and in many cases the journals exploit other journals and therefore are third grade sources.

it's seen critical by some, when information from archives, even highly respected archives, is being fed into articles. reason: these facts are considered "original research" = taboo. is a concise collection of papers in a public archive public or a matter of original research and thus taboo? (if it's original research, is it original research by the librarian or by me, the reader? if it's original research by the archivist, and he hands his selection of papers over to me, are these papers still primary sources and part of original research?)

i vaguely remember some articles in the english wiki which carried a banner saying: this article is a stub. do some research to improve it. original research? is going to my private book shelf and selecting three books out of 300 primary research?

the german article about NOR is not called "no original research" but: "Keine Theoriefindung" (TF)- which translates into: "no personal theories please" - totally different thing. this odd translation often leads to misinterpretations. TF is used against lots of newcomers as a no-go street sign, interchangeable with "no personal view". it's used against anyone who plants an information into an article one doesn't appreciate.

well, NOR is vague, and it's pleasant to read in this discussion how flakey the guidelines are. cheers, Maximilian (talk) 15:03, 12 September 2012 (UTC)

Beatles RfC

You are invited to participate in an RfC at Wikipedia talk:Requests for mediation/The Beatles on the issue of capitalizing the definite article when mentioning the band's name in running prose. This long-standing dispute is the subject of an open mediation case and we are requesting your help with determining the current community consensus. For the mediators. ~ GabeMc (talk|contribs) 00:43, 24 September 2012 (UTC)

Disagreement over WP:CALC

I was wondering if anyone who had this page watched could help provide some input to a discussion within the tropical cyclone Wikiproject. There is a current disagreement over whether we can include the results of a particular formula known as Accumulated Cyclone Energy (ACE) in a season hurricane article. I'll provide an example for clarification. This Wikipedia page is used as the source for this article, with the argument that it is a routine calculation to use Table 1 in this document to get an ACE total of 4.425. The document has 47 data points, but only 11 are to be used to determine the value. The formula involves using only those 11 data points; it is the sum of the square of each of those data points, divided by 10,000. Does that still constitute a basic calculation? The current discussion is occurring here. Thank you for your time. --♫ Hurricanehink (talk) 16:30, 1 October 2012 (UTC)

If the calculation you're talking about is simply to square eleven numbers, add them all together and then divide by 10,000, then that's basic arithmetic; it's high school stuff.—S Marshall T/C 16:47, 1 October 2012 (UTC)
- The above is likely better suited for the Original research noticeboard, but I'll take it a step further as I feel the wording of WP:CALC is vague and should be changed. In its current wording, however, I am not sure if ACE should be removed. YE Pacific Hurricane 16:54, 1 October 2012 (UTC)
  - S Marshall, it isn't merely taking 11 data points and doing those calculations. That source has 47 data points, and you're only taking a portion of those numbers, and for each storm it is different. Furthermore, should all high school math be considered routine? What about calculus and geometry? I agree with YE, the current wording for CALC should be clarified. The current examples only indicate using one operation, be it addition or multiplication. Combining several operations (such as doing addition, multiplication, and division) does not seem to apply based on the current wording. --♫ Hurricanehink (talk) 17:01, 1 October 2012 (UTC)
- I agree with User:S Marshall; middle-school arithmetic (not calculus or geometry) is routine math. The next questions would be: how exactly are the 11 points picked from 47, and is there a reliable source and consensus on the method of picking the 11 sources, and the sum-square as the right calculation? Just the sum-square part alone isn't OR. Churn and change (talk) 17:05, 1 October 2012 (UTC)
  - Yes, we have a reliable source that indicates what numbers should be chosen (only when tropical cyclones are tropical storms or hurricanes). --♫ Hurricanehink (talk) 17:44, 1 October 2012 (UTC)
    - If you have a reliable source that says "these 11 numbers", so that all the Wikipedia editor has to do is square them, add them, and divide them... then I expect the average 12 year old to be able to do this, and yes, that is pretty much the definition of a routine, simple calculation that is acceptable under this policy. WhatamIdoing (talk) 18:18, 1 October 2012 (UTC)
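
For readers following the ACE question, the arithmetic described in this thread can be sketched roughly as below. This is only an illustration under stated assumptions: the function name, the status labels, and the sample wind speeds are hypothetical and are not taken from the document under discussion. The selection rule (keep only readings taken while the system is a tropical storm or hurricane) and the formula (sum of squares divided by 10,000) follow the descriptions given by Hurricanehink above.

```python
# Hypothetical sketch of the calculation discussed above: select the
# qualifying readings, square each, sum them, and divide by 10,000.

QUALIFYING_STATUSES = ("tropical storm", "hurricane")  # assumed labels


def accumulated_cyclone_energy(readings, qualifying=QUALIFYING_STATUSES):
    """Return the sum of squared wind speeds of qualifying readings / 10,000.

    `readings` is a sequence of (status, wind_speed) pairs as they might be
    transcribed from a source table that states the status of each point.
    """
    selected = [speed for status, speed in readings if status in qualifying]
    return sum(speed ** 2 for speed in selected) / 10_000


# Made-up sample data, NOT the 47 data points from the thread.
sample_readings = [
    ("tropical depression", 30),
    ("tropical storm", 35),
    ("tropical storm", 40),
    ("hurricane", 65),
    ("tropical depression", 25),
]

ace = accumulated_cyclone_energy(sample_readings)
print(ace)  # 0.705 for the sample data (1225 + 1600 + 4225 = 7050)
```

The sketch separates the two steps that the thread treats differently: the selection of data points, which depends on what the source states, and the sum of squares, which is plain arithmetic.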

The math/arithmetic may be simple, but the application of it may not. WP:CALC gives examples of routine calculations which are unit conversions and simple additions. The point of those is that such calculations are not likely to be challenged - simply because they are straightforward to compute and because any reasonable individual will immediately accept the premises (calculating a person's age based on his date of birth). If the ACE is a firmly defined and widely accepted calculation principle (and can be sourced as such) and the inclusion/exclusion of individual storms or their wind speeds is reliably sourced then it may fall under WP:CALC. But if there are any interpretations possible as to which cyclones fall under the criteria, if any editor has to make a judgement, or if the ACE calculation is controversial for some reason, then it is clearly *outside* WP:CALC. Note also that WP:CALC has a provision which is directly aimed at ensuring simplicity and transparency: If there is not consensus among editors that it is a routine calculation, then it is not a routine calculation. Consensus is not a majority vote. In theory, even a single editor can block consensus. Useerup (talk) 18:32, 1 October 2012 (UTC)
If there's any element of judgment in selecting the 11 numbers then WP:CALC clearly does not apply. If there's a simple algorithm that could select them, then I would think it does. I want to say that the applicability of WP:CALC is a little flexible by context... the convention seems to be that for most articles, WP:CALC is restricted to high school arithmetic, but for certain articles (e.g. those within the scope of WikiProject Mathematics), undergraduate-level mathematics is permitted. This does allow editors to perform geometry and calculus where necessary.—S Marshall T/C 19:00, 1 October 2012 (UTC)
- Even a calculation as simple as adding two numbers may fall outside WP:CALC. If it is not immediately clear what the numbers represent and precisely what the sum would represent, it is *not* for editors to engage in such calculation. Whether it is high school math or not is not the point at all. Taking the square root of a number may be appropriate if the calculation is used to calculate the diagonal of a room for which a source has given the dimensions. But if the source does not state that the corners are orthogonal the computation may not be appropriate. The point is that when applying *any* form of computation you make an assumption as to the computation's applicability. You must support the applicability as well, through reliable sources. Focusing on the actual computation as "simple" is naive and misleading. Indeed, it is so misleading that it actually proves why WP:CALC must have the elevated requirement of consensus among editors. --Useerup (talk) 20:35, 1 October 2012 (UTC)
  - FYI, the data source does state the intensity/status of the cyclone, so yes it is directly stated in the text. YE Pacific Hurricane 21:08, 1 October 2012 (UTC)
WP:CALC might be summed up as, "If the ancient Greeks proved it in 300BC, then it's not original research". It does allow some relatively complex calculations where necessary; otherwise we would have to copy examples from textbooks and there's too much copypasta in Wikipedia as it is. In common with every other Wikipedian rule, WP:CALC needs to be applied on the basis of good editorial judgment and consensus, but the fact that a calculation is difficult to follow doesn't mean it can't be done. Using a calculation is like using a source in a foreign language: it has to be checkable, but it doesn't necessarily have to be checkable by everyone. If you're struggling to follow a calculation, ask a third party editor who's a competent mathematician to check it for you (and I suggest going to WT:WikiProject Mathematics to find one).—S Marshall T/C 23:36, 1 October 2012 (UTC)

I'm not sure if the literal meaning of WP:CALC agrees with S Marshall's take on it, but I believe we should allow something close to S Marshall's idea. In reality, there is a vast gap between what WP:CALC allows and what one would be able to get published in a reliable source. The type of calculations used in a university freshman calculus class might be questionable under WP:CALC, but there is no way you would ever be able to get a paper published in any decent journal that uses freshman calculus to re-express a well-known idea in a different form. From the journal publisher's point of view, it wouldn't be nearly original enough. Jc3s5h (talk) 23:47, 1 October 2012 (UTC)

Secondary does not mean independent

I've added an explicit denial of the erroneous "secondary == independent" idea to PSTS. We keep encountering users who don't get it.

I believe what we really need to do is to ditch the sloppy first sentence that defines "secondary". "Second-hand" is not really an adequate definition of secondary source. That suggests that if I copy the US Constitution out of the back of a textbook, rather than directly from the original, then my second-hand copy is magically "secondary". It's not.

I think we would be wise to replace this frequently mis-interpreted phrase with something based on the statement from this library, like this:

A secondary source is an author's original thinking based on primary sources. It contains an author's interpretation, analysis, or evaluation of the facts, evidence, concepts, and ideas taken from primary sources. Secondary sources are not necessarily independent or third-party sources. [And proceed from there with the "For example"]

What do you think? Would this help people "get it"? WhatamIdoing (talk) 16:55, 26 September 2012 (UTC)

I think your version is better. I would tend to define a secondary source as 'a source that adds value to the information in primary source(s)', but I'm not suggesting that as any formal definition. I do agree that "second-hand account" is a fairly poor explanation - it's often true, but it's not really the defining feature. And it's funny that you post about this now, because I just started a similar discussion on this topic at WT:N. NTox · talk 20:08, 26 September 2012 (UTC)

I haven't seen the WT:N discussion yet. It's a particularly difficult problem for WP:N, because we decided to require "secondary" sources back when some high-profile editors were using secondary and independent interchangeably. I'm not sure that the community actually wants to require true secondary sources for notability. The latest version of this confusion that I saw was at WT:RS. WhatamIdoing (talk) 22:00, 26 September 2012 (UTC)

As a caution (just in case it gets wikilawyered) a secondary source can be based on primary, secondary, or tertiary sources, as long as there's some transformation of thought involved. --MASEM (t) 20:20, 26 September 2012 (UTC)

It might be more accurate to say "based on other sources", but the link I gave above specifically says "primary sources". WhatamIdoing (talk) 22:01, 26 September 2012 (UTC)

I like this... To tie it more solidly into the concept of NOR, I would stress the distinction between "an author's original thinking" (allowed) and "a Wikipedia editor's original thinking" (not allowed) Blueboar (talk) 22:39, 26 September 2012 (UTC)

I strongly disagree that "original thinking" is integral to secondary sources. In mathematics (and I'm sure the same thing happens in other disciplines) one can find many survey papers and books which summarise a collection of primary sources, and/or convey information that's well-known within the discipline, but which aren't usually considered to contain original work on the author's part. I think "interpretation, analysis, or evaluation" is a good description of secondary sources, but the word "original" shouldn't be part of it. Some secondary sources do contain originality, some don't. Jowa fan (talk) 02:19, 27 September 2012 (UTC)

If it's summarizing without analysis, that's a tertiary source (exactly what we do at WP). Secondary nearly always requires some type of original thought that is not included in the original sources. --MASEM (t) 03:52, 27 September 2012 (UTC)

For starters, I think the reference to "Wikipedia as tertiary source" should come out, since WP is not a reliable source, having user-generated content. So its nature is immaterial. Yes, summarizing without any analysis makes a source tertiary, however, for secondary sources also 'summarizing' is kind of integral. As to "original thinking," while you are technically right, the phrase is apt to be confused with "original research" (new studies, new calculations etc) and best not put in there. In normal usage, a review article isn't thought to possess originality. Churn and change (talk) 04:03, 27 September 2012 (UTC)

I would want the "original thinking" part taken out. Also, a secondary source could be based on anything (OR, other secondary sources, encyclopedias). Also the word "summarize" is often what first comes to mind when thinking of a secondary source, and so that should be added. Finally, it would be good to mention the "source" concept applies to individual phrases or sentences in a work and not to a work as a whole. The "Introduction" in many published psychology papers, for example, is a good secondary source because it reviews the state of the field. I think this is all mentioned somewhere down the road in that article, but way too down the road, in my opinion. I agree "second-hand" is a poor description; its connotation is actually negative. Churn and change (talk) 02:51, 27 September 2012 (UTC)

Can any of you produce a reliable source that says secondary sources do not contain any original thinking/interpretation/analysis? I've never seen a source that supports such a claim, but I suppose that one might exist.

A review article that "merely summarizes" prior works, for example, necessarily contains the author's own interpretation of what papers are important and valuable, and I believe that should count as the author's "original thinking" about the prior sources. WhatamIdoing (talk) 04:21, 27 September 2012 (UTC)

By that definition, there are no tertiary sources possible. There would be nothing published that doesn't have "original thinking"; clearly for anything to be written down, authors have to evaluate and decide on what to include. That isn't the common meaning of the word, and it is precisely because its meaning can be easily confused I recommend it be taken out. Churn and change (talk) 04:27, 27 September 2012 (UTC)

To clarify a few things:

1. I'm thinking of Wikipedia's classification of primary/secondary/tertiary which (as discussion elsewhere makes clear) isn't always the same as the classification used by historians. In particular, ideas from historiography break down when we're looking at sources dealing with ideas (scientific, philosophical, etc) rather than real-world events. So no, I can't give a reliable source concerning the nature of secondary sources; and I don't think such sources would be relevant to this debate. This is about having a consensus for Wikipedia policy, rather than documenting how historians or others would classify the sources.

2. To me there's a big difference between a survey paper or monograph summarising recent research at the leading edge of its discipline and an undergraduate textbook. The textbook is made of very well-digested material, and would be considered a tertiary source. The survey paper is a secondary source. I think this is already clear from the examples listed at WP:PSTS.

3. The word "original" is very much open to interpretation. I've seen many arguments, sometimes silly, sometimes unpleasant, about the confusion between "original research" and "original" editorial decisions as to what should or shouldn't be in a Wikipedia article. Whether or not the author of a secondary/tertiary source engaged in "original thinking" is partly a matter of how the reader interprets the word. I think that inserting the word "original" into the policy is a recipe for generating more arguments.

Jowa fan (talk) 06:41, 27 September 2012 (UTC)

Survey papers can go both ways. I have seen those that have 200+ cites, but make no attempt to resolve conflicts or explain differences between data and theories given in those references, only to organize the information. That's a tertiary source. I've seen textbooks become secondary due to the approach taken with their teaching method (this was organic chemistry). This is why it is dangerous to immediately categorize certain types of sources into primary, secondary, and tertiary automatically, and that reasoning nearly always depends on what the actual topic is, since works can be a secondary source for one topic and primary for another. --MASEM (t) 14:28, 27 September 2012 (UTC)

That last point is important... the Encyclopedia Britannica would usually be classed as either a Secondary or Tertiary Source (depending on the EB article being cited), but in our article about the "Encyclopedia Britannica" it would be classed as the Primary source (and subject to the cautions and restrictions we place on Primary sources). Blueboar (talk) 16:03, 28 September 2012 (UTC)

Jowa (and others), if you don't care for "an author's original thinking", would "an author's own thinking" be acceptable? WhatamIdoing (talk) 23:51, 28 September 2012 (UTC)

I would be ok with that. Churn and change (talk) 04:53, 29 September 2012 (UTC)

It seems redundant (how many works are written by authors who don't think? Maybe more than we'd like to admit) but I don't mind it too much. Jowa fan (talk) 07:28, 29 September 2012 (UTC)

Since there have been no further comments for about a week, I've made the change as discussed. I don't mean for this to indicate that we've achieved perfection, only a small improvement over the confusing "second-hand" language, so if someone has ideas about how to improve it further, then we can discuss that, too. WhatamIdoing (talk) 16:38, 5 October 2012 (UTC)

I made a trivial correction of noun-verb agreement, but I also made a substantive change, adding back a phrase about being at remove from the original event. I think that makes it easier to grasp how secondary sources differ from primary ones, but please check whether what I did was correct. --Tryptofish (talk) 20:15, 5 October 2012 (UTC)

I have my worries about that phrase, since I think it will encourage people to fall into the "Alice told Bob, and Bob told Chris, so Bob is a secondary source" error. But let's see what happens. We can remove it again if we actually need to. WhatamIdoing (talk) 20:47, 16 October 2012 (UTC)

Original sci/math papers as primary sources

I've seen quite a few WP articles on general mathematical subjects into which someone adds something like, "~~Recently~~ [I usually delete this word whenever I see it] Random C. Cornpopper and Hikaru Denkinouchi presented a proof that under certain conditions almost all quiggly graphs are weakly incidental with exactly two hyperconcentrators" (with a ref to, say, the 2010 Midsummer Symposium on Combinatorics and Computation). Usually I let it go, with AGF to Cornpopper and the Symp. reviewers.

However I've run into a tough issue recently. There is this article that makes a claim which sounds sensational and was added to WP, but a google scholar search does not show anybody citing this article but the authors. Also, the claim appears to be a bit of bullshitting, with hype wordage to cover up a rather trivial result. It is mainly for the second reason I want it gone from wikipedia, but I cannot do it without OR. The only wikilawyer's hope is to question the reliability of the source.

So here comes my question: is a scientific primary source by (relatively) unknown scientists not critically evaluated by anybody else (simple mentions do not count) considered reliable in wikipedia? I'd like to see something to this end stated clearly in the policy. This could greatly help to weed out lots of trivia and bullshit. Staszek Lem (talk) 19:39, 1 October 2012 (UTC)

A single primary source with an original result is insufficient to demonstrate that the new result is significant enough for inclusion. Someguy1221 (talk) 19:42, 1 October 2012 (UTC)

In general, primary sources suffer from incomprehensibility to the lay editors, cutting-edge and non-mainstream research, and unestablished notability or even correctness (think of statistical flukes). Even primary sources from notable authors have many of these issues; research, by definition, is supposed to push the envelope, and practically every researcher gets a wrong result now and then. That is why replication is considered important. From WP's point of view, we should use secondary sources, with primary sources backing them up. WP:PRIMARY is clear on the issues with primary sources. Churn and change (talk) 20:02, 1 October 2012 (UTC)

I beg to disagree with "is clear on the issues with primary sources". I started this section precisely because it is rather vague in the context of math/sci. Indeed, it can be readily "verified by any educated person with access to the source but without further, specialized knowledge" (I am citing the policy) that "H. Denkinouchi presented a proof that under certain conditions almost all quiggly graphs are weakly incidental", no? Also the phrase "may be used in Wikipedia; but only with care" is a loophole for the author of the addition to endlessly fight for his pet fact.

Sci/math are strict things, and in this area the policy may just as well be strict. For example, a new sci article may be treated as "news". If the "news" does not make a splash (in the respective professional branch/domain/industry), then no wikipedia for it. Staszek Lem (talk) 20:21, 1 October 2012 (UTC)

I don't think practically any math proof in a modern-day journal can be understood without specialized knowledge. Yes, I agree the standards should be stricter for math and science, since there is rarely a need to depend on such sources, and they often require more specialized knowledge to understand than papers in other fields. The reason why other fields need the loophole is illustrated by the example in WP:PRIMARY; a novel may be used as a primary source to summarize its plot. Churn and change (talk) 20:29, 1 October 2012 (UTC)

IMHO the fact that an article is published in a peer reviewed journal is sufficient to establish notability. Concerning reliability, the complexity of the system that is reported has a major impact on the reliability of the source. In physical and chemical sciences, published results can usually but not always be repeated. In biology, it is more common that results cannot be repeated. Finally clinical trials quite often give conflicting results. Hence it is quite reasonable that WP:MEDRS is more stringent than WP:SCIRS. Boghog (talk) 20:23, 1 October 2012 (UTC)

Not being a mathematician, I have no idea how mathematical journals work. However publication of a mathematical proof in a peer reviewed journal would seem to imply that the proof has been checked by at least two referees and approved by the journal editor. If so, it would seem that the reliability of the proof has been fairly well established. Am I missing something? Boghog (talk) 20:38, 1 October 2012 (UTC)

I am afraid you are confusing the topics or are too vague in describing your opinion. "Notability" criteria are for existence of wikipedia articles. Nevertheless, thank you for pointing me to WP:SCIRS. Unfortunately it is an essay, unlike WP:MEDRS, and is of no use against a hardened wikilawyer. Staszek Lem (talk) 20:53, 1 October 2012 (UTC)

Considering the number of peer-reviewed papers which get published, that definition of notability dilutes the concept to meaninglessness. Churn and change (talk) 20:32, 1 October 2012 (UTC)

I disagree. It all depends on the journal and the authors. A publication in Nature or Science is almost by definition notable. In contrast, a publication in an obscure journal has questionable notability. A publication by a researcher with an established track record is probably notable (see WP:SCIRS). Boghog (talk) 20:42, 1 October 2012 (UTC)

Churn's point exactly. Wikipedia has to review the peer-reviewed journals :-( Your examples of Nature and Science are irrelevant in the context of my thread: they do not publish original research. Also, I am afraid there is a big misunderstanding about the nature of "peer review". Basically, peer review is to weed out clear bullshit and (if a reviewer is diligent) to make the article more readable (again, with an ulterior motive to make possible bullshit detectable more easily :-). Unless the journal is absolute top-tier in its domain and as such has a high rejection rate, it may publish correct but trivial results or non-trivial but error-prone results. Staszek Lem (talk) 20:53, 1 October 2012 (UTC)

Well first, both Nature and Science frequently publish original research. And the purpose of peer review of original research is to make sure that the methods are sound and the conclusions reasonable, both on their own and in the context of previously published work. No one should have the impression that the stamp of peer review means the conclusions or even the data should be regarded as true or even significant. Someguy1221 (talk) 22:03, 1 October 2012 (UTC)

Most studies, and that includes mathematical proofs, exist in a context. They have assumptions, sometimes stated and sometimes implied as being the standard assumptions of the sub-field. Many geometric, and particularly trigonometric, proofs apply to just the Euclidean plane but this is at times considered implicit. Many proofs apply under certain restrictions and are inapplicable in certain domains. Proofs for a cyclic number system, where the context of that system is established through a technical explanation in the introductory section, do not apply to other systems. The meaning of words researchers use often subtly differs from the common English meaning of the words ("carnivore" as an example from biology). And, yes, many original studies published in Nature and Science never get developed further because others are either unable to reproduce them or because they fail to generalize. No researcher, not even a Nobel laureate, publishes only papers still considered notable a few years down the road. We depend on secondary sources to do all this filtering for us. Picking primary sources based on our own criteria for notability (great author, great publication and so on) is a recipe for the encyclopedia turning into a science news magazine, on the lines of Nature News or Science News. Churn and change (talk) 21:47, 1 October 2012 (UTC)

Likewise, secondary sources do not guarantee accuracy or even notability (see above). A good primary source may trump a bad secondary source. Boghog (talk) 21:09, 4 October 2012 (UTC)

Your last statement is so vague as to be not even false :-). The notability of a result from an original math article may be judged only by peers and only after reasonable consideration. Hence secondary sources. Period. Unless the author is, e.g., Linus Pauling, and it would be very difficult to remove his result, despite all the convention that "notability is not inherited" blabla. On the other hand, if an original article says something about a result in another article, then it is a secondary source with respect to this other result and hence acceptable evidence of notability, if it is not just a mention in passing. In the latter case, no matter how good the text is, its weight is low. Staszek Lem (talk) 02:05, 5 October 2012 (UTC)

False. A secondary source is not automatically better than a primary source (see above). Boghog (talk) 06:42, 5 October 2012 (UTC)

Again, your statement makes no addition to this chaotic discussion. Of course nobody argues that a random secondary source is better than a very good primary source. Staszek Lem (talk) 15:29, 5 October 2012 (UTC)

Again, if an article is published in a high impact journal, it is probably notable since one of the requirements for publication in these journals is notability. For example:

"Peer-review policy". Nature Publishing Group. Is the paper likely to be one of the five most significant papers published in the discipline this year?
"Information for Reviewers of Research Articles" (PDF). www.sciencemag.org. Selected papers should present novel and broadly important data, syntheses, or concepts.

Furthermore as part of the peer review system, these articles are judged by peers after reasonable consideration. Boghog (talk) 07:05, 5 October 2012 (UTC)

You are talking about "extremely high" impact journals with very high standards. We may easily make a list of them and accept without question. Nevertheless if an article was "one of the five most significant papers published in the discipline this year" then certainly there have already been circles of noise about the expected results, either before or right after publication. So while the primary source will be excellent for exact description, still secondary sources are for judgement of notability. I cannot imagine that a primary source in a "high impact journal" will boast "my result is a real breakthrough and turns everything upside down", even if the author really thinks so. Staszek Lem (talk) 15:29, 5 October 2012 (UTC)

Anyway, colleague, let me remind you that this chat is not a pissing contest, but its purpose is to improve the policy. I started this talk because the current policy does not adequately address technical/scientific primary sources. Wait... <click-click>,<browse-browse>, I am taking my words back. It does not address them at all! So rather than continue bickering, let us make a list of suggestions, starting with your remarks:

New results from primary sources published in high impact technical/scientific journals which have high standards for notability and correctness of publications are presumed to satisfy wikipedia requirements of WP:NOTE and WP:DUE. Journals of this type include Nature, .... . (per Boghog comments) The list of journals automatically accepted for the purposes of this guideline is expandable based on the review of their policies. Also, it would be good that these policies of high standards are described in wikipedia articles about these journals. (Surely it is a notable fact that a journal accepts only notable facts :-) Staszek Lem (talk) 15:29, 5 October 2012 (UTC)
What else? Staszek Lem (talk) 15:29, 5 October 2012 (UTC)

This isn't a NOR or PRIMARY problem. This is a WP:DUE problem. If the only source that says Cornpopper and Denkinouchi published this paper is the paper itself, then it's probably too trivial a fact to bother mentioning. WhatamIdoing (talk) 23:29, 1 October 2012 (UTC)

Our colleague Boghog would disagree with you: If it is published in Nature, then it is probably nontrivial. Staszek Lem (talk) 15:29, 5 October 2012 (UTC)

Fortunately, I know Boghog's views well enough not to believe your distortion of them. Nature publishes dozens of pieces each week, which means hundreds a year. He does not believe that every item in every issue of Nature is WP:Notable (==deserving of its own, separate, stand-alone Wikipedia article). WhatamIdoing (talk) 16:32, 5 October 2012 (UTC)

Well, you know them not well enough (or your knowledge is obsolete, or Boghog did not express them clearly here): Citing Boghog from just above: Again, if an article is published in a high impact journal, it is probably notable since one of the requirements for publication in these journals is notability (with specific example of the Nature.) So my only 'distortion' is replacement of "notable" with "nontrivial". Also, Boghog's text clearly didn't mean that each 'Nature' article deserves a wp article about it, hence your rebuke is off mark. Therefore let the person whose mind we are reading speak for themselves. Staszek Lem (talk) 16:42, 5 October 2012 (UTC)

P.S. Nevertheless please let me repeat my request of focusing on the improvement of the policy. (Unless you think it is OK as it stands. I don't.) Staszek Lem (talk) 16:54, 5 October 2012 (UTC)

Therefore, continuing my answer to the top of your sub-thread, even WP:DUE still boils down to sources, rather than to wikipedian's opinions. Actually, I am not in disagreement with Boghog: As I wrote somewhere above, if something found its way to Nature, then most probably it is making lots of splash elsewhere as well, hence my proposal above to put this into the policy. There are primary sources and primary sources, and I am suggesting to clarify this. Staszek Lem (talk) 16:54, 5 October 2012 (UTC)

"A book review too can be"

There is something fishy with the appendage "A book review too can be an opinion, summary or scholarly review", followed by a clumsy footnote. I am creating this section as a placeholder, a reminder to myself, as I don't want to distract people from the previous discussion. I will return here later. Staszek Lem (talk) 00:21, 17 October 2012 (UTC)

Location independent

On the current edits about interviews, Msnicki is right: a given piece of text is either primary or secondary because of its internal, inherent nature, not because of the style of publication. If Bob interviews Alice about her life, Alice's description of her life is a primary source, no matter whether that interview is published in Bob's personal blog or in the most stringently edited academic journal. WhatamIdoing (talk) 20:49, 16 October 2012 (UTC)

A self-reflective interview (done via a reliable source) can be potentially secondary, but it depends on a lot of factors, particularly if the interviewee is critiquing their own life after a long period of time has passed. For example, an actor contemplating that a much earlier role may have stereotyped them for the rest of their career and that, were they to do it over, they would have declined the role. But, usually, this will be secondary about a different topic and not about the person being interviewed themselves. We have to remember that works can be primary to some topics, and secondary to others. --MASEM (t) 20:53, 16 October 2012 (UTC)

Thanks, WhatamIdoing. Speaking to Masem's points, I think even a "self-reflective" interview is still primary, but I agree that an interview can be primary on some topics and not on others. For example, if Bob interviews Alice about her analysis of Macbeth, it would be primary to a discussion of Alice and her views but secondary to a discussion of Macbeth. Similarly, if a larger work contains an interview with Alice about her life, the portion consisting of the interview would be primary to a discussion of Alice, but if the rest of work consists of the interviewer's own analysis and discussion of what Alice said, that additional analysis and discussion by the interviewer would be secondary. Msnicki (talk)

What is interesting, this splitting of hairs about Alice's words is almost irrelevant in application to wikipedia, i.e., in context of WP:CITE/WP:RS: in an article about Alice it is admissible even if it is a primary source; while in an article about something else it is secondary, hence admissible if not WP:UNDUE, i.e., if in this context Alice is a notable reliable source. Staszek Lem (talk) 22:31, 16 October 2012 (UTC)

It's not irrelevant to Wikipedia because the question comes up constantly in AfDs, where the argument is often made that as long as an interview was published in a reliable source, it's not really primary and thus can be used to support notability. In support of this, editors will commonly argue that the guidelines don't actually say interviews are primary. The argument, of course, is incorrect, and characteristic of the common confusion between reliable and secondary, namely, the misconception that if an article is reliable enough, that's the same as secondary. It overlooks the fact that for a source to be secondary it must offer the author's own thinking based on primary sources. It was this confusion that I hoped to clarify with my example. Msnicki (talk) 22:53, 16 October 2012 (UTC)

The text which Msnicki insists on inserting despite opposition, and with less than 2 hours of discussion, is " an interview with individual in which he describes his life, accomplishments or ideas would be a primary source on those topics." It says nothing about the source in which the interview appears, or how the source is used in the Wikipedia article. If, for example, a writer Y interviews Dr. X, and also interviews several people with differing points of view, and reads half a dozen journal articles related to the point, and comes up with a conclusion that "although Dr. X said in an interview the medicine he invented is the most important advance of the decade, most authorities consider it a marginal improvement" then the writer Y's article is most likely a secondary source. Jc3s5h (talk) 22:26, 16 October 2012 (UTC)

Footnote 3 already says an interview is a primary source. That means the "interview" part, the answers from the subject, are a primary source. I don't see a need to modify that in the text. Churn and change (talk) 22:43, 16 October 2012 (UTC)

So far as I can tell, very few editors read the footnotes and if they do, I believe they're confused by the phrase "depending on the context", interpreting "context" to refer to the reliability of the publication rather than the use to which the source is being put. I think this needs clarification in mainline text. Msnicki (talk) 22:53, 16 October 2012 (UTC)

It is very possible for a subject's answers in an interview to be secondary for that subject, if they are considered as far removed, and critical or analytic of the subject's past actions. But context is everything. It is improper to sweep interviews as "primary" as is being done now. --MASEM (t) 23:15, 16 October 2012 (UTC)

I disagree. I don't think that's ever possible. I don't think you can find support for your position anywhere in our definitions of the terms or in the definitions used anywhere in academia. An interview in which someone talks about himself is always a primary source on the topic of that individual and doesn't become secondary just because it's particularly self-critical or self-analytic or because the individual has spent many years in reflection. At best, it merely becomes a very good primary source. A secondary source has to be someone else offering his own thoughts. Period. Msnicki (talk) 16:30, 17 October 2012 (UTC)

The footnote says "Further examples of primary sources include...(depending on context) interviews". I agree with the footnote. A reliable source that just reports the contents of an interview without any analysis by the author of the reliable source is a primary source. A reliable source that analyzes a subject by reviewing various primary and secondary sources, which might include an interview, is a secondary source unless the reliable source is itself involved in the matter. I believe people do sometimes confuse reliability with whether a source is primary or secondary, but a blanket statement that interviews are primary won't help people to understand or accept the distinction.

I also agree with Masem's view that an interview subject's comment may be secondary for the subject matter being discussed in the interview. For example, if a scientist writes a review article about global warming in a journal and then summarizes the article in an interview on a TV show, the scientist's interview would be secondary with respect to global warming. Jc3s5h (talk) 23:29, 16 October 2012 (UTC)

I don't see how your remark contradicts Msnicki's text, keeping in mind my correction below. It rather augments and clarifies Msnicki's. To split a hair, if the article in question is a "review article" as you said, then it is a secondary source, and the interview summarizing it does not change its status. However if the article in question was an original idea of the interviewee, then it was a primary source, and rendering it via an interview does not change the "primary source" status of information originated in this article. See my text below. A mere retelling does not change the "originalness" of the information. Only when this information serves as a basis for further analysis does this further analysis become a secondary source. Staszek Lem (talk) 23:52, 16 October 2012 (UTC)

And to be consistent in this respect, in this "twice-removed case" the correct wikipedia piece, with reference, must go as follows:

J. Random Smart reports that 95% of whiggles are 17% shmoggles.<ref> Smart, J.R, "On biased whiggle dispersion", as mentioned in an interview of J.R.Smart to MSN Yesterday</ref> Staszek Lem (talk)

Two points:

"unless the reliable source is itself involved in the matter": Um, actually, no. WP:Secondary does not mean independent. A meta-analysis of my own previously published experiments is still a secondary source.
"the scientist's interview would be secondary with respect to global warming": That's not really "if an individual describes his life, accomplishments or his ideas" any longer, is it? WhatamIdoing (talk) 23:58, 16 October 2012 (UTC)

"unless the reliable source is itself involved in the matter"; if the reliable source was involved in the matter, it might turn it into a primary source. For example, the justices US Supreme Court do not perform scientific experiments on evidence and did not witness the events they rule on, but because their ruling creates new law, the ruling is a primary source.
An individual's ideas include his ideas based strictly on reading sources, so when he describes such ideas he is describing his ideas. 00:16, 17 October 2012 (UTC)

The court document is primary for the fact that the court made a particular finding of fact, and secondary (or even tertiary) for the fact itself.
My description of my idea is always a primary source for the fact that Idea X is an idea that I subscribe to. WhatamIdoing (talk) 21:41, 17 October 2012 (UTC)

My objection to Msnicki's text. "Location independent" is a very good section title to the issue I have with the phrasing of the addition. In the context of primary/secondary classification, the "source" is an utterance or writing or other "generation/recording of information" (e.g. painting a picture), by the "information generator". Keeping this in mind it is clear that:

(a) in an interview, utterances by the interviewer and interviewee are different sources in a single publication.
(b) It is irrelevant how the information produced by "information generator" reached wikipedia: be it his memoir, his utterance in an interview, a quotation from this interview in a magazine, a chapter in the book which has a telltale about the magazine article quoting this poor interviewed man, etc., if whatever source directly renders the words of the interviewed "information generator" ("directly" may mean summarizing, but no additional conclusions), then the piece of info coming from this original "information generator" is coming from a primary source. The only concern is a possibility of "chinese whispers", i.e., distortion of the original words.

Therefore the suggested addition must be something like this:

if an individual describes his life, accomplishments or his ideas, this description would be a primary source on those topics regardless of the rendering of this description, be it a memoir, an interview or a quotation, retelling, or summarizing thereof.

I may expect that some would not like inclusion of "retelling or summarizing". I would suggest adding extra caveats to exclude synthesis in "summarizing", but I would like the idea conveyed that a "primary source" remains primary regardless of the path by which information comes from it to us, as long as we are looking at the information itself, and not at any reasoning based on this information. Staszek Lem (talk) 23:37, 16 October 2012 (UTC)

I think the additional wording here is accurate, and if it were thought necessary, I'd go further to specify "in a magazine or newspaper article or in a television show" as well. But I kind of hope that it's not necessary. WhatamIdoing (talk) 23:58, 16 October 2012 (UTC)

I disagree that a short sentence, suitable for the guideline, can adequately summarize the status of an interview. What if the interviewer adopts the interviewee's words to state the interviewer's conclusion? What if the interviewee's ideas are not based on the interviewee's direct experience and actions, but rather based on reading primary and secondary sources? Jc3s5h (talk) 00:00, 17 October 2012 (UTC)

The question isn't whether a short sentence can adequately summarize the status of interviews. The question is whether a short sentence can adequately summarize the status of interviews about the interviewee's own life, accomplishments or ideas. We're talking "Barbara Walters asks WhatamIdoing about her childhood", not "The President of the United States answers questions about the economy". WhatamIdoing (talk) 00:19, 17 October 2012 (UTC)

To go along with the thread title, "Location independent", if an interviewee formed his/her ideas through reading sources, those ideas are secondary in nature, and remain secondary when relayed by an interviewer. Jc3s5h (talk) 13:15, 17 October 2012 (UTC)

~~Nope. You and Staszek Lem (who wants to add a lot of stuff about rendering and retelling) are both confused.~~ I agree with you and disagree with Staszek Lem, who I believe is confused. But the fact this confusion is so rampant is exactly why the guidelines should be clarified with an example, at least for the most common case of an interview where someone describes his own life, accomplishments or ideas. Re-read my examples above. If Bob interviews Alice about her analysis of Macbeth, the interview is a primary source for an article about Alice or her ideas but a secondary source for an article about Macbeth. If Bob then writes his own analysis of what Alice said, that would be a secondary source to articles about Macbeth or Alice and a primary source to an article about Bob and his ideas. If Bob merely summarizes what Alice said without adding his own thinking, that would be a tertiary source to articles about Macbeth or Alice (unhelpful in establishing notability of Alice because we require secondary sources for that) and a primary source to an article about him (useful only to establish that he did indeed publish this summary). Msnicki (talk) 15:24, 17 October 2012 (UTC)

If you say so, then you are confused about what you wrote yourself, because I only made an important clarification which, e.g., excludes from an interview the possible babbling of the interviewer (which may or may not be of the same "primarity sourceness" status as Alice.) Further, "her analysis of Macbeth" are "her ideas", and per your sloppy text it is primary source. And now you are saying it is sometimes secondary. Make up your mind, man. I mean, fix your addition so that "you and Staszek Lem" will not be both confused. Your addition managed to confuse lots of people. Does this say something about it? Staszek Lem (talk) 16:02, 17 October 2012 (UTC)

I don't believe there's anything at all "sloppy" about my example, nor do I find your repeated use of that disrespectful characterization helpful to the discussion. I don't think I'm confusing anyone. What I think is happening is that I'm telling you something that conflicts (as it should) with your misconception of the terms and that's hard for you to accept. Please re-read my examples and compare them carefully to the definitions of secondary and tertiary sources and see if you can tell me where my (deliberately simple) examples are wrong. Msnicki (talk) 16:14, 17 October 2012 (UTC)

My take on all of the above is that a quotation is a primary source. If you lift a person's exact words from a source -- no matter whether that source is in the format of an interview or essay or general news article or whatever, then that item is, generally, primary. But an interview taken as a whole, which is a product of editing and choice, by the writer and their editor, is secondary. Interviewers choose questions, and they choose which questions and which answers to publish. They edit quotations for length, clarity, and whatnot. They may edit the subject's answers in a flattering or unflattering way. A quality source has an editor looking over the writer's shoulder, and has fact checking. A poor source lacks these and other hallmarks of professionalism.
An interview can be brutal on a subject; the interviewee may be cross-examined, and fact-checked and the interview format can, in some cases, be the best, most objective, way to shed light on a subject. Or it can unfairly malign the subject. Or an interview can be pure puffery; the interviewer can let the subject control the dialog, and the published version can be misleadingly positive.
Then again, you can say that about any book, or news story, or other source; there's nothing so unique about the interview format that makes it more or less objective than other sources. A non-interview may be written by a paid flack. A non-interview may be the result of quid pro quo between the subject and the publication.
Some interviews, and some non-interviews, can help add weight to notability. Others, not so much. Many aspiring celebrities would love to be interviewed by a quality publication, but only those deserving of attention are chosen. A published interview by a professional journalist in a reputable publication is evidence of notability. A source which is non-independent of the subject, whether it's interview or not, does not add weight to notability. Though non-independent sources may be acceptable as reliable sources for some facts, such as describing the subject's opinions.
AfD Notability questions and Reliable source questions are two very different things, and we appear to be conflating the two here. But in either case, we are forced to use our judgement and we can't simply reduce it to a question of whether it's an interview or whether the writer rephrased the subject's statements into their own prose. It's never that easy. --Dennis Bratland (talk) 16:54, 17 October 2012 (UTC)

Talking about interviews "taken as a whole" is like talking about video "taken as a whole". It's a format, and format is irrelevant to PSTS determinations.

An interview that mostly runs through questions like, "WhatamIdoing, please tell our listeners about your childhood. Where were you born?" is a primary source for my childhood.
An interview that mostly runs through questions like, "WhatamIdoing, you're an expert, so what is all this stuff about Roman morality?" is a secondary source for Roman society (or possibly even tertiary).

Editing and selection are issues of independence and self- vs proper publishing, not PSTS status.

If I conduct a meta-analysis, and I publish it myself on my blog, in between unfiltered blither about what I ate for lunch (yum) and what I want to do for the afternoon but can't (nap) and why (telephone), then that meta-analysis is a secondary source.
If I conduct a meta-analysis, and you edit it for grammar, content, clarity, and sense, and Journal of Academic Pretensions begs me to let them publish it, it's still a secondary source.

The PSTS distinction is about the relationship between the source and the material underlying the source. If I take some other source's material and transform it into something new, then we've got a secondary source. If I take some other source's material and repeat it, without creating something new, then it's just a secondhand primary source.

If you and I each conduct experiments that show all foo are bar and all baz are bat, our descriptions of our experiments are both primary sources (regardless of when, where, how, or by whom they are published).
If Alice takes our published results and repeats them ("Dennis and WhatamIdoing showed that all foo are bar and all baz are bat in a series of studies"), then Alice's paper is a primary source.
If Alice takes our published results and transforms them through the addition of analysis ("In a highly flawed study of baz–bat characterization, WhatamIdoing published ridiculous conclusions claiming that all foo are bar and all baz are bat. She failed to notice that her equipment was pulsing blue due to Cherenkov radiation, and also her sample size was pathetic.") then Alice's paper is a secondary source.

Your comment highlights what I think is the fundamental problem, though: The WP:GNG demands a secondary source, and we don't really mean it. We added that requirement back when a lot of people thought that "secondary" was a fancy way to spell independent or good source. "Secondary" has nothing to do with these qualities, and someday we may need to drag the GNG back to reality, which probably means centering it on the concept of "attention from the world at large". If Barbara Walters wants to interview someone about his childhood, that's both a primary source and an excellent indication of notability. WhatamIdoing (talk) 21:41, 17 October 2012 (UTC)

Exceptions?

My impression is that in some sections exceptions to this content policy are (or should be) allowed. I give here two examples:

Recent developments of aaa research

One study showing a connection between bbb and aaa suggests that bbb might be one of the factors leading to the development of aaa. (reference: a study that is original research)
Researchers have started to look at ccc as a possible cause for aaa.(reference: a study that is original research) This connection is not yet confirmed by other studies.

Possible causes of aaa

The xxx theory hypothesizes that xxx may cause aaa in some cases (reference: a study that is original research)
The yyy theory hypothesizes that yyy may cause aaa in some cases (reference: a study that is original research)

In both of these examples, it should be clear to the reader that this is not accepted knowledge but new developments.
Please comment! Lova Falk talk 11:27, 17 October 2012 (UTC)

You're misunderstanding--that's not "original research" by the meaning of this policy. Both of those are references to sources. Original research is when a Wikipedia editor says, "aaa" is caused by "xxx" and there is no reference to prove it. However, just because information can be sourced doesn't mean it should be included. Always in medicine, and often in the case of the sciences, including new, cutting edge research violates WP:NPOV, specifically because it gives undue weight to theories that are not widely held. It tends to depend on who makes the claims, where they published them, and how far away from current standards they are. In medicine, for example, where our rules are the strictest, we're never supposed to reference new primary studies directly--we're looking only for secondary coverage, ideally review articles in high quality journals (or the like). Other cases may vary. As another example, one I work on--every year, some new European group creates a study that says that Christopher Columbus was not actually from Genoa, but is, in fact, the long lost son of their ancestors. Theories include him being from Norway, Portuguese royalty, Spain, and other places. But the strong, clear consensus among historians across the world is that he is Genovese, and, until that consensus changes, all our article is going to say is that "other theories have been proposed". We're not going to provide a one-line summary of every academic theory that gets published in some journal somewhere. Qwyrxian (talk) 13:06, 17 October 2012 (UTC)

Aha! Thank you for explaining this. But in that case, is there a difference between an "original research"-tag and a "citation needed"-tag? Lova Falk talk 15:31, 17 October 2012 (UTC)

I am sorry, colleague Qwyrxian, you did not check facts about Columbus before answering. Wikipedia writes much more than "other theories have been proposed." In fact, there is a whole article about them, Origin theories of Christopher Columbus. WP:UNDUE implies that a theory, even if it is wrong, will be described in wikipedia when it gains sufficient notoriety (multiple independent sources), and only then. Therefore if the only source about this theory is a primary source (i.e. only its author(s)), then it is most probably undue. Staszek Lem (talk) 16:19, 17 October 2012 (UTC)

There is also an important difference between theories on where Columbus came from, and theories on what can cause a certain disorder. Columbus can only come from one place, whereas several different factors each can result in the disorder, such as genetics, lead poisoning, traumatic brain injury. In a section about possible causes, I think it can be relevant to mention even studies that only have primary sources. Lova Falk talk 16:33, 17 October 2012 (UTC)

Sorry, disagreed. Someone publishes an article that cucumbers cause DID. We add it to wikipedia. Devastating effect on cucumber sales. Three months later an article pops up "Sorry it was not cucumbers, it was squash". What now? Wikipedia to cause squash market crash? Only reasonably well established or at least well discussed facts in wikipedia please. Staszek Lem (talk) 19:26, 17 October 2012 (UTC)

The information, if sufficiently important (see: WP:DUE) can be included. The original report is a primary source. Try to find a high-quality source (WP:Secondary is not another way to spell good), and be sure to present it accurately and tentatively, as in your examples above. WP:MEDMOS suggests a section called ==Research directions== for some purposes.

Note that many active editors believe that new research reports are almost never sufficiently important to justify their inclusion, especially for a major disease. WhatamIdoing (talk) 21:53, 17 October 2012 (UTC)

Newspapers

Is a contemporary newspaper report (say, from an event in 1933) a primary or secondary source? I am considering, in particular, a New York Times report about the Nazis from 1933. Risingrain (talk) 17:26, 29 August 2012 (UTC)

Sounds clearly secondary to me, if you are writing about the Nazis, but it might be primary if you are making a statement about the politics of the New York Times, --Boson (talk) 18:16, 29 August 2012 (UTC)

This is one of the perennial questions (and disagreements) on Wikipedia. The standard in history and related fields is that newspaper stories written at the time of an historical event are primary sources for the study of that event, while works of other historians who analyze the newspapers at a later time are secondary sources. However, some people use a different definition in which newspaper stories count as secondary sources. Still others will say it matters whether the reporter witnessed the event themselves or interviewed someone about it, or whether the event happened recently or not.

In any case, if you are writing a history article about the Nazis, the best secondary sources will be recent books and academic journal articles. Worrying about whether an article from 1933 is primary or secondary is less important because we have better sources anyway - you'd have to ask yourself why you need to use a newspaper from 1933 rather than a more recent source. — Carl (CBM · talk) 19:15, 29 August 2012 (UTC)

It may also depend on the nature of the article. An article that may summarize events of a story from the prior week, say as part of a Sunday edition, even though written close to the event, may actually be secondary if it incorporates both facts and analysis. (Eg: today, I would consider most articles published in the weekly Newsweek or US News or Time have a good chance of being secondary). --MASEM (t) 20:01, 29 August 2012 (UTC)

See WP:PRIMARYNEWS. It depends on both the nature of the article and on how you use it, but the simple answer is that newspaper stories should be assumed to be primary sources until proven otherwise.

The ongoing source of the on-wiki confusion is that WP:Notability demands secondary sources, or your article gets deleted. So for recent events, people fudge things and try to pretend that newspaper articles are secondary sources. What they really mean is "please don't delete my article, because I'm sure this is important and some proper secondary sources will turn up someday". (If they do, then all's well; if they don't, then we delete those articles later.) WhatamIdoing (talk) 00:53, 30 August 2012 (UTC)

One of my frustrations with Wikipedia is that we spend a lot of time and energy discussing primary/secondary/tertiary sources, when what we should be discussing is appropriate sources and inappropriate sources. It is usually far more important to ask: "Is this source appropriate in the context in which it is used?" than it is to ask "Is this source primary/secondary/tertiary?" Blueboar (talk) 03:54, 30 August 2012 (UTC)

The problem with appropriate sources and inappropriate sources is it is harder to quantify than primary/secondary/tertiary. -- PBS (talk) 14:17, 1 September 2012 (UTC)

I agree with both of you: harder, but more important. WhatamIdoing (talk) 21:29, 1 September 2012 (UTC)

Thank you for the responses. I agree it would be wrong to base an article entirely out of contemporary newspaper reports, but I think it would be right to use them to add detail to an article, if they do not contradict the secondary sources. My specific thinking: Hitler said that he 'stamped out the atheistic movement', which is quoted as an example of his anti-atheism, but the article doesn't say what he actually did: namely turn the headquarters of the movement into a church. Risingrain (talk) 18:58, 2 September 2012 (UTC)

On the other hand, an obituary about the mayor of Detroit printed in 1903 would be a very good secondary source, and not a primary one, because the reporter undoubtedly got all his information from the mortician, or undertaker, or family, or interviewing somebody (the primary sources), and he (Note: few women reporters in those days) just put it all together. GeorgeLouis (talk) 03:32, 28 October 2012 (UTC)

Exceptions to WP:NOR in Math/logic/philosophy

I have made some remarks on this subject here: http://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Mathematics. The topic of the note is related to both WP:NOR and editing articles in math/logic/philosophy. I'm including this note so that readers of both talk pages can comment. 173.239.78.54 (talk) 00:25, 27 October 2012 (UTC)

Reliable photography (not distorted or manipulated) : reference or comparison with it.

Definition of "original research" for a reference to a reliable photograph: Hello, I can't speak English very well. If I post a reference link leading to an internet photo of a banana and affirm in the corresponding Wikipedia article that such a fruit is YELLOW, is it "original research" to say that this IS its colour? If I post a reference link leading to an internet photo of a recognisable Coca-Cola can with a double-decimeter ruler standing next to it, and affirm in the corresponding article "Coca-Cola" that such a can is XX cm high, is it "original research"? If I post a reference link leading to an internet photo of a tower whose levels are numbered with boards posted on the real tower itself, and next to it, at the very same scale, the corresponding plan section of exactly the same levels, also numbered WITH their HEIGHTS indicated and VISIBLE, given by the architect himself (with his professional stamp if you want ;+)), and if I consequently just indicate that level 5 of the REAL tower is 20 m high according to the architect's plan, and add (between the two photographs if you want, but not directly on them) a little horizontal red :+) line to prove it without a doubt, is it "original research"? In a word, is a simple visual comparison with an UNDOUBTABLE photo "original research"? Isn't it self-sufficient, without the need to be published first in the Encyclopedia Britannica :+)? Simply because a photo has by definition a special PROOF status ("directly and explicitly supported by the source", as Wikipedia says on this page), which is different from an ordinary "subjective statement" or "literary demonstration"! Thanks! Erdnisloed (talk) 16:29, 1 November 2012 (UTC)

Yes, that's a violation of WP:NOR. See WP:OI. "Original images created by a Wikipedian are not considered original research, so long as they do not illustrate or introduce unpublished ideas or arguments, the core reason behind the NOR policy." If your photo is used as proof of something that can't be verified in some other source, then it's original research.

The context here is at Talk:Mercury City Tower#Definition of "original research". The forum citations, Skyscrapercity forum, in Mercury City Tower need to be removed per WP:USERGENERATED. --Dennis Bratland (talk) 17:20, 1 November 2012 (UTC)

A few months ago I brought up a similar question at the OR noticeboard, and the answers there had a bit more detail regarding how to handle situations like this. WP:SELFSOURCE provides additional guidelines here, particularly around the use of the architect's own photo. Regards, Orange Suede Sofa (talk) 17:43, 1 November 2012 (UTC)

WP:PERTINENCE deals with questions like whether a banana is yellow. The architect's photo is more complicated, though. On the other hand, if the architect has a blog, he can just post an assertion of its height on his blog, and skip the photo or any sort of proof entirely. WhatamIdoing (talk) 06:39, 5 November 2012 (UTC)

Statistic examples and routine calculations

Per Jc3s5h's suggestion, I wanted to see what others thought about mentioning things like "statistics" or examples of it like averages in the routine calculation section. The reason I wanted to include or mention some statistical tools explicitly was because some contributors may confuse statistical info like mean, or average, as being a simple calculation and confuse with arithmetic. Not everyone will bother to look up what arithmetic means and if some do, then they may argue that since an average is common and does "use" basic arithmetic then that it would fall under the routine calculation category. In reality, taking an average, for instance, would be considered data analysis. I think this point needs to be mentioned to reduce confusion to those who read that part of the policy. What do you guys think?--Ramos1990 (talk) 22:33, 1 November 2012 (UTC)

Per WP:BEANS, do you know of a recent case when there has been confusion over this? --Dennis Bratland (talk) 22:35, 1 November 2012 (UTC)

Greetings Dennis Bratland. Yes, see the following edit: [2] which had a plot which was created from 3 sources. The plot image can be found here: [3]. The fact that this plot did not exist in any of the 3 sources on the plot meant that the author of the plot compiled it and argued that this plot was simple routine calculation. None of the sources argued the caption point either. This is why I think making an explicit mention of averaging or using medians on info from one or multiple sources is not routine calculation, but data analysis. What do you think?--Ramos1990 (talk) 22:43, 1 November 2012 (UTC)

The edit pointed out by Ramos1990 appears to be original research because it is advancing a position that is not present in any of the sources. But I don't have enough access to the sources, or the inclination, to figure out whether the person who created the image even used averaging; maybe it is a matter of grouping.

The problem with interpreting "arithmetic" too narrowly is that there is a gap between that which can be figured out with simple arithmetic and that which is original enough to get published in a reliable source. When sources present similar information in different units or formats, it may require more than simple arithmetic to present the information in a single manner for the convenience of our readers, but such reformatting would be beneath the notice of any reliable source and unpublishable.

An example for mean might be a reliable source written in 2010 which presents mean values from 1900 to 2010, and clearly explains that it calculates the values from a certain annual publication in a specified manner. The 2011 edition of the annual publication has appeared, and has no changes in policies or procedures from prior years. It wouldn't be unreasonable for an editor to calculate the 2011 mean and add it to a graph in Wikipedia. Jc3s5h (talk) 22:59, 1 November 2012 (UTC)

But the synth there isn't from the mean or median calculations; it is from putting together information from disparate sources (which might have had their own assumptions, error bars and so on). Mean and median calculation of a single source of data is indeed a routine calculation, assuming no filtering of the data is done. Data analysis and routine calculations are not mutually exclusive. Any significance assigned to the mean or median could well be WP:OR, but stating just the mean is xxxx based on the data in one source is allowed. It may fail the notability test, but that isn't relevant on this talk page. Churn and change (talk) 23:05, 1 November 2012 (UTC)

But where's the confusion? You cited the current WP:NOR page and the issue was resolved. I was asking if you could cite a talk page discussion in which two or more editors were confused as to the meaning of WP:CALC, such that it needs to be reworded. Seems like WP:CALC did its job here and all is well. --Dennis Bratland (talk) 23:04, 1 November 2012 (UTC)

I agree with Churn and change that the NOR violation in the example was in making a WP:SYNTH from multiple sources, rather than in anything statistical. Although, in principle, I can see how a statistical analysis by an editor of sourced data that had not been analyzed that way in any source would, indeed, violate NOR, I'm leaning against adding anything to the policy about statistics on the whole. I could certainly consider an editor looking at data and concluding that something was or was not "statistically significant" would be OR. But simply calculating a mean or median? To me, the calculation itself is just a routine calculation, and what one does or does not attempt to argue based on that calculation is where NOR would come into play. Given that one would have to differentiate amongst different kinds of statistical analyses in order to draw a really useful line, I kind of think that the proposed addition would be WP:CREEP as well as confusing. --Tryptofish (talk) 23:19, 1 November 2012 (UTC)

Hey all, thanks for the input. I corrected to what the plot used - median. However, I am thinking ahead to averages which could be confused easily also. Jc3s5h, the mean example you have noted should be ok since the sources would explicitly note that these were means and this is just adding on to the previous years' results. This would be a continuation of updated data from a similar family of sources. In terms of arithmetic, the presentation of the info in a convenient way would not be a problem. Churn, I agree with much of what you have said since if a source used a mean then there is no issue. Tryptofish, appreciate your input as always. This case was a little of many things. I explain my idea in the paragraph below.

Dennis Bratland, yes this issue was resolved, but the author of the plot was the one who mentioned that his plot was routine calculation on his talk page and tried to justify it that way. The talk page discussion is here: [4]. Aside from the lack of source for the caption, and more direct studies which show the opposite in terms of personal attitudes on the topics, I think that a little more info in the CALC section should be included about compiling of data and the limits of what one can say (maybe provide a simple example to illustrate further?). I think that the selection of a median on all three sources to make a plot is already being more selective of particular data and already filtering more than is needed of the data in each of the parts taken from the sources. It would not be basic arithmetic when more analysis is done than is found or represented in the original sources. Choosing medians to present info when a source does not use medians on that data set is already going beyond routine calc, no? Similarly, if one chose to use an average when none of the sources do averages on the data used, then this could get problematic. What do you guys think? --Ramos1990 (talk) 23:57, 1 November 2012 (UTC)

I read the discussion on that user's talk page, and I guess I'm still saying what I said before. The OR doesn't reside in the calculation of medians/means. It lies in the decision to combine data from sources in order to make those calculations. If, instead, there had been one source that said there were 3 of something, and a second source that said there were 2 of that thing, and an editor wanted to make an edit saying that there were 5 of that thing, the OR would be in combining sources in that way (also the fallacy of assuming that the 2 were not included within the 3), not in making the simple arithmetic addition. So the problem you encountered here really isn't about calculations of any sort, arithmetic or statistical. It's already covered by WP:SYNTH (as well as by WP:DUE). --Tryptofish (talk) 20:35, 2 November 2012 (UTC)

Hey thanks for your input Tryptofish. We can leave the section as is then. --Ramos1990 (talk) 06:45, 4 November 2012 (UTC)

Or perhaps we should add it, since taking the average itself isn't a problem. Averaging incomparable items is a problem, but averaging itself isn't. So, for example, if the numbers come from the same publication, same authors, or same agency, or if the sources explicitly say they're using the same definitions or methodology; or if there simply aren't any plausible ways to define things differently (e.g., the number of years that a politician holds a particular office), then averaging is just fine.

In general, very straightforward descriptive statistics are acceptable. I'd add determining the range as another perfectly acceptable type of statistics (and one often more appropriate than the mean or median alone). WhatamIdoing (talk) 06:47, 5 November 2012 (UTC)
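As a rough illustration of the kind of straightforward descriptive statistics discussed above (mean, median, range), here is a minimal Python sketch. The numbers are hypothetical and merely stand in for values taken from a single published source that uses one consistent definition; the variable names are assumptions made for this example only.

```python
from statistics import mean, median

# Hypothetical figures: years in office for five holders of one post,
# imagined as all coming from the same source with the same definition.
years_in_office = [4, 8, 4, 2, 6]

summary = {
    "mean": mean(years_in_office),      # arithmetic average
    "median": median(years_in_office),  # middle value of the sorted data
    "range": (min(years_in_office), max(years_in_office)),  # lowest and highest
}

print(summary)
```

The calculation itself is mechanical. Whether the inputs are comparable, and what significance is attached to the result, is a separate question that no code can settle.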

Agree that these things usually involve OR beyond the math itself. For example, if I am on a quest to say that California-Nevada is actually one state, and added up the areas from Rand McNally and used it as a source that "The area of California-Nevada is xxx,xxx square miles" my statement is more than just simple math, it is an implicit claim that a source treated them as one state. North8000 (talk) 10:50, 5 November 2012 (UTC)

What we are discussing now is kind of the opposite of the original post in this thread. The original discussion was about whether to say that simple statistics are OR; what WhatamIdoing suggests is to add something indicating when they are, instead, acceptable as simple calculations. Given how complicated it can get to draw the line, given how the real issues are already covered by WP:SYNTH, and given that we don't really seem to be having a widespread problem, I'd lean against making any such addition, on the grounds of WP:CREEP. --Tryptofish (talk) 19:54, 5 November 2012 (UTC)

I would agree... this sounds like one of those things where it is better to not expand... the line between acceptable calculation and synthesis is often best judged on a case by case basis. Borderline OR can be discussed at NORN. Blueboar (talk) 21:36, 5 November 2012 (UTC)

Agree with Tryptofish and Blueboar. My discussion was more just making that observation. (And I did see that occur, but not with states.) The observation basically says that with proper analysis, current policies cover it. North8000 (talk) 22:05, 5 November 2012 (UTC)

"Timeline of ..." history articles

I came to the conclusion that the "Timeline of X" articles carry a serious potential for original research. There are no strict criteria for inclusion or exclusion of historical events having a relation to the "X" topic, so, for example, "Timeline of the events that led to WWII" may include formation of the Entente Cordiale (indeed, the two members of this alliance, France and Britain, were Hitler's two initial opponents, so, formally speaking, that is correct). As a result, it is possible to implicitly convey any arbitrary idea which is not explicitly stated in any reliable sources. I propose to discuss the rules that regulate inclusion of the items in such lists.--Paul Siebert (talk) 14:32, 8 November 2012 (UTC)

I would be willing to bet that there is at least one source that discusses how the Entente Cordiale (and other events from pre-WWI times) directly led to WWII... but your point is still valid. After all, one could argue that the Treaty of Verdun (843 A.D.) should be in that timeline... since that was the root cause of conflict between France and Germany over Alsace... which continued on and off until WWII. However, I doubt there is a historian who has ever stated a direct connection between the two events. So, I would agree... for an event to be mentioned in a timeline, we do need at least one source that makes the connection and directly relates the event to the timeline's topic. Blueboar (talk) 14:53, 8 November 2012 (UTC)

I'm slightly alarmed by this proposal because I wrote/maintain various timeline-style lists such as List of Asian dinosaurs. The timeline at the bottom of that list is "unsourced", if you like—I can prove when each dinosaur lived, but I can't show you a source that "directly relates" (to use Blueboar's phrase) the lifespan of each species to the timeline's topic. Do those articles count as WP:OR?

As a separate point, couldn't a timeline include unrelated events, for reasons of context and background? If, for example, we had an article about the Timeline of the Liao Dynasty, wouldn't most Westerners' understanding of the topic be enhanced by mentioning the Battle of Hastings happening two thirds of the way through?——S Marshall T/C 15:31, 8 November 2012 (UTC)

Re: your dinosaur timeline... No, not OR... there are lots of sources that discuss the eras in which various dinosaurs lived. So the event (the era in which Foobarsauris existed) directly relates to the timeline's topic (the eras in which various dinosaurs existed).

Re: Mentioning Hastings in the Liao timeline. It probably depends on presentation... if the reader who knows nothing about Hastings or the Liao Dynasty could be misinformed by inclusion (and come away thinking that the battle of Hastings relates in some way to the Liao dynasty) then I would call it unintended OR. If, however, the timeline presented it in a way that made it clear that the battle was being listed as a western history reference point for what was happening in other parts of the world (say presented in a different color, or with Liao events above a line and non-Liao reference points below the line), then there would not be any OR involved. Blueboar (talk) 15:54, 8 November 2012 (UTC)

This discussion is quite timely for me: recently a new timeline was created by a single-purpose account. Many entries are quite POVish: Chronology of Ukrainian language bans, and I don't even know how to start here. It's been tagged unreferenced for a month already, but the author is gone / does not respond. I am ready to start a cleanup, but I expect a revert war and don't want to fight it alone. Staszek Lem (talk) 21:55, 8 November 2012 (UTC)

When is summarizing five critics' reviews an example of synthesis?

Hi. Adding material on the critical reception to movies, TV episodes, comics, etc., is a frequent part of my editing, and I'm sometimes uncertain of how to summarize them without engaging in synthesis. For example, I add material from reviews to articles on South Park episodes. For example, I added material from four critics' reviews to Reverse Cowgirl (South Park), and today, another editor added the comment to that section that the episode received generally positive reviews. Is four reviews sufficient to make that statement? It's not like I combed over every available press outlet or publication to glean as many reviews as I could. Is such a statement synthesis? Nightscream (talk) 23:23, 12 November 2012 (UTC)

For films and other works there are definitely sites like Rotten Tomatoes, Metacritic, etc. that aggregate a broad range of reviews, so if the work got 90/100 from one of these, it's not OR nor POV pushing to say "it generally got positive reviews" using that as a source. But when you are talking about a TV episode, where I'm not aware of sites that aggregate reviews, the issue I think becomes whether the 4 reviews you picked are a broad selection of the reviews, or whether you are purposely being selective to make the episode look better. Unless you have a reliable aggregator site that can be referenced, I would avoid trying to claim something like "generally positive reviews". --MASEM (t) 02:38, 13 November 2012 (UTC)

Question about original research

Hello. I had a question about the original research policy. I've recently been involved in a debate over whether the Differences between James Bond novels and films article should be deleted. One user suggested that because the information was not attributable, it violated the original research policy. Leaving aside that there are dozens of books on the history of the James Bond franchise to cite this material, it occurred to me that you could cite this from the source material itself. For instance, in the book Moonraker, the Moonraker is a nuclear missile. In the film adaptation, it is a space shuttle. Could we not simply cite the Ian Fleming book when discussing the missile, and the Roger Moore film when mentioning the space shuttle? What could be a more reliable source than the source material in question? Or would that be considered original research? Thanks! -Fogelmatrix (talk) 20:07, 7 December 2012 (UTC)

Yes, straightforward facts about what the book or film says can be sourced to the book or film. A factor that does not fall within the verifiability policy, but rather the notability guideline and related subpages, is whether certain straightforward facts about books or films are notable or not. An article devoted entirely to differences between the Moonraker book and film would require reliable independent sources to establish that these differences are notable enough to have a Wikipedia article about them. Jc3s5h (talk) 22:09, 7 December 2012 (UTC)

Jc3s5h is right, but I'd be surprised if you were unable to find sources that directly present at least some of this information. For example, this book:

Burlingame, Jon. The Music of James Bond. Oxford University Press; 2012-10-11 [cited 7 December 2012]. ISBN 9780199863303. p. 136.

says that Moonraker was originally a nuclear missile and became a space shuttle. WhatamIdoing (talk) 23:44, 7 December 2012 (UTC)

If you are referring to the deletion discussion, your post missed the main point there. The gist of the comments there was that the real question is whether there is wp:notability-suitable coverage of the topic. Basically, that there is not suitable coverage of the synthesis-type-topic represented in the title, rather than that the article content violates wp:nor. This requires substantial coverage of the topic, which is the differences. Particularly for this purpose, you can't just find a sourced statement about "A" in the novel and about "B" (that it turned into) in the film and then state or imply the differences. You need to find a suitable source that made that comparison. The bar for mere presence of the material in the article may be a bit lower, but it sounds like the main discussion is about wp:notability-related coverage. North8000 (talk) 12:53, 10 December 2012 (UTC)

The standard is higher for notability (=deciding whether to have an entire article about something) than for merely sourcing a single fact (=complying with WP:V and WP:NOR). WhatamIdoing (talk) 23:01, 13 December 2012 (UTC)

References

I was looking at a biography page of a living person. As I looked at the sources I found that many were from the same place or agency. Is there anything set up to ensure that "sources" don't come from the same source, for example one news agency or its subsidiaries? Sourcing the same mainstream news company seems to me like a biased, unfair view of the "Wikipedia" page. Why doesn't their research fall under "original research"? Is it simply because it is published in their newspaper or on air on their news station? Most of what I'm referring to are videos that aren't something that really could be debated, but it seems like a lot of faith is put into "sources" just because they came from a mainstream media source. (not related but I don't want to sign and make my I.p. address public) — Preceding unsigned comment added 05:04, 2 January 2013 (UTC)

It's not really treated as an original research issue, but you are correct. Mostly, it's about where WP:BASIC requires that articles about persons be supported by multiple sources that are independent of the person and of each other. --Tryptofish (talk) 16:16, 2 January 2013 (UTC)

Prohibited "Original research" is about us prohibiting our volunteer editors from doing their own original research. A published source can never violate this policy about the behavior of Wikipedians.
"Mainstream" is a good word on Wikipedia. We want all of our articles to reflect what mainstream sources say about the subject.
We do want articles to reflect a WP:Neutral point of view. The definition of "neutral" is weird on the English Wikipedia, though: an article is neutral if it presents mainstream views as being the accepted, normal, mainstream point of view and presents alternative/minority views as generally not being accepted. WhatamIdoing (talk) 04:49, 9 January 2013 (UTC)

synthesis

If two pieces of information from the same source are used together to make a point, can this constitute synthesis? Particularly, would this be synthesis if the 2 pieces of info are not used together by the author, are taken out of context, and occur on very different page numbers / chapters (eg page 13 & page 372)? Would it be possible for us to add a short paragraph about this to the synthesis section? Charles35 (talk) 17:48, 29 December 2012 (UTC)

I'm undecided about whether we need to add anything about it, but yes, it's possible to make a WP:SYNTH violation from a single source. If the source does not treat two things as being related, then editors should not claim that they are. See also WP:Cherrypicking. --Tryptofish (talk) 19:29, 29 December 2012 (UTC)

Why are you hesitant? Charles35 (talk) 01:03, 30 December 2012 (UTC)

Honest answer: because I hadn't spent a lot of time thinking about it, and I wanted to see if anyone else would express an opinion.

That said, we do have the "cherrypicking" essay, and it's not clear to me just how unclear SYNTH is about the subject, so I really would like to see some more discussion. Would you like to propose specific wording? --Tryptofish (talk) 19:52, 30 December 2012 (UTC)

"The definition of original synthesis is not strictly related to material from more than one source. In some cases, material from a single source can constitute original synthesis if multiple points are taken out of context and used to draw an original conclusion. These points must not have been used together by the author, and usually occur on very different page numbers or chapters."

Of course, the above paragraph should be revised and added to for optimization before adding it to the policy.

Oh, and this is definitely different than cherrypicking. Cherrypicking might imply a conclusion that was not intended by the author(s) of the source, but original synthesis would explicitly state a conclusion. Charles35 (talk) 03:49, 31 December 2012 (UTC)

I'm going to take that as a "go for it". If anyone has anything to say, I would love to hear it. I think this is actually a very important detail and I am surprised it is not already here. I don't see a problem with it, but as I said, I'd love to hear any ideas you might have. Charles35 (talk) 13:32, 8 January 2013 (UTC)

This is not the traditional definition that the community has used for years, and such a major change requires more agreement than passive acquiescence by one person.
SYNTH is about multiple sources. The rest of the policy already covers the scenario that concerns you. We just don't call it "SYNTH". 04:39, 9 January 2013 (UTC) — Preceding unsigned comment added by WhatamIdoing (talk • contribs)

Tryptophan agrees. He edited the addition and refined it. I don't know if I'd call this a major change. I was just being bold. I didn't assume that anyone agreed with me. Relax.

I appreciate sentiment and tradition as well, but this isn't a wikipedia "pastime" or anything, it's just one detail in one policy. Changes to the policies happen every day. Also, covering this "scenario" under synth is the most logical and intuitive way to go about it. This literally is synthesis of published material to advance a new position. It's identical to what is being done with multiple sources, but is limited to one source instead of 2+.

If a newcomer were to come to wikipedia and read these rules, this "scenario" would almost certainly intuitively seem like a case of original synthesis. That is, if they were concerned with the spirit of the policies, not the exact wording (the spirit of the rule trumps the letter of the rule). Such an approach is reflective of the philosophy behind wikipedia's policies (ie work in progress, ignore all rules) - the rules are imperfect and everything is a work in progress.

Covering this under synth is the most simple, intuitive, and logical way to do it. Some roundabout way that requires extensive knowledge of the past and the history of the wikipedia community and relies on deduction of various bits and pieces of the guidelines is simply inexpedient, unwelcoming to new editors, and just flat out confusing. Why not just make it obvious and easy? Most contributors do not thoroughly peruse the rules before editing, and that isn't necessarily a bad thing. We should do whatever we can to make editing more convenient. Honestly, I am very surprised that this wasn't already the guideline. Charles35 (talk) 05:37, 9 January 2013 (UTC)

I guess the way I (not tryptophan, by the way) see it is that it really is synthesis to take one piece of information from one part of a source, and combine it with something from elsewhere in the source in a manner that the source itself does not do. Let's use the first example that is now on the policy page, but instead postulate that a single book says, at one point, that "The United Nations' stated objective is to maintain international peace and security." In another chapter, the author writes that "since [year] there have been 160 wars throughout the world", but in a context having nothing to do with the UN. If an editor were to create either of the two sentences given in the example, it would indeed be SYNTH. Claiming "but I got all of it from the same source" would be a bogus excuse. Now, it's true that other parts of the policy page do caution against going beyond what a source says, so I guess there's an argument that it's already covered by other parts of the page. But I'm not that persuaded by the argument, because it's quite reasonable to think, as Charles35 did, that the existing language is explicitly limiting the definition of synthesis to two or more sources. WhatamIdoing, can you explain in more detail how the scenario I just described is fully covered elsewhere on the policy page, in a manner that is not given to misunderstanding by inexperienced editors? --Tryptofish (talk) 01:21, 10 January 2013 (UTC)

Let me offer an alternative approach. The last sentence of the opening paragraph of SYNTH is: ""A and B, therefore C" is acceptable only if a reliable source has published the same argument in relation to the topic of the article." How about putting a footnote at the end of that sentence, saying the following: "The source must state that argument explicitly. A source that says "A" in one context, and "B" in another, does not satisfy this requirement." That way, we do not actually open the can of worms of whether or not there is synthesis from a single source. Instead, we simply make the non-controversial clarification that such a source does not provide what is needed to avoid synthesis. Would that work? --Tryptofish (talk) 01:29, 10 January 2013 (UTC)

At the end I assume that you mean "to avoid "C" being considered to be synthesis"? North8000 (talk) 01:51, 10 January 2013 (UTC)

Yes, that should be made clearer. --Tryptofish (talk) 02:03, 10 January 2013 (UTC)

I think you need to open that can of worms. It is not made explicitly clear anywhere in this policy or any other that I have read that the idea of "synth" can occur in relation to only a single source. I think this would make things a lot simpler all over wikipedia. You could intuitively say "oh, that is synth" in these single-source scenarios instead of some roundabout statement. Also, it is certainly not clear to new editors. There is no better place to make it explicitly clear to everyone than under the idea of "synth". It is simply the most intuitive.

Anyway, since there's only 3 of us here, I think I might put in an RfC for this. I am going to reinstate the paragraph just for the sake of being bold and hopefully attracting new opinions (and because WhatamIdoing hasn't given us the policy she referred to). If that is to no avail, I will seek an RfC. Charles35 (talk) 01:50, 10 January 2013 (UTC)

Please don't! You'll be edit warring over a core policy. We are in the "D" stage of BRD, so don't create an additional "R". --Tryptofish (talk) 02:02, 10 January 2013 (UTC)

PS, Tryptofish, my bad! :) Charles35 (talk) 01:50, 10 January 2013 (UTC)

Policies are general statements. They don't state all of the cases that they apply to. That is not a flaw or shortcoming. Also examples are just examples. Because the policy gives a "2 source" example does not mean that it is limiting its scope to the conditions of that example. So, IMHO, your proposed change is addressing a non-existent flaw, and thus is not needed and thus not a good idea. Sincerely, North8000 (talk) 01:58, 10 January 2013 (UTC)

~~I don't necessarily disagree with that~~ (your comment said something else when I wrote this), but I don't think it could hurt to add 2 sentences to the policy. In fact, I think it would be extremely beneficial, improving talk page discussions, making wikipedia more welcoming to new editors, and just generally improving and simplifying a policy. Charles35 (talk) 01:58, 10 January 2013 (UTC)

Because the policy gives a "2 source" example does not mean that it is limiting its scope to the conditions of that example. - well, it never says anything at all (even the material that is not part of the example) about a one-source scenario. So it kind of is limiting it to 2+ sources. There are plenty of things it doesn't say. Does that mean that we are free to interpret it to mean anything? There's no harm in adding 2 sentences to the article, especially if you think they are already implied. And since most wikipedians act like they aren't, the rule is unclear and un-explicit. Who's to say they aren't right and you are wrong? No one, because the rule does not explicitly say that.

Your opinion that this change doesn't address a flaw is not necessarily something I disagree with. But I think you're focused on the negative, whereas I'm just trying to say that this addition of 2 sentences will add on to the policy in a positive way (even if not "correcting a flaw"). It can't do any harm (at least, you haven't argued that), so I don't see why it's a problem. Please remember that wikipedia is a work in progress and that there is nothing wrong with improving a policy, regardless of whether or not we consider it "flawed".

Also, the fact that it is common practice amongst wikipedians to consider a synth strictly limited to 2+ sources proves that in its current state, it is effectively limiting it to 2+ sources. Charles35 (talk) 02:12, 10 January 2013 (UTC)

I was just giving my opinion. Which is that saying so is essentially redundant, but not wrong. Sincerely, North8000 (talk) 02:55, 10 January 2013 (UTC)

I'm not sure what the best way to address this concern is. In general, though, I'd expand Charles' statement above: almost nothing in this policy is completely clear to new editors. WhatamIdoing (talk) 05:18, 10 January 2013 (UTC)

Well, I think the title is rather confusing and probably archaic (although I was not around for the early years of wikipedia), but if you just read the first sentence, the general aim of the policy is pretty clear:

The term "original research" (OR) is used on Wikipedia to refer to material—such as facts, allegations, and ideas—for which no reliable, published sources exist.

However, say you're right whatamidoing, that the entire policy is confusing to new editors. That is no argument against this addition. Just because this isn't a perfect solution doesn't make it worthless. Please remember that wikipedia is imperfect and a work in progress and that any positive addition that makes the policy more clear to new editors is worthwhile. Charles35 (talk) 16:57, 10 January 2013 (UTC)

Charles35, IMO you are barking up the wrong tree. IMO the whole section has the wrong focus:

It failed to pinpoint exactly how original research arises from combining two sources. I will discuss the issue in the coming section below, #"There were no buts in the sources". Staszek Lem (talk) 17:59, 10 January 2013 (UTC)
The term "source" is vague and must be clarified: are two chapters in a book the same source? Is a person the same source, even if he put his ideas into two books? Is an article published as "Part 1" and "Part 2" the same source or two? IMO there is no need to split hairs here; we can assume that for the purposes of this policy a "source" must be understood as "a source" from which a particular statement of a wikipedia article was derived. Under this convention the "two sources" may be in the same publication. Clearly, their being in the same "physical" place is purely accidental: the author may say "A" in one article and say "B" in its sequel (and say "A+B=C" in the coming monograph), and so on. My suggestion is to add the highlighted phrase into wp-law: WP:RS and wikipedia:Terminology. Staszek Lem (talk) 17:59, 10 January 2013 (UTC)

"There were no buts in the sources"

The section WP:SYNTH is basically correct, but IMO it fails to convey the essence of the "synthesis" wikipedia doesn't want and why wikipedia does not want it. Unfortunately I have no time right now, but briefly:

Why OrSynth is bad: Drawing conclusions is not a 100% reliable process, even in simple logical arguments. To a great degree it is influenced by the author's worldview, even subconsciously, even if the author genuinely believes in their N of POV. There is not even a common understanding of what may be a "simple argument". Everybody knows the joke about a math (or phys) professor who chalked a blackboard with equations and triumphantly concluded "now it is evident that A=B". Then he stopped for 15 minutes of deep thought and repeated "Yes, it is indeed evident". Summary: we don't want wikipedians to draw conclusions themselves, since there is no simple, reliable process to verify if a wikipedian's logic is correct. Allowing wikipedians' conclusions into wp-articles is a gateway to endless edit conflicts and talk pages (not that we don't have them anyway :-) Staszek Lem (talk) 18:37, 10 January 2013 (UTC)

Where Synth is: Consider the first example, with a single word deleted.
The United Nations' stated objective is to maintain international peace and security.[ref1] Since its creation there have been 160 wars throughout the world.[ref2]

There is nothing evil here. The text may e.g. proceed to say: The UN was instrumental in quenching 137 of them.[ref3].

However the three-letter word changes the picture drastically: the word "but" signifies something contrary, i.e., it presents the wikipedian's conclusion that the UN's efforts were inadequate. This conclusion may come from the wikipedian's opinion that 160 wars is too much. Now we have a can of worms here. What if without the UN there were 1600 wars? What if these 160 wars were "good wars" ("freedom fighters", you know)? And so on.

Let me repeat again: the wikipedian maybe didn't even realize all this: they just thought "whew, what a hell of a number of wars! The UN screwed up!" And what is more: most readers skimming the text would see nothing suspicious here.

Now, suppose we managed to sweep the "160 wars is too much" under the rug. Will the following text be OK?

The United Nations' stated objective is to maintain international peace and security.[ref1] Since its creation there have been 160 wars throughout the world.[ref2] J.Random Treehugger said "135 wars is too much",[ref3] (now a wikipedian does simple math (160>135), allowable by this policy, and happily concludes) so the UN failed its objective.

Now, I am leaving you an exercise to pinpoint the OrSynth in the latter. Staszek Lem (talk) 18:37, 10 January 2013 (UTC)

<Sorry, have to go, but it is a starting point>. Staszek Lem (talk) 18:37, 10 January 2013 (UTC)

If I were up for exercises, the first thing I'd say is that that agglomeration at the end is too random to provide an answer to. Maybe you should rewrite it when you are not in such a hurry. :-) North8000 (talk) 22:41, 10 January 2013 (UTC)

That is exactly the kind of agglomeration which happens when someone does OR by SYNTH. However it is not random: it is perfectly logical. Perhaps you were confused by the inline comment. I removed it below:

The United Nations' stated objective is to maintain international peace and security.[ref1] Since its creation there have been 160 wars throughout the world.[ref2] J.Random Treehugger said "135 wars is too much",[ref3] so the UN failed its objective.

As for "up for exercises" it was a rhetorical figure.

What is more, the last example of SYNTH is especially difficult to nail down: the combined sentence is logical; also a reader may assume that the very last piece is also available in [ref3]. Staszek Lem (talk) 02:48, 11 January 2013 (UTC)

RfC on synthesis

Should we insert the following ~~3~~ 2 sentences to the bottom of the "Synthesis of published material that advances a position" section of the project page?

The definition of original synthesis is not strictly limited to material from more than one source. In some cases, material from a single source can constitute synthesis if multiple points are taken out of context and used to draw a conclusion not drawn by the source. Synthesis occurs if these points were not treated as directly related by the author, particularly when statements occur far apart in the source material. Charles35 (talk) 04:28, 12 January 2013 (UTC)

The definition of synthesis is not limited to material from more than one source. Material from a single source can constitute synthesis if multiple points, not treated as directly related by the source, are taken out of context to draw a conclusion not drawn by the source. Charles35 (talk) 20:33, 23 January 2013 (UTC)

Some of us believe this would make things a lot simpler all over wikipedia. In edit summaries and talk page discussions, you could intuitively just reference the policy and say "oh, that violates WP:SYNTH" in a single-source scenario instead of writing 2-3 sentences to explain the problem. Also, this is the most intuitive and logical way of looking at it (IMO) and would be more clear to new editors, who don't exactly peruse the policies before editing, than the current policy (which, as far as I can tell, doesn't explicitly address this scenario). Considering it's only 3 short sentences, I don't see how it can harm the policy. Even if it isn't a perfect solution to all of wikipedia's problems, it can only help make things easier and simpler IMO. Sure, it might be slightly difficult for the community to adjust to, but in the end, I think it's worth it! Charles35 (talk) 04:28, 12 January 2013 (UTC)

Survey

Support inclusion of the paragraph because it is simpler and helpful to the editor. Myself, of course. Charles35 (talk) 03:50, 14 January 2013 (UTC)
Comment. ~~Charles35, in case you don't know, it's pretty unusual for anyone to both support and oppose an RfC proposal, or to oppose something they themselves proposed.~~fixed Anyway, I think it is true that SYNTH can take place within a single source, and I think it's very desirable to revise the policy page to state that, but I'm not sold on the exact wording above. I also wonder whether the recent addition at the end of the opening paragraph of SYNTH accomplishes what is needed here. --Tryptofish (talk) 01:27, 14 January 2013 (UTC)
Oppose because it implicitly defines a source as a work when the concept is not about the definition of source. If the "do not combine A and B to create C" rule is read strictly literally, Wikipedia is just an aggregation of copyvios and adding a "do not combine A and A to create C" caveat to it just encourages this literalism. This "far apart in the source" stuff points the reader to contingent markers instead of getting them to think more abstractly. The point of this policy is to address subtraction from or negation of what's accepted, not addition and creation per se. Addition and creation are subject to WP:Verifiability. This policy is a necessary residual to deal with those situations where a "citation needed" won't help resolve the problem. The opening sentence says OR refers to "material... for which no reliable, published sources exist" and the footnote to that expands on the point that the issue here is the logical possibility of a source, not the contingent absence of a source ("Articles that currently name zero references of any type may be fully compliant with this policy"). This policy applies to an a priori problem and WP:V to an a posteriori problem. Why do we have an A, B, and C here? Because it's a logic problem: a logical operator can only exist between two elements (an A distinct from B) and the real problem is not, in fact, applying the logic but applying the logic wrongly. Wikipedians combine A and B to create C all the time but it usually isn't disputed because the implicit logic is accepted. Notice that in the examples of OR given in this policy the logic is not correct. The United Nations example calls attention to the use of the logical connectives "and" and "but", and the reason why it's a problem is because there is no logical relationship between the two elements, not that the logical connective is used, period. Editors use "and" all the time without "and" appearing in the source.
Note that replacing the "and" or "but" with a period wouldn't necessarily solve the problem because a logical connective could still be implied by the proximity and/or order of the two sentences. Indeed, theoretically one could slap a "citation needed" on an explicit logical operator in the text but you can't do that when it's implied.
An article created by an editor who drew on her expertise concerning the topic is going to have more separation from the exact words of the sources than an article created by a know-nothing who just concatenates near quotes from the sources. The expert does not get accused of "original research" if everyone agrees that the logic is sound. What the know-nothing did, however, is disputed if his concatenation creates something that doesn't logically follow, even if the know-nothing was completely devoid of "originality" and didn't realize that a dubious inference had been created.
Bottom line is that avoiding the "introduction of unpublished" material is not, in fact, "the core reason behind the NOR policy." That's the core reason behind WP:V. In every case where there's an allegation that this policy was violated and it's agreed that WP:V was not violated, it's an argument over how the reader will interpret the combination of fact claims. Whenever WP:SYNTHESIS is alleged it's because someone suspects that the synthesis has been done wrongly, not that it's been done per se (with the exception being that the conclusion is being attributed to a source and there's reason to believe that source doesn't agree with it even if the synthesis was done correctly).--Brian Dell (talk) 03:10, 15 January 2013 (UTC)

Please be concise. I don't even know where to begin in responding to this. You seem to be caught in analysis paralysis. You keep separating this proposal from the policy. This addition is not meant to start a new idea in the policy. It's to use an idea that's already here and expand it to include single-source scenarios. You made a lot of arguments for why this addition is poor or invalid, but you don't seem to recognize that your arguments apply equally to the policy as it exists before this addition. This proposal should not be any different than the current state of WP:SYNTH, except for the expansion to include that new element.

For example, you began with - Oppose because it implicitly defines a source as a work when the concept is not about the definition of source. - however, the policy already does this, if not more so than my proposal. I am offering a more "spirit of the law" approach by saying that it is irrelevant whether the synth is from one source or two or three - either way, it's still synth. If anything, this proposal ignores the definition of a source. We are certainly not saying "Do not combine A and A to create C". By claiming this, you are doing exactly what you condemn this proposal for doing - taking the "definition" of a source too literally. Here, we are still saying "...A and B to create C", considering "A" and "B" to be pieces of info from the source(s) instead of individual sources themselves.

However, your argument is way too hypothetical and abstract to be effective here. A lot of this might get lost in translation (ie when myself and others read it). I think you should try being less technical and more concrete, and then we might be able to have a more effective discussion towards a consensus.

For example, this is not very clear: "The point of this policy is to address subtraction from or negation of what's accepted, not addition and creation per se." Your reference to a priori / a posteriori is also very unclear. Those logical principles do not distinguish material that already exists on wikipedia from material that might exist on wikipedia in the future. They distinguish empirical from rational knowledge. These policies do not neatly boil down to a priori / a posteriori the way you think they do. Your issue with the logic and the whole "and/but" thing objects to the current state of the policy just as much as it applies to this proposal, so I'm afraid you would have to try to remove the entire policy in order to keep your arguments logically consistent.

And I notice that you keep implicitly referencing the "what synth is not" essay. These concerns apply equally to the NOR policy as it already exists. This proposal would not change a single thing with respect to that essay; all of the same rules and principles are preserved and still apply, except for the expansion of the limits to include the new element. Quite frankly, it seems like you're looking for an opportunity to exercise your logic skills, because regurgitating that essay by definition is irrelevant to this proposal, which would preserve all of the same principles that the essay is trying to outline. I would say that your distinction between WP:OR and WP:V is equally irrelevant, but the meaning of much of that is very unclear to me, so I would like to request your clarification before making that statement. Indeed, if you have that issue with the 2 policies, then you would be objecting to the policies themselves as much as this proposal.

Lastly, I have a hard time believing that you completely contradicted the policy word for word, so I'm assuming that you misspoke (or maybe I misunderstood?) with: Bottom line is that avoiding the "introduction of unpublished" material is not, in fact, "the core reason behind the NOR policy." That's the core reason behind WP:V. Copied from the project page: so long as they do not illustrate or introduce unpublished ideas or arguments, the core reason behind the NOR policy. Charles35 (talk) 03:45, 15 January 2013 (UTC)

For those who are counting, that was a 750-word reply (including quotations) that began with a complaint about the length of a 600-word comment.

WhatamIdoing (talk) 03:36, 23 January 2013 (UTC)

I needed to adequately address his comment. I would have preferred to have done so in <100 words, but a long comment requires a long response. I'm surprised you'd go out of your way to make calculations simply to make a smart aleck comment that is counter-productive for preserving civility. What are you trying to prove? Charles35 (talk) 20:28, 23 January 2013 (UTC)

I agree that "it is irrelevant whether the synth is from one source or two or three" but that's why I oppose. The proposal says the physical distance between statements in the source material is relevant. That's adding letters to the law instead of moving towards the spirit. What's currently written takes us down the rabbit hole and your proposal takes us down even further instead of going in the opposite direction. I didn't misspeak when I claimed what the policy is about because I was talking about what the policy must be about if it's not just a duplication of WP:V. In other words, I simply disagree with what's written on this project page unless the "ideas or arguments" part is emphasized to highlight the fact that it's the logic that's the issue, not the "unpublished" part when publication is entirely possible or even inevitable. As for "regurgitating that essay" I hadn't seen it before. I'll be concise and present a straightforward and specific counterproposal. In the section below where I recommend replacing "mention" with "attributable," also replace "either of the sources" and "one of the sources" with just "a source." Why? Because at issue here is ALL sources, you could divide source A from source B based on them being different works, or different authors, or different pages, or different paragraphs but all of these divisions are arbitrary because the real issue is can it be attributed to ANY reliable source. Your proposal just calls more attention to the arbitrary dividing point by moving it around, from between works to within a work. What my counterproposal acknowledges is that it's not attributability to a particular source that matters but attributability to a source. Why are we already indifferent with respect to whether it's attributable to source A or source B? Because we've already assumed both A and B are reliable sources. No more than that. Consider leaving "one of the sources" but adding "cited in the article" right after that.
Why would it be impermissible to expand the choices from two to twenty if the article cites twenty reliable sources? If that's OK why not expand it to all reliable sources? It "is irrelevant whether the synth is from one source or two or three" because what matters is whether it is sourced period. If the party disputing the material leans towards the view that it CAN be sourced but isn't (ie you could add "citation needed") then it's an "empirical" problem and WP:V should be cited. If the party disputing leans toward the view that it CAN'T be sourced then it's a "rational" problem and this policy applies. Once this policy is rewritten to more explicitly reflect this people will "get it" and stop misapplying it. Your proposal treats the symptom instead of the disease and accordingly postpones the necessary reckoning.--Brian Dell (talk) 05:24, 15 January 2013 (UTC)

I think you make a good point that there is too much "letter" in this law (not just this proposal, but the entire policy). However, I think you're falling into some traps. First and foremost, I think you're misdirecting your objections to the policy into objections to this proposal. As stated above, this might not be a perfect solution to all of this policy's problems, but any beneficial contribution is worthwhile. This proposal does contain a "letter" with the proximity thing. But people sometimes need some letter in order to make sense of the world. It isn't a bright line rule like WP:3RR; instead, it is just a suggestion. We aren't saying that anything from separate page numbers is synthesis. We are only saying that synthesis occurs particularly when the page number difference is large. You're an idealist. Sometimes you need to be a pragmatist because our society (and wikipedia) is built on such pragmatism and the "letters" are just one element of it.

However, I think there's a bigger issue. You believe (seemingly for certain) that you are correct in knowing the problems with this policy and you believe you know how to fix them (eg, you "simply disagree with what's written on this project page"). That might be true, and it might not be true. I don't know. But in order to find out, I suggest you seek a consensus (probably in a different section that can more effectively focus on your issue). You are presupposing this belief when you apply your logic in your arguments against this proposal. For instance, you claim that the point of WP:NOR is exactly the opposite of what the current policy explicitly states. You also claim that "people will "get it" and stop misapplying it" if we instate your ideas. These are pretty bold claims, and it seems that the current consensus disagrees with you. However, consensus can always change, and if you feel your ideas can fix the article, I strongly suggest you seek consensus in a separate section. It is pretty obvious to me that your concerns are beyond the scope of this RfC. For example, your "specific counterproposal" is about material that was here and exists independent of this RfC (ie "mentioned...either of the sources").

I wouldn't object to you being bold and "lessening" the letter (ie the suggestion about the proximity in page number) part of the addition we have instated, providing you adequately preserve the idea. However, if you gain a consensus to eliminate the letter altogether, I of course cannot object to you doing so.

Good luck, Charles35 (talk) 06:51, 15 January 2013 (UTC)

Well I think your proposal would be an incremental improvement if it read this way and this reading was considered an amendment as opposed to a different proposal:

The definition of synthesis is not strictly limited to material from more than one source. Material from a single source can constitute synthesis if multiple points are combined to draw a conclusion that's disputed. Synthesis occurs if a source cannot be found that treats the points as directly related. Original synthesis is particularly likely when the connection between the statements is obscure.

This reading would acknowledge that it doesn't matter if the points are located an inch apart physically if they are a thousand miles apart logically, just as it doesn't matter if they are a thousand miles apart physically if they are on the exact same spot conceptually. If a reliable source right in front of my face says "A is B's wife" and a reliable source on the moon says "B and C are brothers" then the claim "A is C's sister-in-law" cited to both sources should at a minimum be assessed on an individual basis instead of presumptively dismissed as a policy violation. The fact is that editors draw a conceptual connection every time they write two sentences in a row. Likewise if a source claimed that "If Bill Gates owned Fort Knox, he'd be rich" and another source claimed that "Bill Gates is rich," it would be an incorrect SYNTH to then add that "Bill Gates owns Fort Knox" even if the two prior claims were from the very same paragraph. It's not the fact that the paragraph doesn't also mention that "Bill Gates owns Fort Knox" that makes that conclusion incorrect, it's the fact it doesn't logically follow. If the claim that "Bill Gates owns Fort Knox" could be saved by the same paragraph stating that as well, it could be similarly saved by any reliable source saying that. It further obscures this reality to ask the reader to look for a "mention" in a particular source - e.g. by the (same) "author" - instead of for a source period, which would get them thinking about the possibility of a source (the conceptual possibility). The reader shouldn't be directed to the wrong remedy just because it works in practice most of the time. This policy necessarily requires a deletionist remedy while WP:V does not. In other words, if a problem can be solved by continuing up the hill and adding a source, then it's a WP:V problem. If one has to back up one would have to resort to this policy.
That's what I meant earlier by the difference between negation and creation.--Brian Dell (talk) 08:39, 15 January 2013 (UTC)

Again, I don't see how much of this is relevant to the source. You seem to again be reciting the synth essay (even though I understand you have not read it before). These issues that the synth essay warns us of apply equally to the policy before this addition. For instance, your example about Bill Gates is called affirming the consequent (although I suspect you already knew this). It is logically false and should not be used anywhere on wikipedia. Affirming the consequent should not be done with multiple sources or with a single source. This concern is no more relevant to this proposal than it is to the policy prior to this proposal.

The reader shouldn't be directed to the wrong remedy just because it works in practice most of the time. - again, I will tell you that you are an idealist. While I do agree with you on a theoretical level, you need to understand that most people need some sort of letter in order to make sense of the world. And again, it is just a suggestion, not a bright line rule.

I don't agree with your removal of "In some cases..." I think it is important to provide limitations so that people do not abuse the rule.

I don't agree with your change to "...combined to draw a conclusion that's disputed". This is an idea more suited for WP:BURDEN (ie "any material challenged or likely to be challenged"). "taken out of context" is important in my opinion and sums the idea up quite well. "...used to draw a conclusion not drawn by the source" in my opinion better expresses the aim of WP:SYNTH.

"...if a source cannot be found..." reflects your view on the difference between WP:V and WP:NOR. Unfortunately, your opinion is in the minority and you do not have consensus to instate it. "cannot be found" reflects the reason the policy is called verifiability instead of verified. According to consensus, this is not the correct policy for this idea.

I disagree with your deletion of "by the author" because this specifies author instead of authors, which is a very important specification in my opinion. I see "obscure" as redundant and already covered by "taken out of context", "not drawn by the source", and "treated as directly related by the author".

Look, I don't enjoy disagreeing with everything you say and I feel like a contrarian doing so, but I think the reason for this is because your concerns are about the entire policy but are being misdirected towards this proposal, and because you are trying to support an extreme minority opinion. Again, I very strongly encourage you to seek consensus for your views in a separate section that can allow you to focus more effectively on your issue, and then, if you gain consensus, come back here and object to this proposal again. Charles35 (talk) 15:54, 15 January 2013 (UTC)

re "Affirming the consequent should not be done," it's great to see your opinion about that here but that's not, in fact, relevant because what's relevant is not someone's aside on a Talk page but what Wikipedia policy says or should say. On that point, instead of endorsing a Wikipedia rule that prohibits this (WP:V can't do that job, hence it has to be this policy, and at present your proposal implies that a false relation is somehow less so if the things being related are physically proximate), you talk about how consensus supports the status quo found on this Project page, which happens to prohibit all reasoning regardless of soundness (such as my sister-in-law example) beyond "basic arithmetic." You keep talking about that essay but what I'm talking about here is policy and Wikipedia:Essays are not policy.

"In some cases" is redundant because you've already got a "can" in there that indicates that it isn't necessarily synthesis. If it didn't have that "can" before "constitute" and just said "material from a single source constitutes synthesis" then, yes, you'd need the qualifier. I suggest moving "In some cases" to the next sentence, where you don't appear to allow any exceptions.

Why should anyone care if the "conclusion is not drawn by THE source", so long as the conclusion is drawn by A reliable source and/or nobody disputes the conclusion? You're calling for a restriction that does not advance any identifiable Wikipedia objective. Calling this an "extreme minority opinion" doesn't say anything about its merits; moreover, I haven't seen community opinion weigh in on the matter when expressed like this. You appear to be claiming that the consensus view is that WP:V refers to verifiability whereas this policy demands verified, but I haven't seen the community weigh in on precisely that characterization either. See again the note that says "Articles that currently name zero references of any type may be fully compliant with this policy". I humbly suggest that we are both doing some interpreting re the "spirit" of these policies.

In the same vein, re my "synthesis occurs if a source cannot be found that treats the points as directly related" versus your "synthesis if multiple points are taken out of context and used to draw a conclusion not drawn by the source" why is it no defence to be able to point to another reliable source drawing the conclusion? Also, a dispute that pores over not one but two source contexts to establish whether or not the points were "taken out of context" would be a distraction and a focus on the letter of the law if everyone admitted or agreed that the two points are in any case not related or relatable which is what the spirit of the law seeks to prohibit.

You speak of consensus, yet you declare a single "author" as opposed to multiple "authors" to be "a very important specification." This is a highly idiosyncratic view, because 1) the policy as currently written quite clearly indicates that what's allegedly SYNTH may be saved by pointing to "either of the sources", sources that may be by different authors plural and 2) your proposal starts off implicitly defining source as a work, yet now you shift from same work to same "author." Previously you agreed with me that "it is irrelevant whether the synth is from one source or two or three" but now it's "very important" that source be one author not two or more? --Brian Dell (talk) 21:31, 15 January 2013 (UTC)

The only relevant objection I am really hearing from you is "at present your proposal implies that a false relation is somehow less so if the things being related are physically proximate". For lack of motivation to find a better word, most people are not as smart as you! People need little suggestions. They need qualifiers, as you put it, in order to make sense of these things. Again, you are too idealistic and you need to understand the distinction between the concept of original synthesis and a policy on original synthesis. Many people will not understand your vague theoretical logic and if you are not concretely specific, they will probably misconstrue them.

And I "care if the conclusion is not drawn by THE source" because the point of this RfC is to introduce a new element to this policy. It isn't to just generally improve it. We are trying to make an addition that specifically addresses single-source synthesis. "You appear to be claiming that..." No. I am not even trying to talk about WP:V, you are. This isn't some broad theoretical analysis of wikipedia using fancy philosophy-speak. That will get us nowhere. We are focusing only on this issue. "...why is it no defence...?" Because this RfC is specifically about single-source synth. We don't want it to get swallowed up and go unnoticed in a broad general description of synth.

Actually, the lack of consensus for your minority view does say something for its merits. It doesn't say everything, but it says something. It doesn't necessarily mean that it has no merits, but it means that it more likely has no merits than if you did have consensus. And again you fail to understand the purpose of consensus. You may think that your viewpoint is omniscient, but you might be wrong. Of course everyone thinks they are right, and of course, many of us are wrong. It happens every day. This is why we seek WP:CONSENSUS. Considering you seem to enjoy philosophy, you might find these articles interesting.

You are getting very nitpicky in that last paragraph and making some logical errors. It seems like you want to turn every single little piece of content into a philosophical argument. (error #1) I speak of lack of consensus about your view, not in any way related to my "idiosyncratic" view. (error #2) My previous agreement with you was about a general stance on the concept of synth. My view about "author" instead of "authors" is, again, concerned with the actual policy itself. You are taking my statements out of context. I want the wording we use to be noticed by people. Slightly altering the policy is useless if people continue as if the change never occurred. Minor changes will likely be overlooked and/or misconstrued to mean something else. Saying "author" makes it explicit that we are broadening the scope to include single-source scenarios. Charles35 (talk) 04:42, 16 January 2013 (UTC)

Bdell555, I am afraid that if you are not concise, relevant, and less broad and theoretical, then I must cite Wikipedia:Silence means nothing. This is not because I don't want to reach a consensus with you. I do want to do that. But in this manner, we are going to get nowhere. Trust me. Charles35 (talk) 04:50, 16 January 2013 (UTC)

I think that if you were more concise, kept relevant, conversed more concretely and specifically, and tried to achieve consensus, then I would be much more receptive and likely even agree with some of your views. For instance, please see these edits. Tryptofish expressed some views that are very similar to yours, and I agreed with much of it. Charles35 (talk) 05:11, 16 January 2013 (UTC)

I have to take exception to your continued insistence that I don't respect consensus by pointing out that you are the one editing the Project page as you see fit, not I. Your last edit changed "A source that says 'A' in one context, and 'B' in another, without connecting them, does not provide an argument of 'therefore C'" to "If a single source that says 'A' in one context, and 'B' in another, without connecting them, and does not provide an argument of 'therefore C', then 'therefore C' cannot be used in any article". This was a major alteration, an alteration that wasn't supported by consensus (Wikipedia:Silence means nothing), and an unsound one. Applying either or both of logic and policy, in fact 'C' (and 'therefore C') may be used, and used in any article, if there is a reliable source for it. Let me say it one more time so that it's as concise, relevant, concrete, and specific as possible: in fact 'C' may be used, and used in any article, if there is a reliable source for 'C' (subject to WP:WEIGHT and the rest of WP:NPOV). I've tried to explain why, but you've dismissed it as "fancy philosophy speak", so I suppose I'd best just move on, having registered my protest.--Brian Dell (talk) 09:30, 16 January 2013 (UTC)

The text is obviously saying that the only relevant source is the one mentioned in the sentence. You don't need to apply multiple disclaimers (if you apply this, that sets a precedent) that take care of obvious little things like that. If there were another separate reliable source that unquestionably says "therefore C", then we wouldn't even bother to consider whether synthesis applies in this situation. If you want to add this disclaimer to the sentence, then in order to be logically consistent, you'd have to do the same to nearly every other sentence in the entire article. I understand and appreciate your desire to be exact and leave no stone unturned, but you don't seem to recognize that you are not omniscient and there are more stones than you (or I) are aware of and that it is impractical and redundant/unnecessary to try to cover them all. Charles35 (talk) 13:21, 16 January 2013 (UTC)

And Tryptofish agreed with my last edit, so... And there is nothing wrong with being bold. That's how you (should, if you want to be effective) find out what the consensus is, hence my edit summary "please feel completely free to revert and discuss". Charles35 (talk) 15:28, 16 January 2013 (UTC)

Oppose I have two main reasons for opposing this:
The first is that the community really does use the term synth to mean multiple-source-original-research, and the community does not use that particular bit of wikijargon to refer to single-source-equally-bad-original-research. We should respect the community's jargon. The community says that it's OR if you use one source to support a claim that isn't in any reliable source, and it's SYNTH if you use multiple sources to support a claim that isn't in any reliable source. I wouldn't necessarily mind saying something like 'doing the same thing, but with a single source, is called plain old OR and is still prohibited'.
A minor sub-point: If your claim is published in any reliable source—any, as in "any reliable source ever published, in any language, in the entire history of the world", then it's not OR of any type at all. A lot of inexperienced editors and a lot of POV pushers compare the claim solely against the cited source without considering that the OR policy doesn't make any such requirement.

The second is that I think this is an invitation to wikilawyering and obstructionism. So let's first get rid of the silly position (which nobody here seems to hold), which is that a newspaper with a hundred different articles is "one source": it's not one source. It's a hundred. If you take a fact from "Peace talks in the Middle East have failed" and add it to "Football star has stomach flu", to get "Peace talks derailed by stomach flu", then you have engaged in impermissible, multi-source SYNTH.

But imagine that some unfortunate person has written an entire book about Wikipedia's policies. In Chapter 1, "Verifiability", the author talks about independent sources, which he defines as a published source that is written by someone without a conflict of interest. That chapter says nothing about notability. In Chapter 57, "Notability", the author defines notability as "qualifying for an article on Wikipedia" and says that non-independent sources never support a claim of notability, but says nothing about what that term independent means or anything about verifiability.

So the proposal says, "Synthesis occurs if these points were not treated as directly related by the author, particularly when statements occur far apart in the source material."

The source here did not treat his definition of independent sources as having anything to do with notability. The statements "occur far apart in the source material". And yet anyone who understands the text can easily see that these statements are intimately related, and that a claim in an article that said something like "Qualifying for an article on Wikipedia is determined by considering sources written by people without a conflict of interest" is not SYNTH or original research of any type. If we added the proposed text, however, I think we would see a lot of false claims along these lines. In fact, very basic, plainly WP:NOTOR work would become a source of disputes whenever someone tried to summarize the overall gist of a longer work, rather than citing isolated, single-sentence facts out of it. WhatamIdoing (talk) 04:19, 23 January 2013 (UTC)

Well, first of all, this is the community. Right here. The policy pages are the heart of the community. They define the community. Optimization of policies is good for the community. I don't really see this as a valid or relevant argument.

To address your second concern, which I do see as both valid and relevant, Tryptofish offered a good revision to the proposed text, which I think would satisfy this concern:

The definition of synthesis is not limited to material from more than one source. Material from a single source can constitute synthesis if multiple points, not treated as directly related by the source, are taken out of context to draw a conclusion not drawn by the source.

Notice the "occur far apart in the source material" bit is gone. Charles35 (talk) 19:23, 23 January 2013 (UTC)

Oppose. I support the goal, but not the wording. Instead of prohibitions, which are rarely if ever effective in producing either good writing or good thinking, I would prefer that we find a way to encourage representing sources accurately and in context (which is what the proposal seems to be getting at toward the end). I haven't seen the anti-synth guideline invoked as much in the last year or so, but in the past, it was often deployed in content disputes by those who knew little about the subject matter and couldn't recognize that there was no originality of intent or expression in the way the article was compiled, and no new synthesis. (If you don't know something, it seems novel to you.) This wording would encourage reading parts of a source in isolation, aka cherrypicking, for fear of 'synthesizing' points, rather than understanding how the pieces of the scholar's argument fit together as a whole, including in the context of the broader scholarly discourse. Removing the spatial metaphor from the proposal hasn't entirely removed the invitation to wikilawyer. I can still imagine a disputant arguing "but he said that on p. 186, and you're putting it together with something he said on p. 897; the two points are 'not treated as directly related' in the book." It may be that if one reads the book as a whole, putting the two points together is precisely what the scholar intended. I don't see how it's a bad thing to have a disputant articulate what's logically wrong with the passage in question, instead of just waving a hand and saying "Oh, that's synth, 'cuz I say so." The un-doer should have to expend as much effort as the doer in a content dispute. The key is the context clause, but it's buried. 
So I would like to see the goal stated more strongly than the prohibition, something like: Be sure to represent each point accurately within the context of a source's overall argument; do not synthesize individual points taken out of context from a single source to imply or draw conclusions not advanced by that source. I agree generally with what WhatamIdoing says about OR/synth, though I suppose I see the point about single-source synth. Cynwolfe (talk) 15:07, 25 January 2013 (UTC)

Considering your point, maybe it would be a good idea to say something along the lines of "If information is taken from separate parts of a book, that does not necessarily imply synthesis. One should understand the context of the material in the source in order to challenge it as synthesis." What do you think of that? Charles35 (talk) 18:02, 25 January 2013 (UTC)

To me such a guideline would sound as if it were encouraging challenges on grounds of synth. It's written from the perspective of someone looking to oppose somebody else's contributions. I believe that our guidelines should be written to help and instruct editors on how to contribute or edit content effectively, not how to argue on talk pages. So again, I would rather express what the editor adding content should do. Another editor who objected to the content could then show how it doesn't meet standards. Cynwolfe (talk) 04:53, 29 January 2013 (UTC)

You make a very good point. I'm afraid that I might be limited to such a perspective, and if that's the case, I won't be able to come up with a better way to phrase the sentence. Would you please propose an example of how you think it should be phrased according to your concern? Thanks, Charles35 (talk) 05:13, 29 January 2013 (UTC)

Threaded discussion

Summary of my views:

This literally is synthesis of published material to advance a new position. It's identical to what is being done with multiple sources, but is limited to one source instead of 2 or more. Thus, I see this as the most logical way of looking at it. Since a single-source synth scenario is definitely more common than a 2+-source scenario, I find it odd that the 2+ is explicitly addressed by the policy while the single-source scenario isn't.

If a newcomer were to come to wikipedia and read these rules, this "scenario" would almost certainly intuitively seem like a case of original synthesis (although this might be bad for my rhetoric, I'm going to say it anyway: in fact, this was my experience). That is, if they were concerned with the spirit of the policies, not the exact wording (The spirit of the rule trumps the letter of the rule). Such an approach is reflective of the philosophy behind wikipedia's policies (ie work in progress, ignore all rules) - the rules are imperfect, nothing is ever complete, and change is always welcome. Charles35 (talk) 04:28, 12 January 2013 (UTC)

Something is needed because synthesis is combining fact A with fact B to conclude C—it is irrelevant whether A and B come from a single source or from multiple sources. The problem is this sentence in the first paragraph:

If one reliable source says A, and another reliable source says B, do not join A and B together to imply a conclusion C that is not mentioned by either of the sources.

The fix may be some careful rewording of the first paragraph as it would not be satisfactory to add the proposed text a pagedown away, particularly when the addition essentially rewrites the opening. Would the following work?

If a reliable source says A, and the same or another reliable source says B, do not join A and B together to imply a conclusion C that is not mentioned in one of the sources.

Johnuniq (talk) 23:36, 12 January 2013 (UTC)

Yes, I believe that would help. But the fact is that the wikipedia community considers WP:SYNTH to apply only to scenarios with 2 or more sources. While, of course, if we were to insert that sentence, the policy would technically include single source scenarios and you and I and the others who read this discussion can cite the rule in talk page discussions, which would "spread the word". But my goal is not to be able to use the rule myself, it's to make it easier for the community. The problem is that I don't think the community (ie the people that read the policy) will notice the change. I think that will only happen if we modify the policy to directly and explicitly address single-source scenarios by saying "synthesis is not limited to multiple sources" or "this policy includes synthesis from a single source".

That said, I agree with your idea that the addition should probably be to the first sentence or paragraph. I would support the addition you proposed. Would you like to be bold and add it to the policy? Charles35 (talk) 19:45, 13 January 2013 (UTC)

"not mentioned by" (or "in") should be replaced with "not attributable to". Should it be accepted if an out-of-context quote can be cited as the "mention" and rejected if it's unanimous that it could be attributed to the source but there isn't an exact word-for-word quote?--Brian Dell (talk) 03:25, 15 January 2013 (UTC)

In response to my comment in the RfC --Tryptofish (talk) 22:45, 14 January 2013 (UTC) :

Tryptofish - I hope you don't mind, but I think the "Survey" section should be left open for votes. Please keep comments in this section. I'm not sure I understand what you're trying to say... Are you commenting on the fact that I signed both support and oppose? I literally just copied and pasted this format from the RfC page, which included the signature. I didn't mean for those to seem like they are actual votes. You are supposed to sign your name after every line. But I can see how that might be confusing, so I will delete them.

I welcome any thoughts you have about my recent addition. As I said in the edit summary, feel free to revert or revise my edit. What exactly do you find wrong with it? The reason I changed what you wrote is because I didn't think it was very clear what you were trying to get across. I don't see the need for being subtle; it will probably just confuse readers of the policy. Charles35 (talk) 03:45, 14 January 2013 (UTC)

I realize you are a new editor, but WP:NOTAVOTE, so I put my response to the RfC back where it was supposed to be. I supported your recent revision of the sentence that I added at the end of the first paragraph of SYNTH. There's no reason that I would want to revert it. I think that it's possible that that may be sufficient to solve the issues raised in this RfC. I'm also receptive to going a little further, by stating explicitly that there can be SYNTH within a single source. Your RfC proposes specific wording to state that explicitly. I support the concept of what you are proposing, but I'm not prepared to support the exact wording that the RfC proposes. --Tryptofish (talk) 22:51, 14 January 2013 (UTC)

The reason I moved your comment is because I didn't see it as being particularly relevant to the survey section compared to any of these comments down in this section, with the exception of the confusion about my signatures. However, I have no problem with the comment being up there if you think that's better. I would like to change the title back to "survey" though, especially because I would like to follow the format laid out on the RfC guideline and because I think that it is important to designate an area that gives people the opportunity to vote. Thank you for pointing out that we shouldn't preclude discussion altogether in that section, and thus I think it can function as both a comment/vote section (and evidently, it is, considering Bdell555's comment).

That said, I do agree that technically the current state of the article has solved the issue raised in this RfC. I'm glad you agree that explicitness is important here. This is a moderate change to the policy and I don't want it to be overlooked. As I've said, my goal here is not to change the policy for personal use; I want to make sure that the community benefits from it. And I don't think that that will happen if we are not explicit in the project page. I think it's a good thing that you don't support the exact wording, because a second opinion on the wording is very important IMO in order to make sure it makes sense to everyone and not just me. What, specifically, do you think we should change, and what do you suggest we change it to? Charles35 (talk) 04:28, 15 January 2013 (UTC)

Thanks, that's fair. I'd suggest tightening it up to:

The definition of synthesis is not limited to material from more than one source. Material from a single source can constitute synthesis if multiple points, not treated as directly related by the source, are used to draw a conclusion not drawn by the source.

All I really did was make most of the same points while eliminating excess verbiage. (By the way, it's "synthesis", as opposed to "original synthesis". We do say "original research", however.) I could support that shorter version, and I could also support what is on the policy page now, with the new material about "A, B, C", in the event that other editors do not come to WP:Consensus for actually changing the definition. --Tryptofish (talk) 23:59, 15 January 2013 (UTC)

I would support that version. I think that most of the removals you made were useful and appropriate. But there's one that I'd like to ask for you to give a second thought - "taken out of context". I think that phrase would be useful for the policy. Do you see a place where it would fit in? If not, that's fine and I can see how one might consider it inexpedient. Anyway, the revision to "A, B, C" sounds good to me as well. It seems that 3 of us agree on that (you, me, Johnuniq). Charles35 (talk) 05:06, 16 January 2013 (UTC)

Good, thanks. We could do this:

The definition of synthesis is not limited to material from more than one source. Material from a single source can constitute synthesis if multiple points, not treated as directly related by the source, are taken out of context to draw a conclusion not drawn by the source.

That works for me. Let's give the discussion a little time now. If there emerges a consensus that it's OK to expand the definition, maybe we can use this, and if not, we still have the revision that has already been agreed to. --Tryptofish (talk) 16:59, 16 January 2013 (UTC)

It's starting to look to me like such a consensus is unlikely to emerge. --Tryptofish (talk) 18:03, 23 January 2013 (UTC)

Well, we addressed the concerns of both WhatamIdoing and Bdell555 by taking out "far apart in the source material". WhatamIdoing's concern about the community norms persists, but personally I don't see that as a strong argument and I think it fundamentally contradicts the philosophy of wikipedia (for example, ignore all rules, work in progress, and consensus can change). You, Johnuniq, and myself all agree with your proposed text. Thus, I don't think it's a lost cause. I don't think Bdell555 has anything to disagree about. Technically, if you want to count votes (although to be clear, this does NOT substitute for discussion) it's 3-2 in favor of the proposed change, and maybe 3-1 or 4-1 depending on Bdell555's status. Why don't we just let this continue playing out? Charles35 (talk) 19:33, 23 January 2013 (UTC)

Enhancements regarding the mother of all WP:OR and WP:Synth questions

The fact of the matter is that about 80% of all editing (including nearly all summarization) violates a rigorous interpretation of wp:nor / wp:synth (and by overlap, wp:ver as seen through the wp:nor lens). Yet, (let's say) 95% of the time, this policy works really well in this area. How can that be? And how can we improve the "5%"? And how can we provide better guidance here for newbies so that it doesn't take years to figure out how it really works?

The "how can that be?" is that this policy, perhaps more than any other, depends on the "fuzziness" that is needed and used to make Wikipedia work. I think that it was Rickover who said "If you're not exceeding your authority, you're not doing your job. Just don't ever be wrong." The Wikipedia version is "if you're not violating a rigorous literal interpretation of wp:NOR, you aren't doing the main job of adding material. But if someone objects in that area, particularly if your stuff is a greater "reach", your stuff is going to get deleted". The "fuzziness" is actually a neural net decision making process where the rules themselves are very influential in the result, but the "if someone objects" has immense weight, particularly here where normal practices are technically a violation of a rigorous interpretation of the rules. As is the case elsewhere in Wikipedia, most of that "5%", i.e. when the system doesn't work, is when the "objector" is not sincere in pursuing the objectives of Wikipedia and that objection is serving some other interest such as a POV effort or a pissing war. I have been thinking about this for 2 years and believe that the following would be two good general directions for efforts in this policy:

Develop a little more guidance on the line between summarization and selection-of-what-from-the-source-to-incorporate-into-the-summarization and WP:OR / WP:Synthesis. These should still acknowledge and utilize the "fuzziness" that makes Wikipedia work. And due to the heavy overlap between WP:nor and WP:ver, (and to a lesser extent, with WP:NPOV) these efforts should be concentrated on areas closely related to "putting things together", and inevitably on how summarization inevitably includes good and bad decisions on what to include/exclude....this to avoid creating entanglements with the core areas of wp:ver and wp:NPOV.
Encourage the norm that objectors / invokers include additional thoughts and concerns besides just quoting the particular rule. Then, such (or lack thereof) will help enable the next layer of the fuzziness/neural net that makes Wikipedia work (evaluating the merits of the objection beyond the rulebook/ using the subtle influence of the mere existence of another rule, wp:IAR) to work to a greater extent.

Sincerely, North8000 (talk) 13:45, 13 January 2013 (UTC)

You've just made a good argument for WP:IAR. It seems to me that the kind of guidance you discuss here is better suited to an essay than to a policy. --Tryptofish (talk) 01:31, 14 January 2013 (UTC)

It was kind of a 30,000' view. But I think that the point most relevant to the evolution of this policy is advocating more clarification on the line between acceptable summarization and unacceptable OR/Synthesis. North8000 (talk) 14:49, 16 January 2013 (UTC)

The fact of the matter is that you've been saying this for years, and nobody agrees with you. I'm pleased to see that you only think that 80% of Wikipedia is now in violation of these policies. You used to say 90%. NOR directly and explicitly requires summarizing and writing in your own words. WhatamIdoing (talk) 04:34, 23 January 2013 (UTC)

They were both guesses. I don't think that anybody except you has disagreed with me. :-) Mostly it just hasn't been discussed. Not that it's a crisis...It works most of the time. I've been learning that sometimes fuzziness is needed in Wikipedia and makes it work. Sincerely, North8000 (talk) 13:29, 23 January 2013 (UTC)

When you are concerned about a strict interpretation of NOR, do you mostly mean SYNTH? Specifically, do you worry that anyone could argue that placing two facts from different sources next to each other "serves to advance a position not advanced by the sources"?

I agree with what you say about challenging stuff - this is addressed directly in WP:V. Are you saying WP:NOR also needs a "challenged or likely to be challenged" type statement?

Yaris678 (talk) 17:58, 23 January 2013 (UTC)

No, he basically seems to mean that if you write in your own words (sufficiently in your own words that you aren't committing plagiarism and are producing an encyclopedic summary) that you are automatically violating the sourcing policies, because the specific phrases that you used aren't in the sources and because you used your own editorial judgment about what the most important points (in the source) or most relevant points (to the article at hand) were.

It is not a position that anyone agrees with. It's not even a position that North believes enough to follow in his own editing. WhatamIdoing (talk) 06:13, 29 January 2013 (UTC)

Why does Wikipedia prefer secondary sources to primary sources?

Please forgive me if this is the inappropriate place for this question.

As someone who checks Wikipedia frequently for scientific content, I have long considered the overuse of secondary sources to be Wikipedia's biggest problem; having finally checked the policy, I see that the problem is by design. I see studies cited with textbooks, magazines, etc., when this would be considered a serious failing in any academic writing. Citations allow me to check conclusions. Hiding the original source of information behind a hierarchy of secondary sources complicates that and, to me, is one of the warning signs that the author lacks authority. 98.203.173.56 (talk) 22:09, 31 December 2012 (UTC)

A secondary source is an original source. It is merely one that is somewhat removed from the event instead of being a part of it. See WP:PSTS for more on this (if you already haven't, that is). The reason that secondary sources are necessary both to establish notability and on which to base most of the article's content, is that if primary sources were allowed, then the personal website or blog of the subject of the article, or someone closely involved with it, could be used as a source. This would present the danger of introducing novel interpretations of information, and raise questions of objectivity and self-promotion. Secondary sources are needed because they involve generalization, analysis, synthesis, interpretation, or evaluation of the original information. There's more on this at the article Secondary source.

However, this doesn't mean you can't use primary sources, but you have to be careful about how they're used. When writing a BLP article, I'll use a primary (like the subject's own website) for stuff like where or when they were born, what their influences were, etc., but I will never rely on that for stuff like what awards they've won, and prefer not to for info on their upcoming projects, because of conflict of interest and WP:NOTADVERT issues. Happy New Year. Nightscream (talk) 22:19, 31 December 2012 (UTC)

This does not convince me that secondary sources are better; in fact, it seems to strengthen the view that primary sources are better. For example (using a type of subject that might be looked up), let's just say someone were writing about a notable public figure such as a politician or celebrity: a citation directly from the subject would be more reliable and less subject to (especially second-hand) reinterpretation or bias if taken straight from the source. Similarly, if I were looking up a scientific topic, I would want it to be based on the actual research (preferably peer reviewed) rather than a secondary summary; primary sources in academic journals are almost always preferable to secondary summaries in academic circles. (If Wikipedia actually insists on being a tertiary or further removed source, no wonder it's a laughing stock to most academics, even beyond the "anyone can edit" risks of finding a vandalized or otherwise bad source before it's corrected.) Not that I haven't used a lot of secondary sources myself in articles, but generally because it was easier, and I felt a little bad about it -- as a teacher I would never let my students get away with relying primarily on non-primary sources in a paper. 67.172.104.2 (talk) 23:42, 31 December 2012 (UTC)

Before I forget: Happy new year :) Thank you very much for your points. I can see now that there are many times that a secondary source would be preferable to a primary source. However, there are other times that a primary source would be not just acceptable, but preferable. Here's an example I noticed recently: http://en.wikipedia.org/wiki/Color_psychology#cite_ref-AEP_12-1 . In that case, the secondary source (it's a textbook, but acting as a secondary source there) is cited when the original research would be vastly preferable to anybody I can imagine accessing the page. The textbook provides proof of notability, and would be relevant to any discussion of notability (which would appropriately go on the talk page rather than the main page), but otherwise acts only to obscure access to the original source and to impugn the reliability of the information. But why does it matter? Because framing secondary sources as always preferable to primary sources affects editing and conversation. Based on Wikipedia guidelines, which apply reasoning appropriate to certain pages and sources to all pages and sources, I would not feel comfortable editing the page to replace the secondary source with a primary source, nor would I communicate to the authors (who have requested feedback on the talk page) that a primary source would be preferable to a secondary source in this situation, even though it would be in every publication I can imagine save Wikipedia. 98.203.173.56 (talk) 00:24, 1 January 2013 (UTC)

It is my general opinion that a significant defect of content editors on the English Wikipedia is their too-widespread misunderstanding of primary, secondary, and tertiary sources. (And I don't throw claims around like this lightly). It is my understanding that the current emphasis on secondary sources derives from the interest to keep people from originally analyzing stuff when they should be using secondary sources instead. But folks take this concept much farther than it's supposed to be and fall into the trap of thinking that secondary sources are by default preferable to primary sources when this is not true.

If I am writing an article about Aristotle, I should describe what he says in his works by citing his works, not by citing what someone else said he said in his works. Now, if someone does indeed analyze Aristotle's works so that we're dealing with content beyond what Aristotle said, then yes that's where we want the secondary sources. And that's what this policy is trying to encourage: don't analyze Aristotle yourself; cite reputable people that have already done so. Secondary information is valuable and really fleshes out articles to an encyclopedic level, but we mustn't forget that the primary sources are still the raw materials that we need to use as well. NTox · talk 02:06, 1 January 2013 (UTC)

I thank the IP editor for raising some thoughtful questions. Actually, something like a textbook is a tertiary source, not a secondary one. I note that the IP editor is largely asking about scientific content, where the primary literature includes a large amount of high-quality peer-reviewed scientific journals. For specific facts of the sort that the IP editor asks about, such primary journals are, in fact, reliable sources, and often are, indeed, better than textbooks or (other) encyclopedias. And they are also often better than secondary sources in the popular press, written for laypersons. But, on the other hand, consider what happens with medical information. A primary study comes out in a respected medical journal. It later is contradicted by a preponderance of further studies. In that case, prematurely citing the first study would have misled our readers, perhaps misleading them in something where there could be life-and-death medical implications! Thus, for situations like that, Wikipedia prefers secondary sources in the form of review articles in those same peer-reviewed professional journals. That way, we can avoid giving undue weight to misleading information when it's still preliminary, but we're still using high-quality scientific source material. --Tryptofish (talk) 17:16, 1 January 2013 (UTC)

67.172.104.2, when you say, "a citation directly from the subject", what do you mean by the word "citation"? Do you mean like a direct quote? In some circumstances, using such a primary source would be alright, I think. However, direct quotes can also be found in secondary sources. If it's a direct quote, then a reliable source is presumed to transcribe it accurately, without interpretation.

Similarly, I don't think there's anything wrong with relying on peer-reviewed science journals in science articles. I'm not even sure if that's considered a primary or secondary source, since the peer review process involves repeating research originally reported elsewhere, but I have recently been made to understand that editors who work on science articles on WP prefer actual research to popular sources like TV documentaries, because when I tried to add information in an article by citing a TV program, I was challenged by someone who showed me some guideline indicating that something of the caliber of research papers was needed. Nightscream (talk) 19:21, 1 January 2013 (UTC)

A minor clarification: peer reviewing of scientific manuscripts does not involve trying to repeat the research; that only comes after publication. Similarly, review articles in scientific journals do just that: review the already-published literature – not report new experimental results (such as trying to replicate the studies being reviewed). --Tryptofish (talk) 20:33, 1 January 2013 (UTC)

Wikipedia:No_Original_Research#primary source discourages Primary Sources for very good reasons, though more so in some cases than others -- see WP:Common Sense. The areas of history, social science, and literature must avoid drawing conclusions from Primary Sources, though they can be used for illustration or to add color. It is difficult to tell when something from the time of the event is actually based on whether the author viewed things which no modern author could see, whether they made it up, whether they misunderstood what they saw, or whether they later changed their minds. Using their testimony as a Primary Source would be like using the testimony of only one witness in a trial without cross examination and without further documentary or circumstantial evidence. Newspaper and magazine accounts from the time of the event are almost always Secondary Sources in any case, since the authors mostly did not see the event themselves. A true Primary Source would be a document or a record.

Later scholars have several advantages which compensate for the fact that they were not there. They can read the testimony of conflicting witnesses, use sources (such as documents and archives) which the eyewitnesses did not have access to, and must submit their findings to the judgments of others who have experience in the field. It is true that later scholars have their own points of view, but they have to convince readers who do not share it.

This is actually to agree with NTox that too many editors don't understand the uses of primary, secondary, and tertiary sources. I can see that the situation in scientific fields is different, but feel strongly that many history articles are marred by indiscriminate use of primary sources, especially ones which are now available online. ch (talk) 04:04, 4 January 2013 (UTC)

I'm still working my way through the archive (admittedly, where I should have begun). I see a lot of discussion on this, with any consensus apparently one of consensus-by-exhaustion. Archived discussion seems complicated by people apparently using "primary"/"secondary" in different ways. I'm going to avoid that by using more concrete terms.

I believe that there is a very simple and general rule that is an exception to the wording on WP:NOR, and I would word it thus: A quotation is preferable to a quotation of a quotation; a paraphrase is preferable to a paraphrase of a paraphrase. And, if it needs saying, reliable, published synthesis of multiple sources is preferable to analysis of single sources. I think this gets to the heart of WP's preference for secondary sources, while preventing the kind of overinterpretation that I, 67.172.104.2, and NTox were talking about. There is, of course, a fine and fuzzy line between a paraphrase and an analysis, but this is dealt with elsewhere on WP.

I still think a good article is a mix of raw data and analysis-- neither is of any use without the other-- and because of that, primary sources have a place alongside (not beneath) secondary sources, but what I see on Wikipedia is that editors don't need to be told that: they include information from the primary source at least as often as they need to, even if they sometimes attribute it, by quoting a quotation, to a secondary source (as in the example I gave above, and as in NTox's excellent Aristotle example, which shows this isn't a problem that can only occur in more sciencey articles).

With all the history on this, it's probably arrogant of me to believe that there could be consensus on the exception I mentioned, so I would appreciate any criticism. 98.203.173.56 (talk) 23:12, 4 January 2013 (UTC)

I think that primary sources are best for some points, secondary ones for others. Let's imagine, for example, that Roger Penrose came up with a new, groundbreaking mathematical proof, and the possible sources were Penrose's paper in Annals of Mathematics or a report of the discovery on the BBC. For the actual equations, we would trust Penrose's paper to be more accurate than the BBC. But for an analysis of the significance of the discovery, we would prefer the BBC. Because most arguments on Wikipedia are about nuances of significance and weight rather than about simple points of fact, secondary sources are more likely to resolve an argument and let us get back to editing, so there are good reasons why we spend so much time talking about secondary sources. But it's a misunderstanding to think that secondary sources are always better.—S Marshall T/C 01:37, 5 January 2013 (UTC)

For an analysis of the significance, we would look to other scientific papers and reviews, not to the BBC. The popular news media is not a particularly reliable source for analysis of scientific advances. — Carl (CBM · talk) 13:55, 2 February 2013 (UTC)

I agree with NTox and the IP. There are certainly cases where primary sources are important if not necessary (including the example NTox gives about an article on Aristotle). Thus, some of the wording of this section, which discourages/disparages primary sources, needs to be changed. —Fishicus (talk) 13:30, 2 February 2013 (UTC)

Two thoughts

The unregistered editor may want to read Wikipedia:Ten Simple Rules for Editing Wikipedia (originally a PLoS article).
In the end, we prefer secondary sources (or even tertiary sources) because our volunteers are less likely to screw up when using them. Our average volunteer is a kid in his 20s with no or very little formal training in the subject he's writing about. Focusing on secondary sources means that we make slightly fewer of the types of mistakes that give science journalists such a bad name. WhatamIdoing (talk) 04:45, 9 January 2013 (UTC)

"kid in his 20s" Many would argue that people in their 20s are not kids. Perhaps you a fan of Thomas Hardy: "He had just reached the time of life at which "young" is ceasing to be the prefix of "man" in speaking of one. ... In short, he was twenty-eight, and a bachelor." (Far from the Madding Crowd, p. 1) -- PBS (talk) 12:41, 10 January 2013 (UTC)

In the sciences and mathematics, the average editors are graduate students, professors, and others with a very good idea about the subject they are writing about. The idea that these articles are generally written by people who have no idea about the subjects they write is simply not correct. — Carl (CBM · talk) 13:51, 2 February 2013 (UTC)

To the original IP: we do not prefer secondary sources; primary sources are perfectly acceptable. For scientific experiments, we do wish that the original source was cited in each case, as explained at WP:SCG#Attribution. However, because we are a work in progress, and because it takes effort to look up the original sources, our articles don't all meet the goals we have for them. — Carl (CBM · talk) 13:51, 2 February 2013 (UTC)

I also interpreted this section as indicating a preference of secondary to primary sources. Particularly due to this part: "Wikipedia articles should be based on reliable, published secondary sources and, to a lesser extent, on tertiary sources. Secondary or tertiary sources are needed to establish the topic's notability and to avoid novel interpretations of primary sources, though primary sources are permitted if used carefully." If there is no such preference, I think this needs to be modified to make that clear. —Fishicus (talk) 14:49, 2 February 2013 (UTC)

We do have a preference for secondary sources (see WP:MEDRS for the most stringently enforced version), but we do not have an exclusive rule against using primary sources. Primary sources are acceptable for some purposes, and not for others. See WP:USEPRIMARY for details.

SCG does not say that we prefer primary sources. It says that when a paper is actually famous, or the original source of a major concept, then it's normal to provide a bibliographic citation to that paper "even if those papers were not originally used as sources in writing the article".

In other words, Dunning–Kruger effect ought to provide information about Dunning and Kruger's paper, but that doesn't mean that you get to cite that paper in other articles, or that you get to use primary sources to claim that some mushroom cures cancer on the grounds that you found a primary source that says some concentrated mushroom extract kills cancer cells in the lab. WhatamIdoing (talk) 23:48, 3 February 2013 (UTC)

Questions of this kind keep coming up, and the reason is often that editors have very different backgrounds and therefore incompatible but very strong ideas of how the division into primary, secondary and tertiary sources works. It would be great if someone could come up with good descriptive names that are not in common use outside Wikipedia. Hans Adler 23:58, 3 February 2013 (UTC)

Routine calculations

At WP:No original research#Routine calculations we see that it is ok (if consensus agrees) to use routine calculations that reflect sources. I am concerned about a recent edit (diff) that changed:

Basic arithmetic ... is allowed

to

Basic arithmetic ... is allowed [→Elementary arithmetic: +, −, ×, ÷]

Neither link is helpful: anyone needing the link to learn what "arithmetic" means should not be putting calculations into articles, and it is not satisfactory to use the link to specify what operations are permitted (are "routine calculations" +, −, ×, ÷ only, while square root and trigonometric functions are not permitted?). Is there a problem with removing the arithmetic link altogether? Or perhaps change "Basic arithmetic" to just "Calculations"? Johnuniq (talk) 00:22, 13 January 2013 (UTC)

The problem with being too prescriptive is that the rule-maker assumes that he has covered all possibilities (sometimes a rather arrogant assumption). One should concentrate on the rationale behind the rule rather than the rule itself. For this reason, I prefer the text:

"Routine operations ... is allowed."

This will allow the taking of square roots, use of trigonometric functions and the like in rare situations where such operations are appropriate. The emphasis is on the word "routine" (i.e. no original research). Martinvl (talk) 07:10, 13 January 2013 (UTC)

"Simple" situations are seldom as simple as they look. For example, every statement is also essentially a second statement that "RS's said that". (I have seen an example where even a "simple addition" statement wasn't, but a haze-deliberately-created-for-the-dispute made that "subtlety" ignorable, but the threshold to more complex math was understood as such.) And yes, someday we'll need to do better here at defining the border between normal editing/ summarizaion and OR/Synth. But I think that enumerated exceptions are a bad way to do it, and so I'd rather see them eliminated than expanded, and thus I oppose this expansion. North8000 (talk) 11:02, 13 January 2013 (UTC)

Example: (names and types changed to be cool) It is overwhelmingly sourced that Country "A" has an area of 10,000 square miles and Country "B" has an area of 12,000 square miles, and there is no wp:RS that lists their combined area. There is a movement that wants to consider them to be a single country, country "AB". In the Wikipedia "list of countries by area" they do "simple addition" and list the area of country "AB" as 22,000 square miles. The second implicit statement is that the source referred to country "AB" as an entity. In the haze deliberately created for the dispute, such a subtle objection (involving "only" an implicit statement) goes nowhere. But when they got to things like "average height above sea level" and "highest point in the "country"" it was pretty clear (even in the deliberately-created-haze) that such "one step up math/derivation" would be crossing the line. North8000 (talk) 11:39, 13 January 2013 (UTC)

Actually the "simple" addition in the example may not even be warranted, as the sources for country A, may use a different way to estimate the surface area than those for country B (e.g. by dealing with lakes, islands, coastal areas, swamps and rivers in a somewhat different way). The "simple" addition in that case would result in a total that would not be the outcome of any type of (single) estimation method.

In other words, a routine calculation can only be performed on measures that are identically specified. This is often the case for mathematical and physical measures, but as soon as social issues (as in geography) come in, this is no longer straightforward.

So actually in the example, the overwhelming sources should not only provide the numbers, but also the overwhelmingly sourced evidence that areas are estimated in an identical manner (which I somehow doubt).

I think this is an excellent example where the rationale behind this policy would warn against even seemingly trivial calculation (while for pure mathematics far more complex calculations may still be fine). Arnoutf (talk) 12:06, 13 January 2013 (UTC)

Yes. For the simplification in my example, I should have clarified/emphasized that all of the sources had the exact same 10,000 and 12,000 numbers. But that does not in any way take away from your excellent points. North8000 (talk) 12:14, 13 January 2013 (UTC)

Whether something is a routine calculation or not needs to be decided in a given context! Explicit enumeration of some mathematical operations is no doubt causing more problems than it solves. --Kmhkmh (talk) 12:31, 13 January 2013 (UTC)

Agree. And it really isn't doing much good either. We should delete it. North8000 (talk) 12:52, 13 January 2013 (UTC)

I generally am expansive on this point, because it's rare that anyone seriously proposes a completely inappropriate mathematical calculation. I don't think that we should be specifying that "+, −, ×, ÷" are good (they may not be) or implying that anything else is not. But I don't particularly mind "Basic arithmetic". I think we should also include "simple descriptive statistics, such as identifying the range (statistics) from a list of values". WhatamIdoing (talk) 04:31, 23 January 2013 (UTC)

We've had this discussion a few times! I usually weigh in to point out that for articles within the scope of Wikiproject Mathematics, you can't provide a "sourced" calculation without copy/pasting or literally typing out a calculation from a textbook or paper (with perhaps a few cosmetic changes). This violates our copyright rules. So what I always say is that WP:CALC should allow high school-level calculations in most articles and undergraduate-level maths for articles about maths or certain aspects of physics. Then I usually give you what I call "Blueboar's Law" because it came from a previous discussion in which Blueboar made a good point. It's Blueboar's Law but the wording is mine: "It's not original research if Euclid proved it in 300BC."—S Marshall T/C 09:09, 29 January 2013 (UTC)

Wow... I have a Law? Thanks! (is this where I should say it's more of a rough guideline than an actual law?) :>) Blueboar (talk) 04:30, 16 February 2013 (UTC)

OR:

Who exactly is this reliable public source that you are referring to? A university, GOD perhaps, or is that someone a bit less schizophrenic than yourselves and the methods used to create these pages?

You definitely need to clarify, because ALL reliable sources have de facto copyrights associated with them, even when those sources are published or have a copyleft or public license.

All that, besides that little stickler that most sources are temporary sources, there being many a dead link on each and every Wikipedia page.

Do you consider yourselves a reliable source, there being many self-references to other pages?

What about considering a de facto NGO, the Library of Congress, a university/college, those that maintain THE reliability, as the only true reliable source depositories?

How about some ISBN numbers, or hasn't any of your clientele ever written a book report, term paper, or thesis? — Preceding unsigned comment added by 186.94.187.76 (talk) 14:12, 22 February 2013 (UTC)

See WP:RS, WP:CIRCULAR, WP:ISBN, WP:DEADLINK, and WP:GOD, which between them address your questions here. --Demiurge1000 (talk) 14:47, 22 February 2013 (UTC)

Schizo this:

Is GOD, or ALLAH, a reliable source?

If so, and you talk on a daily basis with aphrodite, mercury, zeus, hermes, and ehhhhh, loki I suppose, would you consider that original research, and/or would you consider that from a reliable source? Just asking, after all, the gods are maintainers of themselves, quite self-referential, although without ISBN number. — Preceding unsigned comment added by 186.94.187.76 (talk) 14:32, 22 February 2013 (UTC)

GOD may be a reliable source, but SHE does not want HER words screwed up by an average wikipedian, so SHE talks directly to YOU. However YOU are not a reliable source, because we have no idea whether GOD talked to YOU or YOU are simply messing with US. Staszek Lem (talk) 23:42, 22 February 2013 (UTC)

"...that can be verified by any educated person with access to the source but without further, specialized knowledge"

Is it correct to assume that an ordinary user of English Wikipedia is able to read only English sources? If a primary source is written not in English, can we assume some specialised knowledge is needed to verify a statement supported by such a source?--Paul Siebert (talk) 16:53, 25 February 2013 (UTC)

Use of other languages is in accordance with the Wikipedia rules as stated in WP:OR and spelled out more precisely in WP:NOTOR - Wanderer602 (talk) 17:21, 25 February 2013 (UTC)

Unfortunately, WP:NOTOR is just an essay. It is neither a policy nor a guideline...--Paul Siebert (talk) 17:32, 25 February 2013 (UTC)

That is why I didn't refer to it as such, only to the actual ruling made on WP:OR - "Faithfully translating sourced material into English, or transcribing spoken words from audio or video sources, is not considered original research." - Wanderer602 (talk) 17:35, 25 February 2013 (UTC)

You didn't answer my question. I am talking about the usage of an untranslated foreign language primary source. Does the policy say that the descriptive claim supported by a foreign language primary source can be verified by any person without specialised knowledge?--Paul Siebert (talk) 17:40, 25 February 2013 (UTC)

That seems to be exactly what WP:OR discusses, see: "Faithfully translating sourced material into English, or transcribing spoken words from audio or video sources, is not considered original research." Continued furthermore in Wikipedia:Verifiability#Non-English_sources - "Translations published by reliable sources are preferred over translations by Wikipedians, but translations by Wikipedians are preferred over machine translations." - Wanderer602 (talk) 18:00, 25 February 2013 (UTC)

It is important to note that "any editor" does not mean "any specific editor"... While the average reader/editor may not be able to read a non-English source, they have easy access to other readers/editors who are able to do so (Wikipedia has language help desks for just this purpose). In other words, while editor X may not be able to read Foobarian, he can go to the Foobarian language help desk and ask editor Y (who does read Foobarian) to check the source and see if it supports the information in the article... or if the article contains OR. Blueboar (talk) 13:25, 26 February 2013 (UTC)

Agreed.--Paul Siebert (talk) 19:51, 26 February 2013 (UTC)

On ""facts by omission" points"

During some discussion a user expressed an opinion that when a secondary source discussing some topic in detail does not mention some event, we can use such a source as support for the statement that this event didn't happen. In my opinion, it directly contradicts the policy. Am I right?--Paul Siebert (talk) 17:45, 25 February 2013 (UTC)

The issue relates to fighting at a certain spot on the Karelian Isthmus in 1941. Certain Soviet/Russian sources claim that, at a specified time, there was fighting at a certain place, with the location in question even changing hands repeatedly. However, on the Finnish side, the secondary sources that discuss the fighting in that area in detail say nothing about the event around the time it supposedly took place. There is equally little (nothing) in primary sources (i.e. the official war diaries or tactical maps) of the Finnish units facing the area, but these have not been used for the claim.

I would also hope for an opinion on how such an event, with purely one-sided information, should be handled in light of WP:NPOV, mainly because the demand for a source that states that nothing took place at some arbitrary spot at a specified time does seem like circular reasoning to me: if nothing took place, then it would be highly unlikely for there to be any records of the nothing in question either. - Wanderer602 (talk) 18:14, 25 February 2013 (UTC)

You can't conclude that nothing happened. Leave that to secondary sources.

The text of the rule is clear: Even with well-sourced material, if you use it out of context, or to advance a position not directly and explicitly supported by the source, you are engaging in original research.

Also, NPOV is not an excuse for OR. -YMB29 (talk) 18:28, 25 February 2013 (UTC)

This is exactly about the secondary sources. And the question I laid out above is not the same as the one you described. In addition, a claim of OR does not allow NPOV to be ignored either. - Wanderer602 (talk) 06:59, 26 February 2013 (UTC)

How many users have to tell you that it is OR before you understand?

It is OR and you can't excuse it with NPOV. -YMB29 (talk) 19:10, 26 February 2013 (UTC)

Question... what article is this related to? Remember that a Wikipedia article is an overview of its topic... it can not (and should not) try to cover every tiny detail. We always have the option to omit verifiable information if that information is deemed irrelevant or trivial to the overall topic of the article. So, the first question to ask is... does the specific article in question really need to mention that there was/was not fighting at this specific spot on the Karelian Isthmus in 1941? Blueboar (talk) 13:09, 26 February 2013 (UTC)

No, I do not see the information from either side regarding the events at the site as relevant to the article. The article covers the whole of the Continuation War from June 1941 to September 1944 - which can be understood as having been a part of the Eastern Front of WW II. - Wanderer602 (talk) 13:28, 26 February 2013 (UTC)

It is relevant because the event is evidence that Finland was not only taking back its territory and stopped at the old border, as claimed in the same section of the article. -YMB29 (talk) 19:10, 26 February 2013 (UTC)

That claim you made is actually wrong, since the article is quite clear that the Finns had already crossed the border, without any statements with regard to N. Beloostrov: "On 31 August, Finnish HQ ordered the offensive to halt at a straightened line just past the former border.[63]" - Wanderer602 (talk) 19:46, 26 February 2013 (UTC)

Taking N. Beloostrov suggested that the Finns intended to advance much further and go around the fortified defensive line. -YMB29 (talk) 04:16, 27 February 2013 (UTC)

How so exactly? N. Beloostrov is located on the shore of the River Sestra, which is the river along which the border line ran in 1939 - that does not appear to be 'much further'. - Wanderer602 (talk) 10:10, 27 February 2013 (UTC)

Blueboar, although my question was dictated by a very concrete discussion, actually, it is general. I think I have to rephrase it.

"The policy says: The term "original research" (OR) is used on Wikipedia to refer to material—such as facts, allegations, and ideas—for which no reliable, published sources exist. That means that an allegation, idea or fact must be explicitly present in a reliable source, and that is a necessary condition for adding such material to Wikipedia. However, let's consider a situation where a source does not mention some event, although it should have. Obviously, to every reasonable person it is clear that, according to that source, this event never occurred. However, such a conclusion is made by a reader of the source, not by the author himself. In other words, the source supports such a statement only indirectly. In connection with that, I think that "facts by omission" arguments violate our policy and should be explicitly prohibited."

--Paul Siebert (talk) 20:05, 26 February 2013 (UTC)

I think there is no need for that. Every single source omits more than 99.9999999% of all facts, and reflecting on those would either be original research, or something like a conspiracy theory (covered under WP:Fringe), where omitting facts (or denying facts) is considered proof of these omitted facts.

In the specific case under discussion, where the Finnish sources and the Soviet/Russian sources report different facts on a war between the countries, the reliability of both sets of sources (secondary or not) may be doubted because the sources may not be neutral (covered in WP:NPOV).

So in my view the different "facts by omission" arguments I can think up are already adequately covered, albeit in different policies. So there is no need for additional explicit examples. Arnoutf (talk) 20:35, 26 February 2013 (UTC)

Non-investigative journalism is often tertiary

Presently we have this:

Tertiary sources are publications such as encyclopedias and other compendia that summarize primary and secondary sources. Wikipedia is a tertiary source. Many introductory undergraduate-level textbooks are regarded as tertiary sources because they sum up multiple secondary sources. Another example is obituaries.

(I added the last sentence, from the section at WP:RS on the same topic, since this fact was missing here.)

I propose adding "rehash" journalism (i.e. that which is not investigative, interview-based or editorial in nature), something like so:

Tertiary sources are publications such as encyclopedias and other compendia that summarize primary and secondary sources. Wikipedia is a tertiary source. Many introductory undergraduate-level textbooks are regarded as tertiary sources because they sum up multiple secondary sources. Another example is obituaries. Journalism that is not investigative, interview-based or editorial is usually tertiary, repeating information (not always accurately) from other sources without citing them in detail if at all; examples include summary snippets of celebrity gossip with uncertain attribution, most articles in children's and young-adult publications, and articles of an introductory or basic nature in many periodicals, such as pet fancier and science popularization magazines. It is preferable to use such tertiary sources as clues to finding the underlying and more reliable secondary and/or primary sources, rather than citing tertiary ones directly, as they are often questionable.

— SMcCandlish Talk⇒ ɖ^⊝כ^⊙þ Contrib. 14:14, 7 March 2013 (UTC)

Disagree with classifying obituaries as tertiary. Often obituaries are primary, secondary and very often simply unique sources about scientists, who do not hit multiple headlines, unless they are Nobel prize winners or student rapists. I would agree that a paragraph-length newspaper obituary is a bit of formalism. But obituaries about scientists published in scientific journals are extremely reliable and well researched. Staszek Lem (talk) 19:01, 7 March 2013 (UTC)
I don't believe that "rehash journalism" is normally considered tertiary. It's usually built on primary sources, and so it usually produces (low-quality) secondary sources. WhatamIdoing (talk) 00:16, 6 April 2013 (UTC)
I should also say I don't think obituaries are always tertiary; in fact I often see them as secondary, because they attempt to draw broad conclusions about people from little details of their lives. 'Rehash' journalism is harder to classify because in some cases I have seen it as tertiary, but I think it is usually secondary. Low quality, to be sure, but secondary because the rehash journalist has considerably framed the content to be short and interesting; to fit a 'rehash style', if you like. Tertiary writing is more about attempting to describe the consensus of thought on some issue. NTox · talk 03:41, 6 April 2013 (UTC)

Simple math operations ?

A question of principle. Example: a valid, undisputed source states, for instance, that "in 1930 one US dollar was equal to two of another currency". Is stating "the other currency equaled 50 US cents in 1930" then considered original research? I fail to see that, in this particular example. Or suppose a valid, undisputed source states the number of inhabitants of a certain country and the size of that country in km², and someone then figures out the population density of the country in question. It may seem simple, and perhaps stupid (?), to consider that one single basic math operation equals "original research". But if a (still simple) series of mathematical operations is involved (like a long list of additions and subtractions followed by a division, or even further "side operations" to decide the additions or subtractions before determining each element of the tally), where should the limit be drawn for becoming "own research"? I'm not talking about math as a subject, but about the use of math as a source in other articles, as in my examples above. I cannot find any real guidelines for when the use of simple math operations turns into "own research". Grateful for comments. Boeing720 (talk) 08:26, 4 April 2013 (UTC)

I would put conducting a string of simple calculations under the classification: Potential Original Research. The more complex the string, the greater the potential for calling it OR. Also, a lot depends on the context of the calculation. A given string of calculations might be acceptable in one circumstance and not acceptable in another. Essentially it comes down to this... if someone questions the calculations, you should then go to the talk page, explain exactly what you did and why, and achieve a consensus that it was acceptable given the context in which it was used. Blueboar (talk) 02:47, 5 April 2013 (UTC)

Also, combining numerical data to make a point that the source does not make could be a violation of WP:SYNTH. Zero^talk 03:46, 5 April 2013 (UTC)

I thank you both for this information. When it comes to logic, I would like to raise the following question (taken from another wiki some years ago). In the article Columbo (the famous TV detective), under a headline "attributes", his car, his never-appearing wife, his dog called just "Dog", etc. were already mentioned. I added that Columbo also has (or has had) a sister, and referred to the episode "No time to die", where Columbo attends his nephew's wedding. The male nephew had a different name, and hence I concluded and added "Columbo has a sister as well", referring to the episode in question. (The sister is actually mentioned at least once in a different episode, but that's beside the point, and was not to my knowledge at the time.) Someone else rejected my small edit as OR. I've studied the WP:SYNTH page, but still wonder whether my contribution really was OR or not. This real example may not be of any importance in itself, but as a matter of principle it is of larger interest. All examples at WP:SYNTH give small, but still different, meanings and are rather easy to understand. But they do not quite cover what I would call "simple logical issues". Boeing720 (talk) 06:22, 5 April 2013 (UTC)

I think there are two facets of this. The first facet is the question of whether deriving an obvious fact is "OR", and for the examples cited, clearly it isn't. If I have a source that says the diameter of a planet is 40,000km and I put in the article that its radius is 20,000km, then I haven't performed original research. Such trivial mathematical operations are covered under WP:CALC. Likewise, if I say that Poirot has a great-aunt in Harwich (he probably doesn't but stipulate this for the sake of argument)... if I say Poirot has a great-aunt, then provided I cite a page of an Agatha Christie novel on which it says "Poirot had just returned from visiting his great-aunt Petunia in Harwich", then I haven't performed original research. I've pulled a fact from a primary source.
But the second facet is whether the fact I have derived actually belongs in the article at all. If I tried to add the business about Poirot's great-aunt to an article about squirrels, then it would be removed. If I tried to add it to our articles on Agatha Christie or Harwich, then it should be removed after a brief discussion. If I tried to add it to our page on Poirot then it would probably be removed after a longer discussion. Such a trivial fact might belong in the article about the novel, which is the level at which it's relevant, but even then, only subject to talk page consensus.
The point here is that articles should be a relatively brief summary of their topic, and while added detail is good, low-level and trivial detail may not be desirable, true though it undoubtedly is.—S Marshall T/C 07:52, 5 April 2013 (UTC)

And back to the Columbo example. If we follow the new LGBT legislation coming into practice in an increasing number of countries, the logic based on the observation "nephew's wedding. The male nephew had a different name, and hence I thought and added 'Columbo has a sister'" could also lead to the conclusion "Columbo has a brother" (who married another man). Not likely in the time frame of Columbo, but nevertheless there is nothing necessarily logically wrong with the argument, besides it probably being outside the frame of reference of the person making the logical deduction. And that is exactly why we should be extremely cautious, even with the most trivial-seeming logical inferences (i.e. synthesis), as logic is confounded by the frame of reference of the person applying the logic. (Note that the famous inference/synthesis of Einstein that light is both a wave and a particle was far outside the frame of reference of the logic in physics at the time.) Arnoutf (talk) 08:36, 5 April 2013 (UTC)

Synthesis and definitions

It might be helpful to include some discussion of the role of definitions in deciding what is and what isn't WP:SYNTHESIS. An example has come up in a discussion on meta-ontology (a rather abstract topic) so I'll try to put forward a much simpler example.

Suppose an editor wishes to insert the assertion into an article that "black is a color". An editor opposes this assertion on the basis that no source is provided for this assertion, so it is WP:SYN (that is, it is a conclusion not explicitly stated by any source). Although the issue might be resolved by finding a source that makes this assertion, let's suppose for the sake of argument that no source can be found. Here is the question: if the definition of color obviously includes 'black' is the definition sufficient to escape the charge of WP:SYN?

In this case, at least according to one definition, it appears that color is a property of light, and the absence of light (black) is therefore not consistent with the definition of 'color'. This particular assertion is therefore indeed WP:SYN (unless a different definition can be found). So the example is flawed, but it may serve to illustrate the question:

If an assertion that "A is an example of B" falls within the definition of B, is that sufficient reason to escape WP:SYN, even although no source is provided to assert this claim?

Brews ohare (talk) 16:37, 11 April 2013 (UTC)

My answer would be: Sometimes....

The problem is that in some cases (especially where neat categorisation does not hold, and it hardly ever does) the logic is not as obvious. For example, a long-held definition of an animal species is that if animals cannot produce fertile offspring, they are different species (like horse and donkey producing the sterile mule). Following this logic and looking at a specific bird (sorry, I lost its name) living in arctic regions, this would result in an odd conclusion. The bird lives around the world, and the families living in Scandinavia cannot breed fertile offspring with the Alaska variant (conclusion: two species); however, the Scandinavian and Greenland birds go fine together, as do the Greenland-Canadian ones, and the Canadian-Alaskan, Alaskan-Siberian and Siberian-Scandinavian (conclusion: all neighbours are the same species, hence the whole population is one species...).

BTW, another definition of colour is the perception of..... which allows inclusion of black, as perception is sensory registration of a physical phenomenon interpreted against existing associations. And through these associations the absence of light gets a meaning for people as a colour (i.e. black) or as darkness (which is not a colour). Arnoutf (talk) 17:19, 11 April 2013 (UTC)

Arnoutf: You raise the issue that definitions are not always straightforward, and of course debate may ensue. What strikes me, however, is that it can happen that the definition actually is clear. In such cases, one can avoid bickering between editors if WP:SYN states that where the definition of B clearly includes A, then WP:SYN does not mean a source must be found.

If WP:SYN is amended in this way, it could happen that differences arise instead about the definition but, even if that happens, that discussion actually is useful because it shifts attention to ‘what is A an example of?’ Does A exemplify B₁, B₂, or B₃? That possibly shifts focus in a fruitful direction for a WP article. What do you think about this suggestion? Brews ohare (talk) 19:33, 11 April 2013 (UTC)

True. The problem is who decides when a definition is clear. Therefore, I think we should not change the policy for this. I think it would be extremely difficult, if not impossible, to contextualise this in an unambiguous way in the guidelines. My suggestion would be that this should usually be resolved by consensus on the talk page of the relevant article (this of course assuming good faith, but without that assumption we might as well stop). Arnoutf (talk) 20:24, 11 April 2013 (UTC)

I think it can get tricky to provide specifics of this sort, because it gets difficult to cover all possible cases, and anything we overlook could be an invitation to Wikilawyer. Perhaps it would be a good idea to create an essay that goes into these kinds of details about synthesis. --Tryptofish (talk) 20:35, 11 April 2013 (UTC)

Tryptofish: It is very difficult to get any agreement on changing a WP policy, so perhaps an essay is the only thing left to do. IMO that is unfortunate. WP needs policies that encourage collaboration, not wikilawyering, as you point out. That was my intention here: to simply state that where a definition is clear "A is an example of B" is OK with WP:SYN. It could happen, of course, that arguments over whether a definition was satisfied would get out of hand. One could take an experimental approach: try the change and see if it happens that way. If WP policies remain inflexible because of fear of changing them, they will not evolve to improve collaboration, but will stress instead easy enforcement as a greater good than making sense. Brews ohare (talk) 12:57, 12 April 2013 (UTC)

Related to this: scholarship of the best quality is often addressed to an audience of specialists, and the author assumes a shared knowledge base. The role of the WP editor is to make that information accessible to our general readers, who may come to the article with no knowledge at all. So it's sometimes necessary to use one scholar to explain a point in another who simply assumes his intended audience would know that. To me, that isn't synthesis unless it points toward a conclusion not intended by either or both; that is, unless putting the two together results in OR. But if you're in a content dispute, an editor arguing against you can choose to call it synthesis as an obstructionist or deletionist tactic. What results is often text that doesn't serve the reader: a hodgepodge of exegesis in the "he said, she said" mode that resembles the first chapter of a dissertation, not an encyclopedia article. Synth policy is supposed to prevent OR, not the seamless compilation and good writing that produce reader-friendly articles. I'm just offering this as a perspective; I'm afraid I have no proposal for a solution. Cynwolfe (talk) 15:39, 12 April 2013 (UTC)

Cynwolfe: Maybe an example of this "obstructionist" tactic is the misuse of WP:SYN to block a simple statement of "A is an example of B" where B clearly is defined to include A. This can happen for a variety of bad reasons, like personality issues, but it also can happen because the two editors have different background contexts. The result of using WP:SYN to require a source in such a case is escalating discord. One editor says a source is needed while the other says it's obvious and no source is needed. You might expect the two editors to eventually realize that a definition is behind all this, but it often happens that each assumes the other is simply misinformed (or, more bluntly, an idiot). When tempers rise it becomes a case of simply hiding behind the policy and not trying to find out what is the matter. Brews ohare (talk) 19:24, 12 April 2013 (UTC)

But that is not what you are doing, Brews. Over multiple articles you find a word or phrase that is used, then you do a Google search and expand on that phrase regardless of the original subject. You are making the decision to add material without any actual source which says the material is relevant. You've been rejected by other editors on the two RfAs you have raised (without any support), which includes picking up on your various COATRACK efforts. When someone explains something to you, we then get the strawman restatement of the problem (as we see above) rather than dealing with the issue. Anyone who opposes you on an RfA is subject to multiple and extended arguments and restatements of positions already answered. It is exactly the same behaviour that won you first a temporary and then a permanent ban from editing articles to do with Physics ----Snowded ^TALK 20:05, 12 April 2013 (UTC)

Snowded: I am sorry you have so many complaints about me, but they have nothing to do with this discussion whatsoever. Brews ohare (talk) 21:47, 12 April 2013 (UTC)

You opened this discussion with a reference to Meta-ontology and restated the issue there so that is the context of my response. You are trying to get an abstract statement agreed that you can then interpret on articles. I happen to believe commenting editors need the full context. ----Snowded ^TALK 05:34, 13 April 2013 (UTC)

The origin of this suggestion was a discussion on Talk:Meta-ontology as I pointed out, but the question stands on its own outside this context. Bob K31416 has pointed out to me that the present discussion is better phrased in terms of WP:OR itself, rather than WP:SYN. Brews ohare (talk) 17:13, 13 April 2013 (UTC)

Proposed Sub-section for WP:NOR

As an evolution of the discussion of an earlier thread, I'd like to propose discussion of the following addition to WP:NOR:

Definitions

A particular example where the question of original research can arise is in the case of definitions; as an example, consider the case of a statement like "A is an example of B". A particular instance might be "black is a color". The question arises whether a reliable source is needed that explicitly makes such a statement, or on the other hand, whether it suffices to point out that the definition of B clearly does (or does not) include A.

One might expect two editors who disagree to realize eventually that a definition is behind their disagreement. But it often happens, especially where different backgrounds are present, that each editor assumes the other simply is misinformed (or, more bluntly, an idiot), leading to rising tempers. When tempers rise it is a temptation to invoke a violation of WP:NOR, which diverts attention from trying to find out what is the matter, and might impede the discovery that definitions are involved. Where definition is involved, attention should be directed to establishing the definition(s) of the term, and WP:NOR should not be invoked.

Evidently, it can arise that editors disagree about what a definition contains. Establishment of a definition should be based upon WP:RS, and should not be unnecessarily parochial (that is, where a variety of usages exists, choice among them should be discussed in the context of the topic at hand).

Comments

I believe such an addition would encourage a collaborative interaction among editors by removing the 'easy out' from a proper discussion of definitions by peremptory use of WP:NOR that abruptly stops what could be useful discussion and could lead to a better WP presentation. Brews ohare (talk) 17:51, 13 April 2013 (UTC)
I believe that such an addition would encourage trivia and occasionally serious wikilawyering by people who are violating the spirit of this policy for the purpose of introducing novel connections between subjects that reliable sources have considered to be unrelated. I therefore oppose this proposed addition. WhatamIdoing (talk) 21:13, 13 April 2013 (UTC)
I don't see how this helps at all. WP is based on sources and the proper approach when something appears unsourced is to ask for them. This is true of definitions as much as anything else. And one way to ask for sources is to invoke this policy, using perhaps the shortcut WP:NOR. It's not the only relevant policy, but it is a core one covering all articles so is often invoked. And suggesting editors are bad tempered or abusing each other over this is entirely unnecessary. Editors can get annoyed for all sorts of reasons, but it's unlikely to be over whether a definition is properly sourced. So also oppose.--JohnBlackburne^words_deeds 21:37, 13 April 2013 (UTC)

John, the suggestion made here that you seem to have overlooked is that definitions be supported by WP:RS. I think that covers your objections. The problem with a general referral to WP:NOR as you prefer is that it is not as specific as (for example) citing WP:NOR with a new specific short-cut to a new subsection say (possibly) WP:DEFINITION. The lack of specificity in simply using WP:NOR leads (in my opinion) to the likelihood of a less successful resolution and a less successful WP article. Brews ohare (talk) 22:35, 13 April 2013 (UTC)

OR to disprove

A source may be reliable, verifiable, and wrong. I've long wondered whether it is permissible to use blatant but replicable OR to remove flawed but correctly sourced information from an article. A few purely hypothetical examples to illustrate:

A non-technical source makes an incorrect calculation: confuses bits for bytes, ignores rules for combining probabilities, fails to use a physics formula.

Sources from a number of reputable news organizations quote a sound bite from a speech made by a Nicaraguan politician. A Spanish-speaking editor views a video of the speech, and notices that this is a mistranslation.

A source states that a certain word appears in a certain document a certain number of times. The definitive document is available online, and a search for the word comes up with a close but different number.

Let's assume that in each case there is no other reliable source that has spotted the mistake and published the correct version, and that there is a unanimous consensus (itself a situation to be defined...) among all editors who claim to understand the issue or to be able to verify the error. I see three potential courses of action: 1: The reliable source stands. OR has no place in Wikipedia, and the incorrect information stays in the article. 2: The information is removed and not replaced, at least until a reliable source can be found that expresses it. 3: The incorrect information is substituted with correct information, as determined through the OR.

None of these options seems ideal. I'd be inclined to discount the first, although I imagine some editors might take such a hard line view. The second is perhaps the safest option, and it's the one that I would generally prefer. It would make sense for a piece of spurious information to find its way into a high quality source because it is not particularly relevant to the core of the article in question; and that is probably going to be in line with the core of the Wikipedia article using it as a source. Of course, that is using OR to affect the content of an article. Finally, I wonder whether there might be cases where the third option is justifiable.

In any case, I would be very grateful for any comments on how this sort of situation ought to be handled. 201.230.147.37 (talk) 09:04, 6 March 2013 (UTC)

NOR only prevents you from presenting original research within the article proper. This policy is by no means a prohibition against using original research to decide what material to present, and what material to omit. Ultimately, it is up to a consensus of editors to decide whether a particular source is reliable, or a particular fact significant. Someguy1221 (talk) 09:34, 6 March 2013 (UTC)

This is my take on the issue. I don't know if everyone will share this, but I think it's a relatively common view:

The importance of rules around here is WP:CONSENSUS > WP:IAR > WP:NOR/WP:V/WP:NPOV. In this case, if there is a consensus, then we can ignore all rules, which in this case means ignoring no original research. As long as there is a consensus, you can do any of those options. The larger the number of editors coming to the consensus, the more accurately it will reflect Wikipedia's goals, probabilistically speaking. The rules themselves were agreed upon through consensus. They might be wrong sometimes and they might have flaws. But in those cases, as long as there is a consensus, you can ignore all rules. Charles35 (talk) 23:18, 6 March 2013 (UTC)

You would need a really really really good reason to justify ignoring WP:NOR... it is one of our core policies after all. The key here is that we are allowed to discuss OR on talk pages... and we are allowed to make judgement calls on what verifiable information to present in an article. So... if you think something an article says is incorrect, and you can demonstrate this through your own Original Research... you may discuss that OR on the talk page in an attempt to convince others that the (arguably) incorrect information should be omitted (whether that attempt will be successful is another matter entirely). What you may not do is replace the (arguably) incorrect information with your OR.

To illustrate: Suppose a source says Joe Blow was born in 1872 ... you have OR information that indicates he was actually born in 1875. You can present your information on the talk page, and discuss these dates. If the consensus agrees that your information casts reasonable doubt onto the sourced 1872 date, we can come to a consensus to remove the 1872 date. However, we cannot take the next step and replace it with your 1875 date, as that information is not (yet) verifiable. You would need to get the 1875 date published before we could add it. Blueboar (talk) 02:33, 7 March 2013 (UTC)

It seems that this is a fairly probable situation. I was accused of OR on a talk page in exactly this situation: an otherwise reliable source uttered nonsense in an area outside its expertise, but I was prohibited from removing it. My proof on the talk page (based on books) was called "OR" and ignored. I DGAFed at the time, but since other people have the same issue, I guess a phrase to this end must be added to the policy. Staszek Lem (talk) 18:57, 7 March 2013 (UTC)

That's just the thing. Wikipedia is about the only source out there where a single article can be co-authored by experts in widely different fields. The sources that we are using can be authoritative and provide vital contributions but may include misunderstandings in tangential areas. According to Someguy1221 it is OK to use OR on a talk page to point out such a mistake, but I too have come across editors objecting to that under this policy, so it would be nice to have that made explicit. 201.230.220.97 (talk) 07:24, 8 March 2013 (UTC)

This is the same argument that has been coming up for as long as WP:NOR has been a policy. But what you need is already right in the first sentence: Wikipedia articles must not contain original research. A different line of argument that leads to the same conclusion: content is required to be verifiable, but verifiable content is not required to be published. Someguy1221 (talk) 07:29, 8 March 2013 (UTC)

NOR applies to articles, not their talk pages. Perhaps this should be mentioned explicitly in WP:NOR. Think about it: a talk page consists mostly of the opinions, ideas, and analyses of Wikipedia editors which you won't find in a reliable source, i.e. a talk page predominantly consists of OR.

If it's clear that info from a reliable source is wrong, then it shouldn't be used. If it's not so clear that it is wrong, then editors should use their best judgement. In any case, if there are objections to an edit, then consensus should decide as usual. --Bob K31416 (talk) 16:00, 4 April 2013 (UTC)

Bob K made an edit saying that NOR does not apply to talk pages, and I can't make up my mind whether I think it's a good thing or not, so I'm asking here. I fully understand the intent of the edit, and I agree with that. But I wonder whether the wording could lead to someone adding OR to an actual page, where we don't want it, and justifying it by saying that they had discussed it on the talk page and it had been OK there. --Tryptofish (talk) 23:18, 5 April 2013 (UTC)

The recent parenthetical addition is, "(This policy of no original research does not apply to Talk pages.)". The scenario you mentioned is too unlikely considering the rest of the paragraph which starts in bold font, "Wikipedia articles must not contain original research." --Bob K31416 (talk) 12:13, 6 April 2013 (UTC)

I saw the edit, and I thought it was perhaps redundant, but that it didn't really go far enough. NOR applies to the mainspace (including templates transcluded in the mainspace), and nothing else. "OR" is just fine on user pages, WikiProject pages, policies (really: where exactly do you think you'll find a reliable source that has published 'the community decided to set a rule that says this...'?), or indeed, on any page except those in the mainspace. WhatamIdoing (talk) 00:14, 6 April 2013 (UTC)

The added clarification is useful, considering the above discussion in this section. Regarding other parts of Wikipedia where OR is allowed, there's no indication that mentioning that would be useful. --Bob K31416 (talk) 12:13, 6 April 2013 (UTC)

Yes, now I agree. It was just something that crossed my mind, but I'm comfortable with it now. --Tryptofish (talk) 20:34, 7 April 2013 (UTC)

On the original question, in essence the rules say that verifiability is a requirement for inclusion. They do NOT say that verifiability is a FORCE or mandate for inclusion. Unfortunately 95% of Wikipedians do not know that, so we need to work on that in policy wording. So, you do NOT need wp:iar to leave out false sourced material, although it might be more understandable for the 95% than the abstract logical underpinnings of what is and isn't in wp:ver and wp:nor. North8000 (talk) 20:54, 7 April 2013 (UTC)

OR to express in-source facts or data in terms that differ or clash with source authors' interpretive expression of those facts?

(I have also cross-posted a rewritten version of this as a Village Pump topic). — Preceding unsigned comment added by AdamColligan (talk • contribs) 16:50, 17 April 2013 (UTC)

This issue has one foot in the above discussion, although it deals with more-direct use of a source's material in a way that may clash with source authors' expression of that material but does not constitute direct commentary in the WP article.

I recently made a technical edit to an article, and I want to make sure that the principle on which I made the edit is sound so that I can apply a correct procedure in future edits and reviews.

In the article Irreligion, I found the following sentence: "A 2012 survey found that 36% of the world population is not religious (including atheists) and that between 2005 and 2012 world religiosity decreased by 9%.[4]" The citation went to a draft WIN/Gallup International report (pdf). And that phrasing was a reflection of how the data in that report had been expressed by the authors in their interpretive commentary. For example, on page 6 of the source material, there is a column labeled "% change in religiosity", and in the summary row "Global Average", that column registers "-9%."

However, that same row of data shows that the 2005 figure was 77%, and the 2012 figure was 68%. This is an 11.7% decline, not a 9% decline. It is, however, a 9 percentage point decline. The authors are making a very clear, evident, and common error in expressing the meaning of the data: confusing percent change with percentage point differences. Therefore, in order to accurately express the source's information, I altered the statement to read, "...religiosity declined by 9 percentage points." I am also considering whether the sentence should be extended to say "or nearly 12 percent" as a disambiguator.
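The distinction can be checked with a minimal sketch of the arithmetic. The two input figures are the ones quoted above from the report's "Global Average" row; the function and variable names are illustrative assumptions and do not come from the source.

```python
# Figures quoted above from the report's "Global Average" row.
religiosity_2005 = 77.0  # percent of respondents, 2005
religiosity_2012 = 68.0  # percent of respondents, 2012


def percentage_point_change(old, new):
    """Absolute difference between two percentages (in percentage points)."""
    return new - old


def relative_percent_change(old, new):
    """Change relative to the starting value (in percent)."""
    return (new - old) / old * 100.0


pp = percentage_point_change(religiosity_2005, religiosity_2012)
rel = relative_percent_change(religiosity_2005, religiosity_2012)

print(f"{pp:+.0f} percentage points")  # -9 percentage points
print(f"{rel:+.1f} percent")           # -11.7 percent
```

The report's "-9%" matches the first calculation, not the second, which is why "percentage points" is the accurate term for it.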

While I feel confident that this seems like the right treatment for this situation, I am somewhat worried about how to extend this principle. Because the source is not being quoted directly, there is no ready format of bracketing the proper meaning or adding [sic]. And while this sort of action is not source commentary in and of itself (generally an original research violation), implied within such an edit is a criticism or disagreement of the alternate meaning that the source authors expressed for their data.

In this case, the disagreement is simply the result of a very common and easily spotted error in the source wording. However, there are other situations where the same principles seem to be at work but where WP editorial discretion feels murkier. For instance, Bear Braumoeller has pointed out (pdf) that many serious academic papers make a basic error in expressing the meaning of a certain type of term in regression models (a standalone term in a model where there is also an interaction term). A significant result is reported based on a number that is generally meaningless. If such a source were cited in a WP article, to what extent would the editor be entitled to write into the article the mathematically correct meaning of the piece of data in question rather than the expression of that meaning as given by the authors? After all, those authors are making a common and easily spotted error in how to express in English the meaning of a number. There would be no need for any complex statistics to be done by the WP author; he or she would merely need to understand what this element in an expression means in somewhat the same way as I understand what percentage change means.

One possible distinction is that in my case, it is clear that the object that I have in mind when I say "percentage point" is the same as the authors have in mind when they say "percent" -- they are simply using the wrong term for that shared object. In the above example, there are consequential differences: the conclusion that the authors draw about the numbers is, in a fairly strict and universal sense, invalid. However, there is nothing wrong with the numbers themselves, and the publication of those numbers is as much part of the academic paper as the publication of the words that the authors place before and after the numbers. So, when could a WP editor express those numbers in English in a way that is robustly defensible through any process of consensus or inquiry but not necessarily compatible with the views of the authors of the source themselves? It would seem odd if the outcome of WP policy for an edit would be different between two scenarios: (A) where a source simply publishes a table of numbers, which the WP editor correctly translates into English for an article; and (B) where the same source also publishes a commentary on those numbers that is erroneous in either fact or language (and does it matter if that commentary comes in the same document or even contemporaneously with the publication of the numbers?).

I am thinking that there should be latitude for editors to do this -- after all, the facts reported in plenty of outdated Victorian scholarship still often serve useful purposes even when the contextual views of those authors in interpreting the facts would not stand up to any current consensus scrutiny and so would be left out. The difference here is that, especially for more recent or rare types of research, there will be no explicit counter-commentary article published to accept and incorporate a source's facts while disputing its authors' interpretations of those facts.

So while it may not be hugely problematic, it feels like it could be, which is why I am posting this for discussion. AdamColligan (talk) 06:03, 17 April 2013 (UTC)

First let me address your question... I don't think it OR to correct the source as you have done... it is obvious that the source simply misspoke... writing "percent" when they should have written "percentage points".

Now let me question something a bit more fundamental... should we be using the source in question at all? Correct me if I have misunderstood something, but you say the source is a draft report. Drafts of documents are not reliable sources, and I don't think we should be basing information on them (we should wait until the final, clean document is published... for all we know, that final document may correct the very error you are concerned about). Blueboar (talk) 19:02, 17 April 2013 (UTC)

Firstly, I apologize for having pasted the wrong link to the polling report and org name in my post above -- it's fixed now.

On the specific case: I was using the term "draft" loosely, and it may not have been the right term to use. If you click on it, you can see that it is labeled as a guideline that can be further customized. However, it is a full public release to the press, even if it is not a polished academic analysis of the findings. Because the statement in the WP article is really only fundamentally concerned with the numbers -- and there is little reason to believe that the numbers themselves could be subject to any change -- I think it would probably be a mistake to exclude this data as a source.

On the general case, I agree with an indirect implication of your statement, which is that the presence of such sloppy errors can undermine the credibility of the source a priori. However, plenty of final, clean reliable sources also contain erroneous interpretations of data or mistaken uses of language. The question here is really: from a WP editor's point of view, provided that the numbers in the data table are final and simply are what they are, does it really even matter if the article authors in their own narrative, either in this document or in some future report or statement, describe the numbers the right way, the wrong way, or no way at all? AdamColligan (talk) 19:23, 17 April 2013 (UTC)

I'd say percentage points, which is what the author of the press release doubtless meant. (Add your own snarky comment here about people going into journalism and its corporate cousin, public relations, because math is too hard.) But I don't really believe that you should be citing a press release for this material at all. WhatamIdoing (talk) 20:05, 17 April 2013 (UTC)

Well, it wasn't my citation originally; I just found the statement, checked the footnote, and discovered the mistake. If you click on the link, you'll find a document that sits somewhere in between the usual concepts of press release and full-fledged academic report. I'm not sure that any of the terms "draft," "press release", or "paper" accurately describe it. What is clear is that it contains finalized numbers and a description of the methodology used to derive them. Provided that the provenance of the numbers themselves is not disputed, then their integration into something called Press Release, Final Report, Conference Paper, etc. is all about the analytic interpretation and narrative that authors place around the numbers. In the WP article, the statement is simply a basic description of two of the data points. The core of the discussion that I've wanted to start here is the suggestion that in a case like this, that peripheral analysis may be considered irrelevant, even to the point where the description in the WP article can contradict those authors' expressions (as I have done by changing percent to percentage points).

So if your point is that the status of the document makes the provenance of the numbers inherently unreliable, then that might be something interesting that the community should flesh out. But if the reason that such a source is unreliable is that you can't count on the narrative analysis because it is not properly vetted, then I have to return to the original point, since essentially what I'm asking about is the editorial right to simply ignore all that source narrative and just directly translate the source data into WP English. AdamColligan (talk) 21:05, 17 April 2013 (UTC)

Being a press release doesn't make the numbers worse. It does, however, make the number be self-published and non-independent and therefore possibly WP:UNDUE. WhatamIdoing (talk) 03:46, 18 April 2013 (UTC)

"Directly related" exceptions?

I quote "directly related" a lot in cases where someone is trying to say that A = B when there's no reliable source that says so, or a source is talking about A in one place and B in another but does not directly relate them.

However, I lately ran into examples where people, ticked off that I cleaned up a highly WP:OR article after bringing people in from WP:ORN, are making the same case for examples like:

background - two or three referenced sentences explaining the background of an article/issue/campaign/etc. that don't happen to mention the group (so people will understand what the heck's going on)
or outcome - where three WP:RS say a group is the prime mover behind a campaign, but only one article mentions that the campaign was successful, and that one article doesn't happen to mention that group, only the fact that some group(s) were involved. So they say you can't mention that the campaign succeeded.

Now there must be exceptions in these types of cases, but I can't figure out what they are. Wikipedia:Context#Overlinking_and_underlinking at least mentions background, but only in context of linking. An extension of "Exists" except that ref'd info that is uncontroversial and relevant can be entered? Thoughts?? CarolMooreDC🗽 17:34, 24 April 2013 (UTC)

Hearing no response, I'll assume it's more an NPOV issue than a WP:OR issue. CarolMooreDC🗽 22:38, 27 April 2013 (UTC)

You have been kind of vague Carol... it's hard to say whether this is an NOR issue without knowing the specifics. Blueboar (talk) 22:40, 27 April 2013 (UTC)

So it really is more a case-by-case thing that should be brought to NPOV or OR (depending on which view I have of it :-). That's good enough for me if so, and I will deal with it in a couple of instances (still in progress) when/if I continue to have problems. Thanks. CarolMooreDC🗽 16:15, 28 April 2013 (UTC)

Original research as a source

What if the published source that you cite is itself original research? Many published research studies and dissertations are themselves original research, so citing them as sources would seem to go against the policy; however, I can't see why we should not use this kind of information.

I think it should be clarified that the purpose of the no original research policy is to prevent users from posting their individual and often unscientific observations. — Preceding unsigned comment added by 184.58.124.116 (talk) 00:56, 28 April 2013 (UTC)

The rule is that Wikipedia editors can't commit original research themselves, it has nothing to do with other sources. A reliable source that commits original research being cited would not be a violation of this rule.--174.95.111.89 (talk) 02:28, 28 April 2013 (UTC)

To be more clear, summarizing a published research paper would not violate this policy. There may be other issues with a particular study, but this is not one of them.--174.95.111.89 (talk) 02:35, 28 April 2013 (UTC)

If someone who is a knowledgeable expert in their field writes a research paper or dissertation, they are accumulating primary sources and other secondary sources to create another secondary source (hopefully one that offers new light on a subject and not just repeating the theories of earlier scholars). Wikipedia prefers that editors cite reliable secondary sources--see WP:PSTS. If in an article you as an editor/contributor cite that secondary source to support the interpretation of a fact or other analysis, that is not original research. In analyzing a topic, we write objectively on the theories and interpretation of others (especially scholars), we do not put forward our own theories or interpretation. --ColonelHenry (talk) 13:03, 28 April 2013 (UTC)

Question for FAC: Does it violate WP:NOR/WP:SYN by counting lines of poetry?

An article I did a lot of work on, Duino Elegies, a collection of ten elegies by Rainer Maria Rilke, is almost through the FAC process and one reviewer asked if a line count could be inserted into the article somewhere. I'm not averse to the suggestion, but I do have questions about how to incorporate it, whether it offers much to a reader besides a mere factoid, and most importantly--if it would violate WP:NOR/WP:SYN. Most articles on modern poems don't include it (the exception I found was Eliot's The Waste Land). Largely, sources available on modern poetry are more interested in abstract issues, imagery, philosophical questions, and references to other poets/arts/historical happenings and do not focus on issues of prosody and scansion--the mechanics of line and syllable counts.

So the core question is:

Would it violate WP:NOR/WP:SYN if I counted the lines of poetry in Duino Elegies and mentioned it in the article on the work?

FYI: Total in German original is 859 lines (each of the ten elegies: i. 95, ii. 79, iii. 85, iv. 85, v. 108, vi. 45, vii. 93, viii. 75, ix. 80, x. 114). Average 86 lines each.
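The total and average can be checked with a short sketch of the routine arithmetic. The per-elegy counts are the ones listed above; the variable names are illustrative assumptions.

```python
# Line counts per elegy in the German original, as listed above.
lines_per_elegy = {
    "i": 95, "ii": 79, "iii": 85, "iv": 85, "v": 108,
    "vi": 45, "vii": 93, "viii": 75, "ix": 80, "x": 114,
}

total_lines = sum(lines_per_elegy.values())
average_lines = total_lines / len(lines_per_elegy)

print(total_lines)           # 859
print(round(average_lines))  # 86
```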

Thanks for your anticipated guidance on this issue. --ColonelHenry (talk) 22:10, 27 April 2013 (UTC)

Personally I think the number of lines in a work of poetry is trivial information, and would omit it as irrelevant... but, as for violating WP:NOR or WP:SYNT? No, it wouldn't. It is something anyone can verify by looking at a published copy of the work itself (ie the primary source). Blueboar (talk) 22:29, 27 April 2013 (UTC)

Blueboar is right. The poem is a reliable primary source for the number of lines it contains. See WP:USINGPRIMARY for some examples of what's okay and what's not. WhatamIdoing (talk) 19:03, 30 April 2013 (UTC)

Historic Photographs

Would the posting of a previously unpublished photograph of a historic event be considered original research? Example: I have a high resolution group photograph of the 96th Aero Squadron from 1919. The article only has photographs of officers included and my photograph depicts over 100 enlisted men as well as some officers. I'm thinking that since the photograph incorporates a placard with identifying information being held up by some of the persons in the photograph that it could be posted without a caption other than a general one and let the photograph speak for itself. Or, alternatively, would I have to first get the photograph authenticated and then published elsewhere first? Since I will be performing an image restoration (my print is badly damaged) I believe I will be the copyright holder of the restored version, though I am eager to grant Wikipedia and its users virtually unlimited non-commercial, non-exclusive rights. I suppose I could be lazy and simply post it, then wait to see if somebody takes it down and offers an explanation so I might better understand the rules.

Uploading a photograph is not original research. Adding the photograph to an appropriate article with a neutrally-worded caption is not original research either. If you upload the photograph, we will assume that it's genuine unless there's evidence to the contrary—you need not get it authenticated first.
Copyright in a photograph belongs to the photographer, unless the photographer died more than seventy-five years ago in which case the photograph will be in the public domain. If you perform an image restoration, then you will have created a derivative work and the copyright situation will be as described in derivative work.
You cannot upload an image to Wikipedia with a non-commercial licence restriction. Or rather you can, but it will be deleted because of that restriction. The Wikimedia Foundation requires genuinely free content (libre as well as gratis), which should be uploaded to Wikimedia Commons. In certain restricted situations you can upload non-free content to Wikipedia (not Wikimedia Commons) under a claim of fair use. The rules governing this are here.
I'm sorry this is so complicated, and personally I would prefer it if we allowed non-commercial licences.—S Marshall T/C 11:03, 12 May 2013 (UTC)

Synthesis of published material that advances a position?

Just realized that the word "that" in the header of this section could be confusing... I think we mean "Synthesis of published material to advance a position". After all, it wouldn't be synthesis if the position is actually advanced by the published material. Any objections to changing this? Blueboar (talk) 15:44, 30 April 2013 (UTC)

Right. In two places. Zero^talk 16:24, 30 April 2013 (UTC)

Shall we take bets on how long it will be before the first dispute over whether intentionality is now required to violate the policy? ("I realize my synthesis does advance a position, but I didn't do it to advance the position; it just kind of happens to advance it.") WhatamIdoing (talk) 19:07, 30 April 2013 (UTC)

Is there inherently a difference? If the mens rea is purposeful/knowing or reckless/negligent/accidental? This seems to me to be akin to the difference between voluntary and involuntary manslaughter.--ColonelHenry (talk) 22:58, 4 May 2013 (UTC)

If we restrict the policy to prohibiting only those combinations of sources done "to advance a position", i.e., "for the purpose of advancing a position", then, yes, there will be a difference.

I understand Blueboar's problem. There are two possible readings of the current sentence. We might be prohibiting "(synthesis of) (published material that advances a position)" -- meaning that you, the Wikipedia editor, may synthesize as much as you want, so long as the published sources you're citing don't try to advance a position -- or we might be prohibiting the "(synthesis of published material) (that advances a position)" -- meaning that the sources may do whatever they want, but you may only synthesize published sources if your synthesis does not advance any position whatsoever. We mean the second, but I can't find an elegant way to indicate that without creating new potential misunderstandings. WhatamIdoing (talk) 23:55, 9 May 2013 (UTC)

Perhaps "Synthesis (of published material) that advances a position"? Blueboar (talk) 00:31, 11 May 2013 (UTC)

To me the current one sounds clear that "that advances a position" refers to the synthesis, not the published material. Took me a while to figure out how my brain works (or malfunctions :-) on this but here it is: I think it's because "synthesis" means combining multiple items, thus making "published material" appear to refer to multiple items. "Advances" (i.e. not "advance") refers to something singular (not multiple items), which means that the only thing left that it can refer to is "Synthesis". North8000 (talk) 02:08, 11 May 2013 (UTC)

This concern is no longer hypothetical.

What about changing it to something like: Synthesis of published material in ways that advance a position or Synthesis of published material that serves to advance a position? The latter is the phrase used in the lead. WhatamIdoing (talk) 20:04, 19 May 2013 (UTC)

Synthesis and definitional questions

There is a dispute which raises some synthesis policy questions. The dispute in question is currently a WP:RFC at Talk:Sockpuppet (Internet)#Orlando Figes' sockpuppetry? as to whether the historian Orlando Figes belongs on the page. But the argument itself (as far as I can tell) largely revolves around differing readings of synthesis, which may be of interest to policy editors.

The word ("Sockpuppet" in this case) is defined on the page, with sourcing. The example is outlined, again with sourcing. The consensus (as far as I can tell) is that the example meets the definition. But the sources for the example never explicitly use the word. Does that constitute a synthesis?

In either case, is our current synthesis section clear enough?

Thank you. --Andrewaskew (talk) 03:46, 16 May 2013 (UTC)

If source #1 says "Foo is when someone bazt" and source #2 says "Joe Film was definitely bazting last March", then you are permitted to say "Joe Film did Foo last March". This falls under the category of "writing in your own words". WhatamIdoing (talk) 23:50, 16 May 2013 (UTC)

Not so fast, WAID... you are assuming that source #1 is correct when it says "Foo" is synonymous with "bazt". But often there is an opposing source (#3) that disagrees with the definition of "Foo" laid down by source #1. What if source #3 discusses subtle distinctions between "Foo" and "Bazt" (ie says that "Foo" and "Bazt" can overlap, but are not completely synonymous)... in such a case, Wikipedia cannot definitively say "Joe Film did Foo last March".... at least not without a whole bunch of caveats and attributions. Blueboar (talk) 12:24, 17 May 2013 (UTC)

It's possible that it would be more complicated than that, although I doubt that this is the situation in the instant case. WhatamIdoing (talk) 20:05, 19 May 2013 (UTC)

The de facto answer is that if nobody objects, it is summarization or writing in your own words and is the backbone of Wikipedia writing. If someone objects, then it will be noted that it is technically wp:synthesis / a violation of wp:ver/wp:nor. North8000 (talk) 20:35, 19 May 2013 (UTC)

If someone objects, is that technical violation enough that the example should automatically be removed? Or is there room for further discussion and consensus? Andrewaskew (talk) 04:20, 21 May 2013 (UTC)

There is always room for further discussion, compromise and consensus. Have you considered re-writing the paragraph as an alternative to a dualistic remove vs keep? Blueboar (talk) 14:25, 21 May 2013 (UTC)

Sources that contain both primary & secondary material (interviews)?

Consider -- a news reporter (or whoever) wants to do an interview with a subject. In preparing for the interview, background investigation is done and the interviewer gets biographical info. The interview is accomplished, is edited, and published (like in a RS newspaper). The published interview contains both background data and the quoted dialogue. Doesn't that interview contain both primary source material and secondary source material? Well, if this is correct, how do we parse the primary & secondary aspects of the interview? As written, the guidance classifies interviews as primary, and doesn't address the acceptable secondary source aspects of many published interviews. – S. Rich (talk) 13:54, 21 May 2013 (UTC)

Sure... most news sources contain both primary and secondary material. The issue is whether that makes any difference. It is important to remember that we are allowed to cite primary sources... we simply have to be careful when using them. "Primary" does not mean "bad". What is really important isn't whether the source is primary or secondary, but whether it appropriately supports what is being said in our article (ie whether we are misusing it to support Original research). To know that, we need context... we need to know the specific statement that is being supported by the source. Blueboar (talk) 14:19, 21 May 2013 (UTC)

Thanks! I quite agree as to using care with primary sources. I raise this because I've seen recent discussion where editors want to reject a RS article containing an interview, as primary, not on the basis of indiscriminate usage of interview dialogue. And they cite "primary" as the rationale. I'm thinking of adding a small caveat to the "interview" guidance in footnote 3. Something like "dialogue from interviews" or "quoted dialogue from interviews" or "quotations from interviews". What do you think? – S. Rich (talk) 14:33, 21 May 2013 (UTC)14:57, 21 May 2013 (UTC)

You might want to read WP:USEPRIMARY.

You are making unwarranted assertions about what constitutes an interview. An interview could equally be unprepared and even anonymous.[5] Furthermore, in typical interviews of the type you mention, the source of the bio is likely to be the interviewee (or his publicist).

Editing does not turn something into a secondary source. Back in 2006, this policy said this, which may be easier to understand: "Secondary sources present a generalization, analysis, synthesis, interpretation, explanation or evaluation of information or data from other sources." Re-writing biographical material or editing someone else's work doesn't do any of those things. WhatamIdoing (talk) 06:17, 26 May 2013 (UTC)

Technical term for PST

Articles appear to have a classification state of being either primary, secondary, or tertiary.

Is there a collective term we can use to describe this status? For example if they were red/blue/yellow we could say 'color'. All that comes to mind at the moment is "ranking" but it sounds too vague.

Any suggestions? I would like it if we could come up with a sensible term for this and utilize it in the discussion of WP:PSTS. What do we label the state of being prime or second or third? Ranze (talk) 00:17, 22 May 2013 (UTC)

I believe that it's called "source classification" in historiography, but I'm not sure that this is quite what you're looking for. WhatamIdoing (talk) 06:06, 26 May 2013 (UTC)

Primary sourcing from laws

Is it incorrect to primarily source laws and treaties? For example, when making a map of the changes to a country, my practice has been to go as primary as I can, linking directly to the treaty that cedes land from one country to another, for example. Is this not the best practice? When what I'm dealing with is purely a matter of law and treaty, isn't it sufficient to source directly from the law or treaty?

There are many pitfalls when sourcing directly from laws. The law might be superseded by a different law. The law might have been declared unconstitutional, but remain on the books. A court might have interpreted it to mean something quite different than it appears to mean. Or it might just be ignored. Jc3s5h (talk) 17:48, 30 May 2013 (UTC)

Being superseded doesn't matter, since a contemporary secondary source would be just as faulty in that regard. Same deal with being declared unconstitutional, or being reinterpreted. If I get a secondary source (like from a newspaper) published about a law on the day it was passed, and it's declared unconstitutional 10 years later, using the secondary source doesn't exactly save me from a pitfall that using the primary source would have otherwise caused. Now, if any of these things happened, I would of course update accordingly with the proper sourcing, but none of what you said is an argument against primary sourcing; it's merely an argument against contemporary sourcing, rather than using updated resources.

Furthermore, for my timelines (like Territorial evolution of the United States), what happened later is kind of irrelevant. It doesn't matter if the treaty annexing the Gadsden Purchase gets overturned in 2050, it was part of the country from that time til 2050. What happens later is only immediately relevant to later points in the timeline. --Golbez (talk) 18:48, 30 May 2013 (UTC)

Newspapers are generally independent sources, but a news article that reports the passage of some law is primary. See WP:PRIMARYNEWS, and WP:Secondary does not mean independent. WhatamIdoing (talk) 22:23, 5 June 2013 (UTC)

Clarification regarding primary sources

This section of the policy states:

"Secondary or tertiary sources are needed to establish the topic's notability and to avoid novel interpretations of primary sources, though primary sources are permitted if used carefully. Material based purely on primary sources should be avoided. All interpretive claims, analyses, or synthetic claims about primary sources must be referenced to a secondary source"

In my opinion some of this paragraph applies to entire articles, and some to portions of articles, and just which is which is unclear.

I believe the last sentence of the policy is very appropriate to both articles as a whole and in part: "interpretive claims, analyses, or synthetic claims about primary sources must be referenced to a secondary source".

I believe the first sentence: "Secondary or tertiary sources are needed to establish the topic's notability and to avoid novel interpretations of primary sources, though primary sources are permitted if used carefully" is intended to apply to entire articles. It should be made clear that 'notability' of the material using secondary or tertiary sources is not necessary for sub-sections. An interpretation insisting on notability requirements for sub-sections interferes with breaking up discussion into digestible portions.

However, the most difficult sentence is the second-last: "Material based purely on primary sources should be avoided." This is not good policy for portions of articles, and, taken literally, would flag as violations even verbatim quotations from primary sources used for illustration or eloquence. This sentence either should be removed, or the context in which it applies should be made clearer. Brews ohare (talk) 21:08, 5 June 2013 (UTC)

Instead of "Material based purely on primary sources should be avoided." how about "conclusions based solely on primary sources should be avoided." Rjensen (talk) 22:12, 5 June 2013 (UTC)

I agree; in practice there are plenty of scenarios where even (good) articles can hardly avoid primary sources. Consider that in a historic context newspaper and magazine articles are considered primary sources; however, many articles, in particular biographies, can hardly do without them.--Kmhkmh (talk) 22:24, 5 June 2013 (UTC)

Newspaper and magazines are secondary sources (except about themselves). The older they are the more careful you have to be using them as language evolves over time while the things readers (of newspapers and magazines) can be assumed to know changes. But they are still valuable secondary sources and often the only ones for biographies of people from previous decades.

As for "Material based purely on primary sources should be avoided", material from primary sources, including quotations, should only be used if a secondary source establishes its relevance or pertinence. If it's illustrating a point then that is drawing a conclusion from it that must be supported by secondary sources. If it's unclear what point it's making it should not be used, as unclear content does not belong in an article.--JohnBlackburne^words_deeds 22:48, 5 June 2013 (UTC)

See WhatamIdoing's posting below. Historians usually treat newspapers and the like as primary sources, and that holds for contemporary history as well and not just century-old newspapers. In particular for contemporary history we have the issue I've just mentioned. Clearly primary sources need to be handled with care and clearly (good) secondary sources are preferred, but there are too many scenarios where secondary sources are not available in sufficient degree.--Kmhkmh (talk) 15:44, 6 June 2013 (UTC)

Rjensen's suggestion "conclusions based solely on primary sources should be avoided" is not clear; does Rjensen mean that Wikipedia editors should not make conclusions based solely on primary sources? No, they shouldn't, but they shouldn't make conclusions based on secondary sources either (except the most elementary and obvious conclusions, such as if 1000 people live in a county of 100 square miles, the population density is 10 people per square mile). But if Rjensen means that if a primary source contains a conclusion, that primary source can't be used in Wikipedia, that's nonsense. That's an important activity of primary source authors, making conclusions. Even if there were some valid objection to primary sources that contain conclusions, this would be the wrong policy to forbid them, because it isn't original research. Jc3s5h (talk) 23:13, 5 June 2013 (UTC)

Most newspaper stories are not secondary. Here is a sampling of university-based sources that address the question:

"A newspaper article is a primary source if it reports events, but a secondary source if it analyses and comments on those events." [6]
"Characteristically, primary sources are contemporary to the events and people described [e.g., like a newspaper article on a current event]... Examples of primary sources include...newspaper ads and stories. In writing a narrative of the political turmoil surrounding the 2000 U.S. presidential election, a researcher will likely tap newspaper reports of that time for factual information on the events. The researcher will use these reports as primary sources because they offer direct or firsthand evidence of the events, as they first took place." [7]
"There can be grey areas when determining if an item is a primary source or a secondary source. For example, newspaper journalists may interview eyewitnesses but not be actual eyewitnesses themselves. They also may have completed research to inform their story. Traditionally, however, newspapers are considered primary sources…. Examples of common primary source formats can include...contemporary newspaper articles…. Newspaper articles, although often written after an event has occurred, are traditionally considered a primary source…. " [8]
"Examples of primary information: A current news report that is reporting the facts (not analysis or evaluation) of an event." [9]
"What are primary sources? Published materials (books, magazine and journal articles, newspaper articles) written at the time about a particular event. While these are sometimes accounts by participants, in most cases they are written by journalists or other observers. The important thing is to distinguish between material written at the time of an event as a kind of report, and material written much later, as historical analysis." [10]

I realize that there are some significant differences between academic disciplines (law, for example, does not believe that tertiary sources even exist), and that there are some nuances (e.g., an analytical piece is secondary, even if it happens to be printed in a newspaper), but the fact is that most of the academic world believes that most newspaper articles, and especially those doing simple, non-transformative, non-analytical basic reporting of facts (e.g., "WhatamIdoing's Gas Station caught fire last night") are primary sources. See WP:PRIMARYNEWS for more. WhatamIdoing (talk) 06:31, 6 June 2013 (UTC)

Just to confuse the situation with news sources further... the age of the source can be a factor in determining the primary/secondary classification. Consider a newspaper column that analyses or comments on a recent event... we would probably classify it as a secondary source, right? Now consider a similar column that was written two hundred years ago, analyzing and commenting on something that happened at that time. Historians would classify it as a primary source.

And just to make this even more confusing... Now, consider a column from two hundred years ago, analyzing something that happened three hundred years ago... that would probably still be classified as a secondary source - but it would be considered an outdated secondary source (and probably not very reliable). Blueboar (talk) 14:57, 6 June 2013 (UTC)

Actually imho this case is sorta both: you can treat that as a most likely outdated secondary source or as a primary source (used by current secondary sources). That scenario for instance is true for many writers from antiquity (in particular ancient historians), whose works are secondary sources (as they themselves compiled and analyzed other (often lost) sources). However from the perspective of a current article they become primary resources of sorts.--Kmhkmh (talk) 15:51, 6 June 2013 (UTC)

@WhatamIdoing, you have made such claims before, but surely if a source is using primary sources to report on something then it is not a primary source; it is a secondary source. Taking into account Blueboar's observations and talking about issues that are current and have not yet entered history: When a Chancellor of the Exchequer stands up in Parliament and gives a budget speech, Hansard and the direct electronic recording are primary sources (as would be the notes from which he read). A verbatim copy of that speech in a newspaper is an unreliable copy of a primary source -- try to defend not paying the correct tax based on a typo in the Times and see how far you get! -- but a summary of that speech in a newspaper is not a primary source; it is a secondary source because the act of summarising makes it so, just as summarising secondary sources makes this a tertiary source. If a reporter states "I saw five men shot by xyz", then that is a primary source. If the newspaper reporter reports that "a government spokesman states that five men were shot by xyz, but there is no independent source to confirm this statement" then that is a secondary source. -- PBS (talk) 15:20, 6 June 2013 (UTC)

Well, independently of whether you personally agree with WhatamIdoing, it is definitely not just his claim, but an opinion/notion held by many people in wikipedia and academia. Which also shines light on another issue, that people within wikipedia do not quite agree on the exact nature of primary and secondary sources (nor do probably people from different fields in academia). As a consequence of this some of the guidelines might need to be more concrete rather than vaguely talking about primary/secondary sources and leaving to each's personal notion what that might mean in a given context.--Kmhkmh (talk) 15:59, 6 June 2013 (UTC)

Our sourcing policies as well as our notability guidelines are written in mind with Whatamidoing's analysis, that most newspapers simply reporting on facts are primary sources, and that's been the way for some time. There are some that disagree with that (taking the approach "one step removed" is generally sufficient) but this typically does not win out in consensus discussions. Again, primary sourcing should not be considered bad, just that it fails to meet other aspects of our policy/guidelines (eg original research, notability). --MASEM (t) 16:06, 6 June 2013 (UTC)

@WhatamIdoing, those are the policies of other institutions, while WP:PRIMARYNEWS is an essay you've written not policy. I think PBS above summarises it well: a verbatim copy is a primary source, as is an eye-witness report. But once the newspaper or news organisation summarises it, using editorial discretion and judgement on what to include, it's a secondary source. And almost all newspaper reporting is like that: they like to give the impression that they have reporters on the ground but they rarely do, especially now when wire services and local news services are ready and reliable sources.--JohnBlackburne^words_deeds 16:09, 6 June 2013 (UTC)

PBS, I think you need to read WP:LINKSINACHAIN. If mere repetition of a fact, in slightly different wording, turns the first repetition into a secondary source, then what do we have when we are citing the eighth repetition? Are we going to invent a concept called octonary sources?

PBS and JohnBlackburne, you can call it just my claim, and you can call it the policies of unrelated universities, but I've given you sources for my claim, and you've produced none. Furthermore, most editors agree with those academic sources and disagree with your assertions. A secondary source is not simply repeating what the other guy said, or even repeating your favorite parts of what the other guy said. It's a transformative intellectual product. Repeating basic facts is not transforming them. WhatamIdoing (talk) 16:18, 6 June 2013 (UTC)

@WhatamIdoing I made two distinctions the first was that the newspaper is not yet part of the historical record and the second was the difference between a newspaper's copy of a budget speech and a newspaper's summary of that speech, the former being an unreliable primary source and the second a secondary source. I fail to see what WP:LINKSINACHAIN is supposed to add to that. -- PBS (talk) 17:40, 6 June 2013 (UTC)

To draw the discussion back to my original concern, the statement in the policy that "Material based purely on primary sources should be avoided." I felt that some context was needed here. For example, it could be changed to say: "No Wikipedia article should consist in its entirety of material from one primary source or one author."

Blackburne has proposed the following:

"As for "Material based purely on primary sources should be avoided", material from primary sources, including quotations, should only be used if a secondary source establishes its relevance or pertinence. If it's illustrating a point then that is drawing a conclusion from it that must be supported by secondary sources. If it's unclear what point it's making it should not be used, as unclear content does not belong in an article."

This position strikes me as too strict. If the subject is the background of the Uncertainty principle, do I need to establish using a secondary source that a quote from Heisenberg is relevant or pertinent? How about this:

In a letter of 8 June 1926 to Pauli, Heisenberg confessed that "The more I think about the physical part of Schrödinger's theory, the more disgusting I find it".

Do I need more than the (possibly primary) source where the quote can be found? Are we to assume that, lacking further support, what Heisenberg thought is trivia, like what he thought about ice cream? Maybe it's sufficient that somewhere this quote was thought worthy of record? What about some more direct source like The Physicist's Conception of Nature, written by Heisenberg himself? Brews ohare (talk) 16:25, 6 June 2013 (UTC)

I don't think we can write a policy, or a guideline even, that says when the use of primary material is justified or when too much is too much. There are case examples on both sides I think we can develop - we can, for example, rely on primary sources to review the details of a notable event, while on the other side, we can't use primary sources to go into infinite detail on fictional characters. But there's a huge grey area in between. The optimal case is that primary and secondary/tertiary sources should be intermixed as appropriate, but as to what degree or the like, there's no way we can simply quantify that for all topic areas on WP. It's a "I know it when I see it" type problem. --MASEM (t) 16:49, 6 June 2013 (UTC)

I somewhat agree, but the policy's wording needs to reflect that somehow.--Kmhkmh (talk) 17:20, 6 June 2013 (UTC)

The best I could suggest is a guideline in the same way WP:NFC is a guideline on case studies set by WP:NFCC policy. I would definitely keep such advice off WP:NOR (policy) though this should be linked within it. --MASEM (t) 17:25, 6 June 2013 (UTC)

I think that this conversation shows a problem with the interests of different disciplines. Does the restriction "primary sources that have been reliably published may be used in Wikipedia" not cover the concerns about the type of data mining suggested here? "Material based purely on primary sources should be avoided." is there to stop someone publishing Original Research in many fields. For example I have come across editors who either want to praise or bury a flamboyant but controversial figure such as Orde Wingate by stringing together primary sources from archives to "prove" that the secondary sources are wrong. So while for a scientist's biography "No Wikipedia article should consist in its entirety of material from one primary source or one author." might be adequate, it is useless for most military biographies. -- PBS (talk) 17:40, 6 June 2013 (UTC)

If one is stringing together a number of primary sources to come out with a conclusion that could only be considered an analytic or critical result, that is WP:SYNTH and fails core policy. That doesn't need any clarification on how many primary sources or how long a stretch of material is used to do that, that's simply wrong. --MASEM (t) 17:46, 6 June 2013 (UTC)

Just noting here that most newspaper articles are secondary sources, because written by uninvolved people. SlimVirgin ^(talk) 19:50, 6 June 2013 (UTC)

"Just noting here" once again that WP:Secondary does not mean uninvolved. A meta-analysis is always a secondary source, even if you're doing the meta-analysis on studies you were previously involved in. Gossip repeated verbatim in your diary (or the modern equivalent of a blog) does not magically become secondary just because you're "uninvolved". WhatamIdoing (talk) 21:14, 7 June 2013 (UTC)

Gossip posted on a blog would make the blog a secondary source, just not a reliable one. SlimVirgin ^(talk) 21:19, 7 June 2013 (UTC)

Sorry, no. On WP, we don't consider secondary sources as being from someone uninvolved. That makes them independent and likely third-parties, but not necessarily a secondary source. We need transformation of facts into a novel statement; that's the metric we have chosen for WP. (There are several other possible ways we can define what is primary and secondary, but we have chosen this approach, which aligns with most academic fields). --MASEM (t) 21:25, 7 June 2013 (UTC)

The applicability of the sentence "Material based purely on primary sources should be avoided." in the present policy has been described as having a "huge gray area" in which it is unclear whether it applies or not. Maybe that is an indication that it should be removed. I think the remainder of the policy will work fine without this statement. This statement is better described, not as having a "huge gray area" where it 'might' apply, but as having only a narrow area where it clearly does apply, and a huge opportunity for abuse. Brews ohare (talk) 20:05, 6 June 2013 (UTC)

I removed that sentence, because it's too sweeping. It was added here in August 2011. I also removed links to two essays. I think we need to keep these sections relatively simple and not give the impression that primary sources are never allowed. They just have to be used with caution. SlimVirgin ^(talk) 20:59, 6 June 2013 (UTC)

I've added a footnote (footnote 3 here) with quotes from academic sources/libraries about primary sources, along with examples. SlimVirgin ^(talk) 21:54, 6 June 2013 (UTC)

So just to be clear here (given that I know the various content disputes around Brews on this issue): Masem says above that stringing together a number of quotations from primary sources to draw a conclusion not explicit and unambiguous in nature is Synth. Further, secondary sources should be expected in the case of any general summary of a field for which the primary sources are considered illustrative. ----Snowded ^TALK 08:18, 7 June 2013 (UTC)

At some point in most of our past discussions about primary sources, I say something about original language and intent... so here it is again:

When the phrase "primary source" first appeared in this policy (which was also its first appearance in WP policy as a whole), it appeared in a very different and much simpler context than it does today... that context was in essence: Don't turn Wikipedia into a primary source. In other words, our mention of the term wasn't about the primary/secondary nature of our sources... it was a statement about the nature of our content. It might help if we go back to this original concept.
Using primary sources does not automatically result in OR (although doing so certainly increases the likelihood)... and using secondary sources does not automatically prevent OR (secondary sources can be misused). That's because NOR isn't about the sources... it's about article content. It's about how we (appropriately or inappropriately) use the sources. Blueboar (talk) 17:15, 7 June 2013 (UTC)

(reply to Snowded) Yes, Masem is right about that. If we're talking about philosophy, the idea is that editors shouldn't be doing philosophy on Wikipedia using primary sources. Instead we should be reporting what secondary sources say about those primary sources, except where using the primary sources directly is not problematic, or is necessary for some reason, or is clearly preferable. But in cases of dispute, editors should defer to the secondary academic literature. That's what it means to be educated in a subject, that you know what the primary source material says and what others in that field have said about it, and the idea is to sum that up for the reader.

Too much of a focus on primary sources can mean that an editor is not familiar with the field, and this is one of the reasons that relying on primary sources often leads to problematic editing, with editors interpreting the primary sources in their own way and reaching conclusions no secondary source has reached. SlimVirgin ^(talk) 19:08, 7 June 2013 (UTC)

Or, to put it another way... the inappropriate use of primary sources can easily result in an editor performing his/her own analysis, and reaching conclusions that are not found in any source... and when you say something that no source says, you turn Wikipedia into a primary source for that statement. And that is called performing Original Research. Blueboar (talk) 19:28, 7 June 2013 (UTC)

Not sure that I'd agree that OR = primary source.

Regardless of what one thinks of it, the thought that it's to avoid Wikipedia becoming a primary source is certainly very different than what we have now regarding primary sources.

I think that the current primary/secondary distinction/treatment is good but overemphasized. North8000 (talk) 20:25, 7 June 2013 (UTC)

Blueboar, it depends what you mean by inappropriate. Sometimes primary sources used carefully will produce an article that isn't policy compliant, because of an over-reliance on them. Someone writing about Plato, using Plato's own work purely descriptively (Plato wrote this, and this, and this) would have produced a non-compliant article because of the dearth of secondary commentary, and would have done so without turning WP into a primary source. Against this, it's rare to see an editor accused of over-reliance on secondary sources; that tends to happen only where the secondary sources are found to be in error. SlimVirgin ^(talk) 20:42, 7 June 2013 (UTC)

I've just reverted the massive changes by SV. It's not that I don't appreciate the bold effort to fix things, it's that it's partly wrong, and it reintroduces the "secondhand" idea that we specifically rejected last fall as confusing people and supporting this conflation of "secondary" with "independent".

Look: if I tell you that I'm wearing a red shirt, then my report is primary and non-independent. If you see me and say that I'm wearing a red shirt, then your report is still primary, but independent.

This is not actually a complicated concept. If these words were identical, then we would only use one of them and not require that notability be supported by sources that were both independent and secondary. We require both of these characteristics because they're different concepts. WhatamIdoing (talk) 21:22, 7 June 2013 (UTC)

WAID, I've restored the academic refs I added, and removed the links to the essays (which, as I recall, you wrote; apologies if I have that wrong). We can't have two essays prominently linked to explain policy, and at the same time remove academic refs from the footnote. SlimVirgin ^(talk) 21:25, 7 June 2013 (UTC)

Sure we can use essays to explain policy as long as they are generally agreed they reflect proper interpretation of policy but are not used as policy/guideline. --MASEM (t) 21:29, 7 June 2013 (UTC)

But these don't reflect the policy. Even when essays do, at the time of insertion, reflect policy, they can easily be changed, so the best thing to do is explain the policy on the policy page, rather than linking to essays that may not have consensus. SlimVirgin ^(talk) 21:31, 7 June 2013 (UTC)

Actually, they do reflect the policy, which is why everyone's telling you that you're wrong about whether secondary is a synonym for independent. WhatamIdoing (talk) 21:34, 7 June 2013 (UTC)

I was just going to add a specific objection to SV's reliance on Willie Thompson, because his definition of "secondary" is remarkably divergent from anyone else's. You'll find the definition on p. 79 of the book she cites:

[They] "will have as their first undertaking to read all feasible 'secondary'—i.e., already published—texts"

According to Willie Thompson, every single post at every single personal blog is 'secondary', because it's "already published".

This is not the definition that we use on Wikipedia. It's not even a definition accepted by any academic discipline as far as I can tell. This might be a convenient definition for gaming an AFD, but it's neither real nor ours, and therefore has no place in our policy. WhatamIdoing (talk) 21:34, 7 June 2013 (UTC)

Just as a note as being one of the core policy pages, we should not be edit warring on this (past 1RR). SlimVirgin, if you want to go back to a version of this page that has otherwise been stable for one+ years, you are likely going to need consensus, particularly on one that shifts the meaning of what secondary sources are. There may be other more cosmetic changes that would make sense, but I'd not change the core text. --MASEM (t) 21:39, 7 June 2013 (UTC)

In a nutshell

I'm wondering why anyone would want to question that a secondary source is a secondhand one, given how well-established this is in academia and (until recently, apparently) in this policy. Looking at the red shirt example, if I write on my blog: "I was wearing a red shirt," the blog is a primary source for "what was she wearing?". If someone else writes: "She was wearing a red shirt," that's a secondary source. If a third person writes: "She was wearing a red shirt, and I know this not only because she said it, but because I was there and I saw it," that source is both a secondary and a primary source, depending on how we use it. This is really not a difficult concept at all, and there's no need to make it complicated. Were you there, did you see it, did you take part in it, did you cause it? Then what you wrote is a primary source. Not there, didn't see it, weren't involved, everything coming to you via others? Then what you wrote is a secondary source.

In addition, when we're dealing with issues that took place a long time ago, what today would be regarded as a secondary source (say, a newspaper article by someone uninvolved) becomes a primary source because of its proximity to the event compared with that of the reader. Anything from that era becomes a primary source of information about that era. That is the primary/secondary distinction in a nutshell. SlimVirgin ^(talk) 21:44, 7 June 2013 (UTC)

Just like there are dozens of style guides for writing, there are many possible ways to define primary and secondary. Your approach - one-step removed, effectively - is one of those. The other, and the one that we have been using at WP for a long time (given that it is core to the principle of notability, before 2006 then) is that secondary sources are transformative. That's our house style in defining those terms. Are we conflicting with some areas of academia? Sure - just as our chosen MOS is not in line with other, more popular MOSes. But as WAID has suggested, using the transformation consideration helps to simplify classifying sources for purposes of original research, verifiability, notability, and other facets. Specifically it separates the dependency of the author from the type of content of the work - the former needed for WP:V, the latter needed for WP:OR and WP:N. It is not a novel way of breaking down primary and secondary, but we recognize it is not the only way, which is why we have these pages to make clear the way that en.wiki has adopted. --MASEM (t) 21:49, 7 June 2013 (UTC)

Where is the house style described? SlimVirgin ^(talk) 21:54, 7 June 2013 (UTC)

Right here, on WP:OR. It explains how to determine if something's primary or secondary or tertiary. That's why the essays included also help to clarify in more detail. (My memory may be bad, but at one point I thought that the WP:PSTS section here was its own separate page, but I guess it was moved here since this is where it is most applicable). --MASEM (t) 21:57, 7 June 2013 (UTC)

It appears originally in 2005, where the definition in its entirety is: "Secondary sources present a generalization, analysis, synthesis, interpretation, or evaluation of information or data." Whether it's hearsay or secondhand information, whether the author is independent, and whether a proper editor was involved are irrelevant. WhatamIdoing (talk) 22:23, 7 June 2013 (UTC)

Slim Virgin is on the money with the 'red shirt' example. It is ridiculous to require a secondary source to establish that Descartes wrote "Cogito, ergo sum". In fact, in such cases, a secondary source is an inferior way to establish this point.

The concern is widespread in this thread that allowing quotations from primary sources is a 'slippery slope' toward drawing conclusions that are not supported by the sources. The other side of this is that simply making statements about what authors have said and footnoting them to primary sources is making WP a hearsay presentation compared to an eye-witness (that is, direct) presentation using quotes. If, in fact, quotes are strung together by a WP editor to construct an unsupported point of view, there is plenty of WP policy (in my opinion) to help an editor critique such an attempt. The more serious 'slippery slopes' in using quotations are WP:Undue and WP:NPOV. Brews ohare (talk) 15:25, 8 June 2013 (UTC)

I somewhat agree but as always the exact use of the primary source is critical. If you merely state that Descartes used/wrote that line, then a primary source such as Descartes' (original) writings can be seen as sufficient. However if you instead claim the phrase is attributed to Descartes, Descartes coined the phrase, the phrase was first used by Descartes or the phrase was popularized by Descartes, then the primary source is not sufficient anymore and for those you would need secondary sources.--Kmhkmh (talk) 16:56, 8 June 2013 (UTC)

Kmhkmh: I don't think we somewhat agree; we entirely agree. Your point is supported by the wording "All interpretive claims, analyses, or synthetic claims about primary sources must be referenced to a secondary source". Brews ohare (talk) 17:09, 8 June 2013 (UTC)

But “Descartes wrote "Cogito, ergo sum"” does not an article make. If an article contained just that it would be deleted. If a paragraph contained just that it should be merged with others, others that describe its significance and notability based on reliable secondary sources. There is no prohibition on using primary sources but they cannot be used alone as this example shows.--JohnBlackburne^words_deeds 17:43, 8 June 2013 (UTC)

JohnBlackburne: Not a point of contention - you are discussing WP:Notability, not the topic here. Brews ohare (talk) 18:20, 8 June 2013 (UTC)

Slim... One problem with your red shirt example... you are forgetting that sources can shift their classification over time. A source that might once have been classified as secondary can be re-classified as primary. Consider the following hypothetical: The Anglo-Saxon Chronicle states that "King Ethelred wore a red cloak when he met with the Danes". If we were reading the A-S Chron back in the year 1000, we probably would call it a secondary source (as you describe)... but today? Nope... it's considered a primary source. Same sentence... same source... different classification.
In fact, it really does not matter whether the A-S Chron is primary or secondary. What matters is whether using it is appropriate (or not) in a specific situation or article context. In one context it will be absolutely appropriate, in another it will be highly inappropriate. This applies equally to primary and secondary sources... (although it is easier to inadvertently misuse a primary source). Blueboar (talk) 17:15, 8 June 2013 (UTC)

Blueboar: As you point out, it would be quite appropriate to cite the Anglo-Saxon Chronicle as to the red cloak, regardless of how it is classified. So we don't yet have an example where citing a primary source could be either appropriate or inappropriate depending entirely upon context. Brews ohare (talk) 20:08, 8 June 2013 (UTC)

OK... try these (using my hypothetical of the A-S Chron noting that Ethelred the Unready wore a red cloak to a meeting with a Danish King)...

Appropriate use of primary source: Article: Red Cloaks - Context: "People have worn red cloaks throughout history. English King Ethelred the Unready wore one in the year 998, at a meeting with the Danish King. <cite: A-S Chronicle>" (article goes on to give several more examples of historical persons wearing red cloaks).

This is appropriate because it directly supports the statement, and does so in the context of the broader paragraph.

Inappropriate use of primary source: Article: History of the Danelaw - Context: "As noted in the previous section, Danes considered the wearing of a red cloak an insulting provocation. Ethelred thus inadvertently insulted the Danish King when he wore one to a meeting in 998. <cite: A-S Chronicle>"

This is inappropriate because it only supports the statement indirectly... the A-S Chron says nothing about how the Danes felt about red cloaks (that is a fact apparently cited elsewhere in the article). Blueboar (talk) 21:19, 8 June 2013 (UTC)

Blueboar: Your assessment of the appropriateness of usage is accurate. In the second use, the source says nothing about either the Danish view or inadvertence, and so does not support the statement in most particulars. I don't think this example falls under the statement: "All interpretive claims, analyses, or synthetic claims about primary sources must be referenced to a secondary source" because this misuse is none of these three things. I'd say the source had nothing to do with most of the claims, whether it is a primary or a secondary source. How would you classify this issue? Maybe it should be recommended that when a source supports only an aspect of a statement, it should not be made to appear to be a blanket support? In my experience, this form of misuse is common on WP. Brews ohare (talk) 21:42, 8 June 2013 (UTC)

If you want more examples, there are a number at WP:USEPRIMARY. WhatamIdoing (talk) 23:16, 8 June 2013 (UTC)

Remember, we are not saying primary sources are bad. They just can't be used to build new claims on, but can and should be included to state facts. --MASEM (t) 21:52, 8 June 2013 (UTC)

More specifically, we are especially trying to avoid egregious misuse, like taking a peer-reviewed primary source that says you can kill cancer cells by pouring large amounts of cyanide on them, and then writing that taking cyanide is the main treatment for cancer. It's one thing to say "somebody ran a study" and another thing to draw a conclusion based on this ("and so cancer patients should take cyanide"). WhatamIdoing (talk) 23:16, 8 June 2013 (UTC)

And just in case someone wants a real life example:

Source, 2nd Amendment to the US Constitution

Appropriate use: (Article: Constitution of the United States) - context: the simple statement - "The Second Amendment of the US Constitution guarantees US Citizens the right to bear arms. <cite: Second Amendment>"

This is a straightforward descriptive paraphrase of what the Amendment actually says. No interpretation or analysis is involved.

Inappropriate use: (Article: Gun Control) - context: the simple statement - "The Second Amendment of the US Constitution guarantees US Citizens the right to own assault rifles. <cite: Second Amendment>"

This does involve interpretation (that assault rifles are what the Amendment means by "arms"... etc.)... This statement would require a secondary source (and, given the debates over the issue, even then it would have to be rewritten to be phrased as an opinion, and not simply stated as unattributed fact). Blueboar (talk) 23:42, 8 June 2013 (UTC)

This is an excellent example, Blueboar. The 'appropriate use' is verbatim, while the second seems to be a special case of the first. However, as you point out, whether 'arms' includes 'assault rifles' is rather debatable as there were no such things at the time the constitution was drafted. Consequently, a possibly very technical historical and legal discussion is involved in this simple change. I don't know if there is a simple general statement that covers such things, or if we are left to the devices of conflicting editors to sort such matters out. Brews ohare (talk) 02:51, 9 June 2013 (UTC)

On noting the absence of sources

Apologies if this is a recurring topic; I haven't watched this policy page. I'd like to bring up a situation that I've run into repeatedly, where our policy creates difficulties. What do we do when no usable sources exist concerning a particular point, and where the absence of sources is a fact that it is essential for the reader of an article to know, even if we lack a source that tells us explicitly that no sources exist?

Let me give a concrete example. I recently made some revisions to the article Eigengrau, and on looking over the literature, I realized that the term has completely fallen out of use (it dates from the nineteenth century). There are only around a dozen mentions in the indexed scientific literature, and the last of them was 13 years ago. Now it is surely impossible for our article to serve the reader properly if it doesn't explain that the term is no longer in use -- but precisely because nobody uses it, there is no source that explicitly states that nobody uses it. My inclination is to handle things like this by applying WP:IAR, but sometimes I run into people who don't accept that approach. Do you think there is any possibility of tweaking the policy to deal with such situations? (Note that verifiability is not really at issue here. The statements in question can easily be verified -- it just takes a touch of effort.) Looie496 (talk) 21:21, 24 June 2013 (UTC)

Can you not just source that "visual noise" is a more recent term here (eg proving visual noise and eigengrau are one and the same?) --MASEM (t) 14:18, 25 June 2013 (UTC)

How can I source that "visual noise" is a more recent term? If I don't have a source stating that it's a more recent term, that's OR, isn't it? Looie496 (talk) 14:34, 25 June 2013 (UTC)

Just doing a rough check on Google Scholar, I don't think it's fair to describe Eigengrau as "completely falling out of use", as there are papers from 2000 that use it. So I don't think you can IAR and claim that. And of course, without explicit sources, there's not much else you can do. It is probably best just to say that Eigengrau is related to the term "visual noise" in terms of the phenomena. --MASEM (t) 14:49, 25 June 2013 (UTC)

Not sure I agree with this proposal. I understand the problem exactly, but I think the current implementation of WP:NOR is correct and its application to this sort of situation yields the desired result: it should not be included. Besides, isn't eigengrau used in this journal article from 2009? That points out the problem of going with the proposal. I'm sure there is some review article or textbook out there that covers the use of the terms over time; it just needs to be located. Zad68 14:52, 25 June 2013 (UTC)

Damn it, now I feel like an idiot. I still think my point is valid, but my "example" has blown up in my face. Looie496 (talk) 15:50, 25 June 2013 (UTC)

Looie one thing you're not is an idiot - when I see your name appear on my watchlist I usually think "Thank goodness Looie got to it." Side note: I'm probably going to be hitting you up for a review of an article I'm working on that needs a review from a neuro SME to fix all the mistakes I'm surely making.

Your example was actually good, it's just that it was good for illustrating the danger of the kind of OR you're talking about. If you're a smart guy with a solid handle on the medical research tools available and you can miss that kind of thing, is it a good idea to loosen the rules so editors even less experienced than you can make the kinds of edits we'd be allowing? Zad68 16:00, 25 June 2013 (UTC)

The sentence in question wasn't necessary, either way. Editors can't be aware of every publication made on the subject, so anyone can make that mistake. But if a case came up where a fact had to be said to make the article work, but there were no sources for it, probably let the reader figure it out on their own. Alternatively, be alert for a new article to support it, or in an extreme case mention it as minimally as possible, and hope to find a source. - Sidelight 12 ^Talk 08:56, 26 June 2013 (UTC)

Summarizations based on routine calculations

"Summarization" is a kind of synthesis, so "numerical synthesis" or "summarization of numbers" must also be checked for whether it is "original research by synthesis".

My opinion: when a source offers data and a wikipedist performs only a simple "treatment of numerical data", the result is exact, with no alternative interpretation, and it is simple and reproducible because it uses only routine calculations. --Krauss (talk) 18:38, 1 July 2013 (UTC)

Example-1: Totals and subtotals are complements of numeric table presentation. If the source shows (without any summarization) "1+1+1+1", a Wikipedia article can express it with the summarization "1+1+1+1=4". Expressing only the result 4, which the source does not state explicitly, can be a point for discussion.

Example-2: A table with valid routine calculations and summarizations (generated information). (see onMouse-over hints) Possible discussions: How many decimal places? Show the difference in the first line as "0%" or as null? Include it or not in the average? Does the table need captions explaining each calculated group? etc. So the discussion page can be used, or another wikipedist can correct the generated information.

quant. A	quant. B	Perc. of A	Diff.	Accum.
20	123	16,26%	0,00%	0
40	234	17,09%	0,83%	0,83%
55	300	18,33%	1,24%	2,07%
115	657	17,23%	1,04%
(without background) Source data
(this background) Calculated by wikipedist
(this background) Summarized by wikipedist

List of "valid numerical summarizations":

routine calculations	summarizations
+	totals and subtotals.
+1	counting elements of a table
+ /	average
...	...
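As an illustration of the "routine calculations" in Example-2, here is a minimal Python sketch that recomputes the derived columns from the two source columns. It assumes, from the numbers shown, that "Perc. of A" means A divided by B, that "Diff." is the change from the previous row, and that the last row holds column totals for the source data and averages for the calculated columns. The variable names are illustrative only.

```python
from itertools import accumulate

# Source data from the table above: (quant. A, quant. B).
rows = [(20, 123), (40, 234), (55, 300)]

# Routine calculation: A as a percentage of B, per row (assumed meaning of "Perc. of A").
perc = [100 * a / b for a, b in rows]

# Difference from the previous row; the first row has no predecessor, so it is 0.
diff = [0.0] + [cur - prev for prev, cur in zip(perc, perc[1:])]

# Running total of the differences.
accum = list(accumulate(diff))

# Summarizations: totals for the source columns, averages for the calculated ones.
total_a = sum(a for a, _ in rows)
total_b = sum(b for _, b in rows)
mean_perc = sum(perc) / len(perc)
mean_diff = sum(diff[1:]) / len(diff[1:])  # the first-row 0 is left out of the average

for (a, b), p, d, acc in zip(rows, perc, diff, accum):
    print(f"{a}\t{b}\t{p:.2f}%\t{d:.2f}%\t{acc:.2f}%")
print(f"{total_a}\t{total_b}\t{mean_perc:.2f}%\t{mean_diff:.2f}%")
```

The output matches the table's figures, printed with a decimal point rather than a comma. Note that the 17.23% in the last row is the mean of the three row percentages; dividing the totals (115/657) would give about 17.50% instead, which is exactly the sort of choice that might need agreement on the discussion page.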

Are you familiar with WP:NOTOR? WhatamIdoing (talk) 20:01, 1 July 2013 (UTC)

Thanks a lot! Let see if "Numeric summarization" is one or more of these things, --Krauss (talk) 20:24, 1 July 2013 (UTC)

NOTOR-Simple-calculations: yes, as I said before, it is. But if we do not say so explicitly here, some people will say that it is not, because "only a sum" is not a "big Summation", and "only a multiplication" is not a "big product of sequences"... So we need to state here that it is.
NOTOR-Compiling-information: yes, I think it is a good conceptual reference, "it is a valid summarization if it is for compiling information".
NOTOR Conflict-between-sources: hm... Perhaps a good point for discussion; see articles about Crowd counting and its eternal conflicts... A wikipedist cites ref1 and ref2, where "ref1 says 2000 people" and "ref2 says 6000 people", and so writes in the Wikipedia article "~4000 people (by ref1 and ref2)", which is the average value (2000/2+6000/2)... Or is it more encyclopedic to write "ranging from 2000 (by ref1) to 6000 (by ref2) people"?
NOTOR Translation: yes, that is another good view... "1+1=2", so a Wikipedia article can say "1+1" or say "2"; they are synonymous, no matter what the source says.
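The two ways of summarizing the conflicting crowd estimates can be put side by side in a small Python sketch. The labels `ref1` and `ref2` and the figures are the hypothetical ones from the comment above, not real sources.

```python
# Hypothetical estimates from the example above, keyed by source label.
estimates = {"ref1": 2000, "ref2": 6000}

# Option 1: a single averaged figure, which neither source actually states.
mean_estimate = sum(estimates.values()) / len(estimates)

# Option 2: the range, with each end attributed to its source.
low_ref = min(estimates, key=estimates.get)
high_ref = max(estimates, key=estimates.get)
low, high = estimates[low_ref], estimates[high_ref]

print(f"~{mean_estimate:.0f} people (by ref1 and ref2)")
print(f"ranging from {low} (by {low_ref}) to {high} (by {high_ref}) people")
```

Both are routine arithmetic; the difference is that the range repeats only figures the sources give, while the average produces a number that appears in neither.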

I see nothing wrong with cataloguing the heights of the US presidents, plotting the height against the presidential number (Washington = 1, Obama = 44) and stating their average height, the standard deviation of their height, the average increase in height of each president in respect of his predecessor (slope of height vs number) and the correlation coefficient of the slope calculation. This can be done using built-in EXCEL functions, so can hardly be original research. The fact that half the Wikipedia readership does not understand the terminology that I used is immaterial - the manipulation is 100% routine. It is however original research if, in the article Heights of presidents and presidential candidates of the United States, I discuss the implications of these figures. On the other hand, if I were writing an article How to interpret statistical data, I see nothing wrong in using the same data as a real life example to explain how to use stats as I am not promoting a novel idea. Martinvl (talk) 21:04, 1 July 2013 (UTC)
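For readers unfamiliar with the terminology, the four quantities named above can be sketched in a few lines of Python instead of a spreadsheet. The heights below are hypothetical placeholder values chosen for illustration, not real presidential data, and the variable names are assumptions of this sketch.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical placeholder data, NOT real presidential heights.
numbers = [1, 2, 3, 4, 5]               # presidential number
heights_cm = [170, 172, 171, 175, 177]  # height in centimetres

# Average height and sample standard deviation.
mean_height = mean(heights_cm)
sd_height = stdev(heights_cm)

# Sums of squares and cross-products around the means.
mean_number = mean(numbers)
sxy = sum((x - mean_number) * (y - mean_height) for x, y in zip(numbers, heights_cm))
sxx = sum((x - mean_number) ** 2 for x in numbers)
syy = sum((y - mean_height) ** 2 for y in heights_cm)

# Least-squares slope of height against number, and Pearson correlation coefficient.
slope = sxy / sxx
r = sxy / sqrt(sxx * syy)

print(f"mean={mean_height:.1f} cm, sd={sd_height:.2f} cm, slope={slope:.2f} cm per president, r={r:.3f}")
```

Every step here is a standard formula applied mechanically; the sketch stops at the numbers and says nothing about what they imply, which is where the comment above draws the line.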

First two sentences of section "Primary, secondary and tertiary sources"

I think that the beginning two sentences of the section Primary, secondary and tertiary sources could be made clearer. Here's the current version.

"Wikipedia articles should be based on reliable, published secondary sources and, to a lesser extent, on tertiary sources. Secondary or tertiary sources are needed to establish the topic's notability and to avoid novel interpretations of primary sources, though primary sources are permitted if used carefully."

It looks like the first sentence is saying that secondary and tertiary sources should be exclusively used, prohibiting the use of primary sources, but then the end of the second sentence says that primary sources are OK. I think that the phrase "primary sources are permitted if used carefully" should be moved closer to the front for more clarity, and that what is meant by "carefully" should be clarified. Here's a revised version of the first two sentences that includes some other rewording too.

Wikipedia articles should be based mainly on both reliable, published secondary sources and, to a lesser extent, tertiary sources. Primary sources are permitted if one is careful to avoid original research. Secondary or tertiary sources are useful for establishing the topic's notability and avoiding novel interpretations of primary sources.

--Bob K31416 (talk) 19:38, 16 June 2013 (UTC)

The problem is selective quotation to present a conclusion in Wikipedia's voice. Maybe that needs to be spelled out? ----Snowded ^TALK 21:41, 16 June 2013 (UTC)

If so, that would be for somewhere other than this beginning part of the section, which is a more general statement in both the present and suggested versions. --Bob K31416 (talk) 00:12, 18 June 2013 (UTC)

I agree that the first sentence can be confusing to those who stop there and don't read the rest of the PSTS section. The problem is that the sentence is talking about appropriate sourcing for an entire article (when viewed as a whole) ... it isn't talking about sourcing specific information within an article. But too many editors read it as applying to specifics (which leads to the mistaken idea that primary source = bad source). Now... can we find a way to restate the sentence so we keep the intended meaning and yet avoid the misunderstanding.

As a start to the discussion... what about something like:

Articles, when viewed as a whole, should rely on secondary sources for the bulk of their information; while primary sources can appropriately be used in some specific instances, they are best used in conjunction with secondary sources.

Something like this would make it much clearer. Blueboar (talk) 00:42, 18 June 2013 (UTC)

Your version seems to be including new ideas, whereas my version is clarifying the ideas that are already there. For example, a new idea in your version is, "primary sources...are best used in conjunction with secondary sources." Not even sure what that means. In my version I tried to use the wording that was already there as much as possible, whereas your version is completely different from the wording of the current version, which essentially throws away the work that went into the wording of the current version. --Bob K31416 (talk) 01:47, 18 June 2013 (UTC)

I don't mind adding "based mainly on"; "based on" has never meant "composed exclusively from".

I also think that our explanation is lousy. True, we need secondary sources for notability. But using a secondary source doesn't actually "avoid novel interpretations of primary sources" (except perhaps very indirectly), and we need secondary sources for determining due weight/anti-cherry-picking, which this doesn't mention. WhatamIdoing (talk) 02:03, 18 June 2013 (UTC)

Thanks. Looks like you noticed some of the reasons for some of my changes.

Re the other part of your message, "we need secondary sources for determining due weight/anti-cherry-picking" — Due weight/cherry-picking could be a problem with secondary sources too. --Bob K31416 (talk) 02:58, 18 June 2013 (UTC)

I agree; however, the first sentence should be reworded differently than the way you suggested, to be clearer. I really think it was worded like that to weasel in the implication that primary sources weren't allowed, while then saying primary sources are allowed. The realization is that primary sources shouldn't be omitted. It is really unclear; whether you read it fast or slow makes a difference in the interpretation. - Sidelight 12 ^Talk 04:58, 18 June 2013 (UTC)

In that regard, note that my version is an improvement over the current version of the first sentence since: "mainly" was added to avoid the impression of only secondary and tertiary sources; and my version has the phrase "primary sources are permitted" directly following the first sentence instead of appearing later, as in the current version; yet my version qualifies that phrase regarding OR, i.e. "Primary sources are permitted if one is careful to avoid original research." --Bob K31416 (talk) 11:15, 18 June 2013 (UTC)

I also agree that secondary sources can be associated with cherry picking. The secondary source could use cherry picked data, but I propose doing nothing about that part. The point is, secondary sources are as vulnerable to cherrypicking as primary sources. See WP:PRIMARYNOTBAD. - Sidelight 12 ^Talk 05:08, 18 June 2013 (UTC)

That is only an essay, and hence a personal opinion. On the other hand: while an article can still be unbalanced when using only secondary sources, consensus is that this is much harder than when using primary ones. It would be similar to using or not using sources: we could have a great article without any sources, but it would be much harder. We have decided that occasionally some ideas do not need a source (since they are common knowledge), but otherwise we should cite. VERY (VERY, VERY, VERY) occasionally a primary source is valid, but IN GENERAL secondary sources are required. Moreover, your sentence that "The secondary source could use cherry picked data" is irrelevant to us: what has to be neutral and balanced are our articles, not the references we base them on. References have to be reliable, published secondary sources, not neutral.--Garrondo (talk) 09:21, 18 June 2013 (UTC)

Side note: "only an essay" is a poor objection. Is WP:BRD "a personal opinion"? How about WP:Use common sense? People get blocked over WP:Tendentious editing—is that "a personal opinion"? The WP:Five pillars is "only an essay", too. I suggest reading WP:PGE. WhatamIdoing (talk) 10:37, 20 June 2013 (UTC)

The problem here is that Original Research isn't really about which type of source you use... it's about how you use it. That basic concept is what is missing from the opening sentence. The reason why we caution people about using primary sources is that they are easy to misuse ... but if you use them appropriately they are fine (and indeed in a few situations they are actually better than secondary sources). Blueboar (talk) 12:40, 18 June 2013 (UTC)

Agreed, but indeed 99.999999999% of the time a secondary is better, and to use a primary is to give undue weight to its conclusions (why was that specific source chosen and not all the other existing ones?), advance an agenda and/or make OR. So IMO the emphasis on secondary sources is not even strong enough.--Garrondo (talk) 14:25, 18 June 2013 (UTC)

Bob K31416, the problem with your proposed change of wording is that it can be read as saying that while secondary and tertiary sources have to be published, it is OK to use unpublished primary sources. One of the planks of this section is that unpublished primary sources may not be used (this is crucial in many disciplines if we are to prevent OR). -- PBS (talk) 13:52, 18 June 2013 (UTC)

Note that the current version in policy has what you are calling a problem, so my version is not introducing that. Also, with respect to primary sources, my version adds the phrase "careful to avoid original research", so it's an improvement with respect to what you mentioned. --Bob K31416 (talk) 15:24, 18 June 2013 (UTC)

Easy enough to fix... just specify published primary sources. Blueboar (talk) 14:33, 18 June 2013 (UTC)

Problem is not from unpublished primary sources, but the misuse of published ones.--Garrondo (talk) 14:45, 18 June 2013 (UTC)

That's probably true in most cases, and my version has the improvement of mentioning with primary sources, "careful to avoid original research". --Bob K31416 (talk) 15:30, 18 June 2013 (UTC)

Well, both are a problem. The second can be corrected by better explaining how to use various kinds of sources appropriately. Blueboar (talk) 15:34, 18 June 2013 (UTC)

If you think that adding "published" is worthwhile, then see if it has consensus and make that change in the current version of policy, and I will incorporate it here. --Bob K31416 (talk) 16:02, 18 June 2013 (UTC)

Anyhow, at the beginning of this section is my offering. If anyone wants to implement it, as far as I'm concerned, feel free to do that. I'll be leaving this discussion now. --Bob K31416 (talk) 17:45, 18 June 2013 (UTC)

That's a very good idea to use the word published. "Reputable published primary sources" also works. Published primary sources are more common than realized. I support this proposed change. Peer reviewed can be used for scientific publications, and if the words peer-reviewed gets used, there probably needs to be an additional rule for that. - Sidelight 12 ^Talk 01:33, 19 June 2013 (UTC)

Bob K31416, you may need to come back to vote on your proposed change, obviously. - Sidelight 12 ^Talk 01:36, 19 June 2013 (UTC)

Disagree: I believe emphasis on secondary sources is best explained with current wording.--Garrondo (talk) 07:24, 19 June 2013 (UTC)

There are occasions when an article is dedicated to primary sources that are themselves self-explanatory. Such articles include Jefferson's writings such as Declaration of the Causes and Necessity of Taking Up Arms, Plan for Establishing Uniformity in the Coinage, Weights, and Measures of the United States and other articles such as International System of Units which is based on this publication. These articles would be meaningless if they did not make extensive use of the original text. Martinvl (talk) 08:26, 19 June 2013 (UTC)

I don't think that it's a good idea to introduce the concept of peer review. Peer review is good, but so is normal editorial oversight. Preferring peer review turns into "you can't use that anatomy textbook for basic facts, because it's not 'peer reviewed'. We'll have to stick with my cherry-picked pay-to-play journal article with a sham peer-review process, even though it says that humans normally have three noses." WhatamIdoing (talk) 10:46, 20 June 2013 (UTC)

(So I'm a trouble-maker.) I've always thought that the distinction made between primary and secondary sources is more trouble than it is worth, and misses the point about what we want to allow and what we want to disallow. All sources can be misused. Novel interpretations are just as possible to make of a secondary source as of a primary source. Both primary and secondary sources can be either reliable or unreliable. Both primary and secondary sources can be either published or not published (though the distinction is not easy to define in these digital days). All sources are reliable for some things and not for other things. Really, once we are clear that novel interpretations are not allowed, that sources should be published, and that sources can only be used for information they are reliable for, what is left of the primary/secondary distinction that we actually need? Zero^talk 09:29, 19 June 2013 (UTC)

For historical articles I think there is a difference. If someone accurately summarises secondary sources, then they are creating a tertiary source (it is not OR). A summary of multiple primary sources that has not been made before and published in a reliable secondary source is novel interpretation of those primary sources and therefore OR. -- PBS (talk) 13:07, 19 June 2013 (UTC)

Historical articles are my specialty, and I disagree with you. There is no prohibition against mere summarising. What we do have is WP:SYNTH, that forbids us "to reach or imply a conclusion not explicitly stated by any of the sources". That is typically easier to violate in the case of primary sources, since a secondary source is more likely to have already drawn the conclusion we seek. However, application of WP:SYNTH produces the correct result in both cases without the need to decide whether the sources are primary or secondary. I contend that we don't need that division. Zero^talk 23:37, 19 June 2013 (UTC)

Another important issue that many fail to see is undue weight. Imagine that we have 10k primary articles as possible refs for an article. The decision of which ones are included in the article is critical, and by itself a form of original research. A secondary source has already done that first selection on which primary sources are relevant and which ones are not, and secondary sources also summarize consensus among primary ones. Moreover, suppose an editor includes a primary one that is not mentioned in any of the secondary ones: in such a case, simply using that source (even if perfectly quoting from it) gives undue weight to a non-notable point of view. I completely disagree with the statement that we do not need the distinction: the use of secondary sources is critical to get balanced articles which are not mere laundry-lists of primary sources (which is already common in many articles).--Garrondo (talk) 07:20, 20 June 2013 (UTC)

We have guidelines about weight. And about balance. We can all agree that articles that consist of laundry lists of cherry-picked sources are undesirable. But couldn't we make that same judgement even if we never heard the words "primary" and "secondary"? Zero^talk 13:08, 20 June 2013 (UTC)

At some level, I think that Zero is right: we have overemphasized this issue. This is partly because we had so many editors who thought that "secondary" was a fancy way to spell "independent" for a long time. It's also because there are some definitions of secondary that are so broad that you really don't want to use anything primary. For example, there's that historian who said that anything already published is a secondary source. Under that odd definition, then WP:V outright bans the use of primary sources. But under our definition, which essentially is that secondary sources are an intellectual product that involves analysis, comparison, or some other significant intellectual transformation of primary sources (so not mere summary, quotation, citation, or description, even though there are a few academic areas, like genealogy, that use such a definition), secondary sources are highly desirable, and primary sources can also be acceptable. WhatamIdoing (talk) 10:46, 20 June 2013 (UTC)

The boundaries between primary, secondary and tertiary sources are a great deal less clear-cut than Wikipedia's simple declarative treatment of the subject would have you believe. What's this? Since it's an interview, there are Wikipedians who would have you believe that it's a primary source (and imply that it's therefore not to be trusted). But because it's been edited by journalists and bookended by descriptions of the man and his accomplishments, it's in a very different category from a (hypothetical) simple transcript of the man talking about himself. I view the whole area as quite problematic and although I do think we need to discuss it, I feel it should be (a) given less prominence and emphasis, and (b) tweaked for added caveats and nuances.—S Marshall T/C 11:26, 20 June 2013 (UTC)

Yes, I agree. Part of why I never liked the way rules are based around primary vs secondary is that so many fundamentally different types of things are primary sources even by our definition. A declassified raw intelligence report, a travelogue written by the traveler, and an original research article in a physics journal are all primary sources but they are so different that lumping them together seems pointless. Much better to say that the intelligence report is unreliable because only expert analysts can assess such things in context, the traveler's impressions can be cited with attribution if reliable sources consider the traveler to be citable (which is weaker than requiring a reliable source to have quoted the same impressions), and that keeping science articles up to date with the very latest research is called splendid editing. Zero^talk 13:08, 20 June 2013 (UTC)

Objections to proposal, switched order, added reliably published

To - 11:05, 21 June 2013 (UTC)

"Reliably published primary sources are permitted if used carefully, and if one is careful to avoid original research. Secondary or tertiary sources are useful for establishing the topic's notability and avoiding novel interpretations of primary sources."

From

"Secondary or tertiary sources are needed to establish the topic's notability and to avoid novel interpretations of primary sources, though primary sources are permitted if used carefully."

- Sidelight 12 ^Talk 01:39, 21 June 2013 (UTC)

At the moment it is not clear to me what we gain/lose with the change.--Garrondo (talk) 07:23, 21 June 2013 (UTC)

For clarification. The order is switched to move a sentence part closer to the front, and "though primary sources are permitted if used carefully" becomes "Reliably published primary sources are permitted if used carefully, and if one is careful to avoid original research." The proposed restriction and existing restriction are used here. - Sidelight 12 ^Talk 11:05, 21 June 2013 (UTC)

Separating proposal, whether or not rewording

"Wikipedia articles should be based ~~mainly~~ on ~~both~~ reliable, published secondary sources and, to a lesser extent, tertiary sources."

needs separate consensus. Let's work that separately.- Sidelight 12 ^Talk 01:39, 21 June 2013 (UTC)

At the moment it is not clear to me what we gain/lose with the change.--Garrondo (talk) 07:23, 21 June 2013 (UTC)

The second part can't really be voted on directly until we have the proposed changes. I see that now. - Sidelight 12 ^Talk 11:05, 21 June 2013 (UTC)

The proposed wording seems useful to me:

"Reliably published primary sources are permitted if used carefully, and if one is careful to avoid original research. Secondary or tertiary sources are useful for establishing the topic's notability and avoiding novel interpretations of primary sources."

It separates the purposes of the various types of source, distinguishing between 'presenting' material and establishing 'notability'. In my experience this distinction often escapes the notice of critics.

The comment by S Marshall deserves attention in another context - in scholarly work it is common for debates to rage for decades. The distinction between primary and secondary sources is entirely bogus. The originators of some ideas are hard to identify, and various arguments appear in all types of sources: journals, books, encyclopedias; and all are written by individuals, who may or may not have a balanced perspective. Like WP itself, these works are not useful for providing a definitive view of matters. The most one can hope for from any of these categories of source is some clues as to what are the various facets of a topic, and some of the pros and cons.

A WP article should serve to make the reader aware of the many currents flowing, but it may not be able to say how the tide is running. The reader of a WP article has to make their own personal decision about how the cookie crumbles. It is unrealistic to expect WP to find 'the best' sources to present a topic. To echo in part the comments above by Zero0000, the primary-secondary distinction in presenting material on WP (in the context of scholarly work as opposed to news events) is a crock. The governing principle is WP:NPOV. Brews ohare (talk) 14:56, 21 June 2013 (UTC)

The current wording of WP:SECONDARY may not be perfect, but the suggestions do not seem an improvement as they do not address the fundamental reason for WP:SECONDARY. While many of the statements in this discussion are correct in general, at Wikipedia there is a special problem because we have to rely on sources, and that allows an editor to unknowingly or purposefully cherry pick sources that appeal to them. That problem cannot be eliminated, but the problem would be much worse if editors were able to select primary sources to assert some general conclusion (consider the creationists who would pick primary sources to claim that evolution is bogus). The policy requires that general conclusions be verifiable in secondary sources in order to reduce the amount of original research that occurs. A policy is useless if it says something like "you can do X if careful". That means that if I do X it is ok because I am careful, but if someone else does X it is bad because they are not careful. The current "if used carefully" is reasonable as the emphasis is clearly that articles should be based on secondary sources. On various noticeboards, the comment is often made that primary sources are fine for illustrating a conclusion from a secondary source—that is careful use. Johnuniq (talk) 01:49, 22 June 2013 (UTC)

Agree with Johnuniq here. Many editors are more than capable of showing some judgement in use of primary sources, but others wish to use Wikipedia as their vehicle to take part in "debates" that "rage for decades", to quote Brews. The current wording places some check on that ----Snowded ^TALK 02:37, 22 June 2013 (UTC)

Made half of the proposed change. There seem to be no objections to it. Someone questioned it, but for this part it adds more restriction to what they agree with. Meaning basically unchanged; emphasis was put on the fact that primary sources are allowed if "reliably published" and if there is no original research. - Sidelight 12 ^Talk 03:07, 22 June 2013 (UTC)
I do not see any advantages in your proposal; I believe the current wording is clearer. Others think similarly, so please refrain from changing policy before clear consensus is reached. ~~I had asked for which advantages your proposal you believed brought and you did not even answer, so do not say that there was consensus.~~--Garrondo (talk) 12:47, 23 June 2013 (UTC)

You didn't ask anything; you said it wasn't clear to you. I did in fact explain it, so I responded to your statement. If you didn't get the question answered that you wanted, it's because you didn't ask anything. There was streamlined consensus: all the editors agreed, except that you said it wasn't clear to you, which isn't a clear objection. I explained it to you. In fact nothing changed, only the clarification of the same thing. No one objected until now. - Sidelight 12 ^Talk 13:03, 23 June 2013 (UTC)

You didn't ask that question, so don't say you did, when you didn't. And besides that did get answered anyway. - Sidelight 12 ^Talk 13:07, 23 June 2013 (UTC)

"I had asked for which advantages your proposal you believed brought": you didn't ask that. You said, "At the moment it is not clear to me what we gain/lose with the change." And I responded to that STATEMENT, fully. So don't throw around accusations. - Sidelight 12 ^Talk 13:20, 23 June 2013 (UTC)

While I have to note that I had not seen your answer, and hence you are right that I was not fair (I have crossed out my comment and I am sorry for it), I still do not clearly see any advantages from your proposal, partly because I am lost with all this back and forth. Since I was not the one to revert, the issue that there is no clear consensus is still valid. I would recommend starting a new section with the smallest possible change and discussing it. If there is no clear consensus it would be better to leave it as it is.--Garrondo (talk) 19:03, 23 June 2013 (UTC)

It's OK. I tried to break it down into smaller pieces. I thought everyone agreed that it needed to be clearer, and the edit would have been fine. The specific objection to what has been proposed since the beginning wasn't made sooner. - Sidelight 12 ^Talk 02:32, 24 June 2013 (UTC)

Second part of proposal, changing from "Wikipedia articles should be based on reliable, published secondary sources and, to a lesser extent, on tertiary sources." This move is more controversial, so each objection should be weighed carefully, and we should wait about a week after a decision before making the change, so that no one feels slighted and future editors can relate to this. Try to compromise somewhat. Suggestions and comments welcome.
- "Wikipedia articles should be based mainly on both reliable, published secondary sources and, to a lesser extent, tertiary sources."
- "Wikipedia article topics should be mainly based on both reliable, published secondary sources and, to a lesser extent, tertiary sources."
- "Wikipedia articles should be based on third party sources."

- Sidelight 12 ^Talk 03:07, 22 June 2013 (UTC)

In regard to Johnuniq's comment that WP cannot support a policy that "allows an editor to unknowingly or purposefully cherry pick sources": of course, that is a risk that every WP article runs. But restriction of the use of primary sources does not prevent it. If primary sources were banned outright, which would make the construction of WP impossible, one could still cherry pick the remaining classifications to the same end. An even more noxious way to cherry pick, which can be conscious or unconscious, and which also is very prevalent on WP, is to cull sources for statements taken out of context. The cherry-picking remedy is WP:NPOV, and that is much more easily enforced and less easily blown up into unending argument than differences over whether a source is primary or secondary or reliable, or whatever. Brews ohare (talk) 17:10, 22 June 2013 (UTC)

I think "Reliably published primary sources are permitted if used carefully, and if one is careful to avoid original research." is less than helpful as this is the policy that seeks to explain how to "avoid original research". I think "Secondary or tertiary sources are needed to establish the topic's notability and to avoid novel interpretations of [reliable published] primary sources." is much better. -- PBS (talk) 10:13, 23 June 2013 (UTC)

Yes, it is. But it remains a problem that in the context of academic topics, novel interpretation is best avoided using WP:NPOV because it is bias that is the problem, not the type of source. Novel interpretation is to be avoided when it is the WP editor's novel interpretation. If primary sources can be found for a point of view, it is not for us to judge its novelty beyond requiring a reputable publisher. Just be sure that all sides are presented and sourced. Brews ohare (talk) 14:55, 23 June 2013 (UTC)

The suggested wording does cover the idea that a secondary source can present a novel interpretation which we can then include in WP, while a novel interpretation based only on primary sources is not permitted (we want sources so as "to avoid" that latter situation). The NPOV aspect is a separate but important matter, but outside the scope of this policy, so perhaps a line to NPOV should be included here. --MASEM (t) 15:07, 23 June 2013 (UTC)

The current wording, "to avoid novel interpretations of primary sources", is not "to avoid novel interpretation by primary sources". This was discussed at great length over the example of Wellington and "the nearest-run thing ..." see here in the archives. -- PBS (talk) 18:15, 23 June 2013 (UTC)

An excellent point, PBS, although perhaps a wording that does not rely on a single preposition would be less likely to be misread. Brews ohare (talk) 18:23, 23 June 2013 (UTC)

Masem: Maybe we should discuss this point a bit further. A secondary source, I suppose, is something like the Stanford Encyclopedia of Philosophy or the Internet Encyclopedia of Philosophy, as two examples. The articles in these works are written by a single author (as are most encyclopedia articles) and their objectivity is that of the author. Now we could also look at an edited book of essays like David Chalmers, David Manley, Ryan Wasserman, ed. (2009). Metametaphysics: New Essays on the Foundations of Ontology. Oxford University Press. ISBN 0199546045. This collection is very much like an encyclopedia: it has editors, for example, and contains articles by individual authors. And we could consider journal articles by individual authors, subject to peer review and to editors again. An example would be Matti Eklund (2013). "Carnap's Metaontology" (PDF). Noûs. 47 (2): 229–249. doi:10.1111/j.1468-0068.2011.00830.x. Now, personally, I see no difference in any of these sources as far as reliability or parochialism. In fact, some of these journal articles will be simply collected by some editor and published as a book on some particular topic. The protection WP needs is against a one-sided presentation, not the novelty (in our own inexpert opinion) of some published author's approach. The protection WP needs is provided by WP:NPOV. Using WP:OR to deny the use of any of these sources on the basis that one or the other is a greater risk to original research is not sensible, and curtails a full presentation of those topics or sub-topics too specialized to appear in a textbook or an encyclopedia. Would you agree? Brews ohare (talk) 18:23, 23 June 2013 (UTC)

All I'm saying is that NPOV is the policy where the balance of viewpoints (if one is needed) is discussed in depth. The only aspect where NOR has a say is in the essence of trying to create a counter viewpoint within the light of NPOV by synthesizing the other one from primary sources alone. If you have reliable secondary sources creating novel interpretations, then the proposed language still works just fine, in so much as NOR's scope merits. It is possible that novel interpretations presented through secondary sources may not be appropriate for inclusion due to NPOV, but that's not NOR's problem to worry about: the reason to exclude them would be an imbalanced viewpoint and not the novelty of the idea. --MASEM (t) 19:09, 23 June 2013 (UTC)

And just an FYI, there has been a historical problem with editors' overuse of the Stanford Encyclopedia of Philosophy, which is more (as Brews says) a collection of original essays than a secondary source. If anything we need to tighten up on some of these ----Snowded ^TALK 19:18, 23 June 2013 (UTC)

Masem: Maybe we are talking past each other. My idea of NPOV is that it says each and every published side of a debate should be presented (with due weight). That has nothing to do with WP:SYN, which is about a WP editor going beyond all sources (not just primary sources) to invent their own (unpublished and unsourced) opinion. A synthesis by a WP editor has really nothing to do with primary, secondary, tertiary or whatever sources. It has to do with having zero sources. It has to do with going beyond the sources to say what you (the WP editor) want to say. It may be that the author of any category of source has their own point of view; it is not for the WP editor to decide that published opinion is OR; only a WP editor's unsupported opinion is OR. There seems to be some confusion in this thread that somehow 'primary' sources are more prone to synthesis than other types of source. I don't see any reason to think that way. Do you share this view? Brews ohare (talk) 20:23, 23 June 2013 (UTC)

Any source can be a problem for an editor creating novel interpretations: an editor could use two unassociated facts in two secondary sources and come up with an inappropriate interpretation. The point being clarified is that if there is an interpretation being done of sources, we must source to a secondary (and sometimes tertiary) source that makes that interpretation for us. We cannot use primary sources at all to support novel interpretation, but to be clear, novel interpretations are not a symptom limited to primary sources, if that is what you are getting at. --MASEM (t) 20:39, 23 June 2013 (UTC)

It is possible for an editor to exceed the source no matter what class of sources he uses. It is possible for an editor to add up two or three sources to get something that isn't in any source, no matter what class of sources he uses.

However, it is much easier and much commoner for editors to make these mistakes when using primary sources like "Effect of accidentally pouring fruit juice on cancer cells: an uncontrolled experiment" or "Personal experience: I cured my skin cancer by eating potatoes and dancing in the moonlight (well, and also with surgery)" than when using secondary sources like "Systematic review and meta-analysis of corticosteroids for accelerating fetal lung maturation". WhatamIdoing (talk) 21:34, 23 June 2013 (UTC)

So I guess we are on the same page here - the problem on peoples' minds is synthesis by a WP editor. The question is: What does restricting the use of primary sources have to do with that? WhatamIdoing thinks secondary sources are less likely to lead to synthesis. I guess the idea is that a review article will cover several angles, and the WP editor might conclude that there are a variety of facets to an issue and back off their own interpretation. But if a WP editor is prone to extrapolate beyond a source, though, the type of source is incidental. If synthesis happens, WP:SYN allows any WP editor to challenge a view that is unsupported, and my feeling is that such a challenge should be based upon there being zero support, not upon limitations on using only a primary source in support. What say you all? Brews ohare (talk) 22:29, 23 June 2013 (UTC)

We're agreeing the sources are incidental to making novel interpretation and original research. I think the point is that, if you know and source Fact A, and know and source Fact B, and there is a possible conclusion between Fact A and Fact B, then save for the most trivial cases (e.g. what WP:CALC allows), the only way you can associate Facts A and B is if a secondary/tertiary source does that for you. That's what the proposed language is saying in not as many words. --MASEM (t) 22:36, 23 June 2013 (UTC)

"if one is careful to avoid original research" was proposed since the beginning of this section, and this objection wasn't made sooner. The wording I made was fine. It was redundant to have what was in the title, because that's where people think more care should be emphasized. The proposed alternate is worse. "If one is careful to avoid original research" could be replaced with "if one is careful to avoid synthesis, and interpretations." - Sidelight 12 ^Talk 02:42, 24 June 2013 (UTC)

Primary sources allow for Original Research, in other ways

"the problem on peoples' minds is synthesis by a WP editor" That is not the only issue with Original Research. Primary sources allow for Original Research in other ways. For example, in Britain there is the 30 years rule, under which many secret government papers are released to the public. Let us suppose that one of those papers contradicts what is in all modern histories of an event. A Wikipedia editor should not quote that paper if it rubbishes the accepted history (e.g. that the British Government did not have any prior warning of the incident, when the newly published cabinet papers show they did), because that is OR; the article should not include the new information until this information is absorbed into a new secondary publication. Normally these sorts of sensational discoveries are reported on in the news media; what should not happen is that a Wikipedia article becomes a news item because it is the first to publish such a revelation.

Another example. There is a process to the integration of primary sources into the historical record. It may be that a primary source is found by an historian researching historical archives for information for an historical biography (or whatever); this will then appear as a footnote in the historian's publication. But another method for the publication of primary sources is as a catalogue of manuscripts (e.g. from boxes of papers in the attic of a stately home), and those published catalogues are then used by historians to help write new history papers. It may be that in those published catalogues are papers that have not been included in any published history. Quoting such a source may be Original Research if the fact mentioned is not published elsewhere and it introduces, without synthesis, a novel piece of information. For example, during his escape after his defeat at Worcester, King Charles II passed through a village where he had an encounter with a smith. It has long been speculated that this village was Bromsgrove (and it is included in some secondary sources as a fact based on reasonable deduction from what is known of his route, the roads on the route, the location of Bromsgrove and the King's description of the village). However, Bromsgrove is not named in the primary sources. If a paper is sitting in a published archive somewhere conclusively proving that he did, Wikipedia is not the place to first include that fact based on a primary source that names Bromsgrove, because discovery of such a paper will be Original Research. -- PBS (talk) 12:44, 24 June 2013 (UTC)

Your two examples appear to be cases of undue weight, rather than original research.

Also, you wrote, "If a paper is sitting in a published archive somewhere conclusively proving that he did, Wikipedia is not the place to first include that fact based on a primary source that names Bromsgrove, because discovery of such a paper will be Original Research." — Actually, in your example, Wikipedia isn't the first place that includes that fact because the first place is the primary source. Also, it seems like you are using your own personal definition of original research, rather than Wikipedia's, when you wrote, "discovery of such a paper will be Original Research." The word "discovery" in this policy is referring to something that doesn't appear in any published source, rather than "discovering" a published source. Here's the Wikipedia definition of original research from the beginning of the lead of WP:NOR:

"The phrase "original research" (OR) is used on Wikipedia to refer to material—such as facts, allegations, and ideas—for which no reliable, published sources exist."

--Bob K31416 (talk) 14:31, 24 June 2013 (UTC)

Original research is not supposed to be done by the Wikipedia editor, this doesn't apply to the published source. You're introducing ideas that change the whole meaning, when we were trying to clarify and emphasize something. This is a nuisance, something wasn't objected to, and all of a sudden you want to object to it, then say a bunch of philosophy that you didn't say before that completely attempts to change the guidelines to something alien to Wikipedia. - Sidelight 12 ^Talk 16:23, 24 June 2013 (UTC)

It is another reason for keeping the concept of Primary and Secondary sources separate, or, Bob K31416, are you suggesting that a primary source that to date has only been published in a catalogue and not analysed by secondary sources, can be used to contradict the established history of an event? If primary sources are used that way it would seem to me to be a classic example of Original Research and is covered by "do not ... evaluate material found in a primary source yourself". As to the other point, Sidelight12, what exactly is it that you are trying to clarify? Because from this conversation it seems to me that the proposed new wording is a change in meaning, not a clarification. -- PBS (talk) 08:12, 25 June 2013 (UTC)

Re "are you suggesting that a primary source that to date has only been published in a catalogue and not analysed by secondary sources, can be used to contradict the established history of an event?" — No. As I wrote at the beginning of my last message, "Your two examples appear to be cases of undue weight, rather than original research." --Bob K31416 (talk) 13:11, 25 June 2013 (UTC)

I think Bob's right here.

We do sometimes find it DUE to contrast the views of primary and secondary sources. The most typical case is for BLP reasons: "Secondary Source says Bill smoked marijuana. However, Bill says on his blog that it doesn't count because he didn't inhale." In other cases, we find it preferable to ignore the primary source, and in still others to omit all of them. But this is a decision of DUE weight, not of making up unpublished ideas. WhatamIdoing (talk) 08:41, 26 June 2013 (UTC)

"though primary sources are permitted if used carefully." proposed-ly changed to "Reliably published primary sources are permitted if used carefully, and if one is careful to avoid original research." What was obviously made clearer is emphasis that primary sources are allowed. All that was added was to be careful to avoid original research, which was proposed since the beginning and suddenly you want to object to that. Stop playing games. The page is about avoiding original research, and emphasis was put right there to "carefully" avoid it when dealing with primary sources. - Sidelight 12 ^Talk 08:56, 26 June 2013 (UTC)

UNDUE is in Wikipedia:Neutral point of view for a reason that it is to do with bias and it is appropriate that it is in the NPOV policy. In the examples I have given above there is no bias involved, this is about the use of primary sources to do original research. This is also the reason for the use of "published" in the primary sources section (something that I have had to argue in favour of keeping more than once and to restore on one occasion). The problem is that as with the Bromsgrove example above it is possible that historians have overlooked a published historical manuscript (or have silently dismissed it). It is not up to Wikipedia editors to "enhance" the historical account by introducing such information if it contradicts all known relevant secondary sources (as opposed to helping to balance the view of competing secondary sources -- which is covered by UNDUE). This was once sort of covered in this policy by "or, in the words of Wikipedia's co-founder Jimmy Wales, would amount to a 'novel narrative or historical interpretation.'" -- although read in the context of the sentence in which it was placed it could be seen as a reiteration of UNDUE. The phrase was removed by SV in a large edit back on 18 December 2007, without AFAICT any discussion over its removal. Whatever wording is used, this policy ought to make it clear that publishing the type of information I have highlighted with two hypothetical examples is restricted by this policy. -- PBS (talk) 13:31, 27 June 2013 (UTC)

PBS, it appears to be your belief that if I find a published primary source—perhaps a diary written during the 19th century, and some magazine decided that it would make good, cheap filler—and I use that published source to write something fairly trivial, like, "According to the recently published diary of Daisy Maizy, the famous Theodore Thespian had dinner with her and the rest of her family in the Maizy Historic Home on 23 November 1834" that—even though this information is directly and obviously in the published diary—then I have committed the sin of original research, because no historian had written about it.

Do I understand you correctly? WhatamIdoing (talk) 15:32, 27 June 2013 (UTC)

I have pondered on this, hence the delay in replying. It is not something I think can have a simple rule like that of "Primary sources must have been reliably published". For example, it would not be original research to use an article like "Wartime reports debunk Speer as the Good Nazi", or those I mentioned that are published in newspaper articles when released under the 30 years rule, although newspaper articles about newly discovered primary sources are always in danger of the Hitler Diaries type of forgery; but such publications usually have an historian verifying them before publication, so that comes down to NPOV issues. Taking your example above, should a detail from a newly published primary source that has not been vetted by an historian be used? Maybe, if it does not contradict the current historical record; but what if the historical record says that Theodore Thespian was dining with Ann Other Player on the night of the 23 November 1834 at the Railway Inn? Probably not. Let me give you another example: the Liddell Hart Centre for Military Archives. They have in there examples of boxes of archive material that have been catalogued that may or may not have been read, let alone published apart from the catalogue entry. See for example Papers of Rt Hon Sir Frank COOPER (1922-2002) and the box of . To use the archive: "Copies, subject to the condition of the original, may be supplied for research use only. Requests to publish original material should be submitted to the Trustees of the Liddell Hart Centre for Military Archives, attention of the Director for Archive Services." Should such primary sources from archives such as this be allowable if their use introduces a novel historical account into Wikipedia? I think such use would be OR; what do you think? -- PBS (talk) 18:00, 1 July 2013 (UTC)

I don't think that we need to worry about primary sources that "may not have been read, let alone published". It is flatly anti-policy to use any unpublished source. It does not matter if the unpublished source is primary, secondary, or tertiary. You cannot comply with WP:V or WP:NOR by using any unpublished source.

As for published primaries that contradict published secondaries, I think you have to balance all the facts and circumstances. One would normally omit the primary, but in some circumstances, one might mention the fact that some sources disagree. WhatamIdoing (talk) 20:12, 1 July 2013 (UTC)

Excuse my clumsy wording (I assumed in the context of what I had written previously) "may not have been read, let alone published" I mean published in a source other than as a listing in a catalogue. Usually the only way to show that it has been published elsewhere is to cite that publication. Let me give you another example

Historical Manuscripts Commission (1904). Calendar of the manuscripts of the Marquis of Bath, Preserved at Longleat, Wiltshire. Vol. 1. His Majesty's Stationery Office. p. 33.

contains a brief summary of the garrison of Hopton Castle in 1644. But it does not list most of the names. A facsimile of the original document was shown on a British television program called Time Team, and their historian, having studied it, concluded that the majority of the men listed were Welsh. Let us suppose that, prior to the 2010 Time Team program, only the brief summary in the "Calendar of the manuscripts..." had been published. If a Wikipedia editor had been to Longleat, obtained a facsimile of the original, and placed the 21 additional names in the list on Wikipedia, would that be acceptable or would it be a form of OR? -- PBS (talk) 13:52, 4 July 2013 (UTC)

Given that the manuscript is cataloged on a public website, and that any member of the public could travel to see it, I would call it "published". The real question is what you do with it on Wikipedia. If all you do is list the names and say they were members of the garrison (cited to the manuscript itself) ... that would not be OR. The names and the fact that they were members of the garrison are facts that are directly supported by the primary source with no analysis or conclusion involved. However, in order to go any further than that (such as noting that the names are Welsh) would require a secondary source. Now, I don't really see the point of simply listing the names of the garrison without going further (as a reader, I would expect some explanation of why these people are being mentioned in the article) ... so... I would remove the list as being pointless trivia... but not for being OR. Blueboar (talk) 14:19, 4 July 2013 (UTC)

Whether you would delete it because it was trivial is beside the point -- see my example above about Bromsgrove and the escape of Charles II. BTW the Longleat library is only open to "established scholars by appointment" (if that is the site where this document is still stored), so the average reader cannot travel to see it (however, that is a detail). What happens if the catalogue entry, instead of being to the item, is to "a box of documents relating to the 1644 siege of Hopton Castle"? I think that a distinction has to be made between the cataloguing of a primary source and the content of that primary source being published. When the Calendar of the manuscripts of the Marquis of Bath (1904) was published, it contained many copies of original manuscripts, but some entries like this garrison list were summaries, so while the summary has been reliably published, the content of the primary source may not have been. -- PBS (talk) 16:04, 8 July 2013 (UTC)

If the average member of the public is not able to see the document, then that fact is not "a detail", but is a critical fact that tells us that the document in question is definitely not published and that its contents are therefore not usable on Wikipedia.

"Publication" involves making something available to the public, not just to "established scholars". The 1904 catalog is published: you may cite it to support a claim that Longleat has a list of who was inside the castle. The 1644 garrison list itself is unpublished (or was, as of the date you stipulated for this exercise): you may not cite it for anything.

Have you read Wikipedia:Published recently? It already covers this, but if you'd like, we could expand it to specifically name "archived somewhere and only established scholars (or members of the religion, or whatever) are allowed to look at it" as an example of something that is not published. WhatamIdoing (talk) 19:45, 9 July 2013 (UTC)

"carefully"

In the following excerpt from policy, what is meant by "carefully"? Is it referring to the sentence following it, i.e. the sentence beginning with "All interpretive claims..."?

"Secondary or tertiary sources are needed to establish the topic's notability and to avoid novel interpretations of primary sources, though primary sources are permitted if used carefully. All interpretive claims, analyses, or synthetic claims about primary sources must be referenced to a secondary source, rather than to an original analysis of the primary-source material by Wikipedia editors."

--Bob K31416 (talk) 02:19, 24 June 2013 (UTC)

Yes and no because I think one has to assume that the opening paragraph is a brief summary of the rest of the section, and so it also refers to the Primary sources paragraphs below. To understand why there is similar wording used twice one has to look at the history of the policy, picking one old version at random see for example the last version from December 2005, helps with that understanding. It would probably be a good idea to merge any of the details of that line not already covered by the sentence "Do not analyze, synthesize, interpret, or evaluate material found in a primary source yourself; instead, refer to reliable secondary sources that do so." and then remove it. -- PBS (talk) 08:41, 25 June 2013 (UTC)

Re "Yes and no because I think one has to assume that the opening paragraph is a brief summary of the rest of the section, and so it also refers to the Primary sources paragraphs below." — The Primary sources paragraphs appear to give an example and some explanation of what was already mentioned in the sentence following "carefully". ~~So it looks like the sentence following "carefully" correctly describes what is meant by "carefully".~~ --Bob K31416 (talk) 13:47, 25 June 2013 (UTC)

Alternate change in wording related to discussion above

Right now the discussion is about the wording of when the use of primary sources is appropriate. The choice is between saying "though primary sources are permitted if used carefully" and "Reliably published primary sources are permitted if used carefully, and if one is careful to avoid original research".

I think that neither of the two wordings is appropriate. They both state that IF used carefully and avoiding original research THEN primary sources are permitted. This is probably untrue. The two (careful use and avoidance of OR) are prerequisites for use, but their fulfillment is not enough, since many times there will still be problems with their use, mainly related to undue weight, existing secondary sources contradicting the primary, or enough secondary sources to make it redundant. I propose to change it to:

Use of reliable primary sources may occasionally be appropriate if they are used carefully to avoid original research or giving undue weight to them.

--Garrondo (talk) 09:45, 26 June 2013 (UTC)

I think that if we can produce a nuanced definition of the words primary, secondary and tertiary with respect to sources, we'll go a long way towards giving editors the tools they need to make the judgment calls that this topic area requires.—S Marshall T/C 12:29, 26 June 2013 (UTC)

Realistically, we want editors who edit articles, not editors who spend more time reading policies than adding good information to articles (maybe I'm exaggerating a little). So we can't really expect editors to dismiss from their mind the definitions of "primary" and "secondary" they have learned from their studies and occupation in favor of a definition contained in a Wiki policy while they are editing. But looking collectively at the various occupations and academic areas, "primary" and "secondary" are loosely defined. Thus we shouldn't write restrictive policies that would exclude sources that would be secondary in the minds of the group that wrote the policy, but primary in the mind of an editor who wants to use it in an article. Jc3s5h (talk) 12:49, 26 June 2013 (UTC)

I just noticed that trying to explain what "carefully" means can incur the problem of giving a sufficient condition for the use of primary sources that allows violations of other policies and guidelines by not including all other restrictions in the sufficient condition. We should avoid this problem and the problem of the vagueness of "carefully" by wording the paragraph so that it states the requirements of NOR without contradicting the requirements of other policies and guidelines. For those purposes, please consider the following change to the subject paragraph, where additions are underlined and deletions are struck out.

"Wikipedia articles should be based on reliable, published secondary sources and, to a lesser extent, on tertiary sources and primary sources. Secondary or tertiary sources are needed to establish the topic's notability and to avoid novel interpretations of primary sources~~, though primary sources are permitted if used carefully~~. All interpretive claims, analyses, or synthetic claims about primary sources must be referenced to a secondary source, rather than to an original analysis of the primary-source material by Wikipedia editors."

--Bob K31416 (talk) 12:55, 26 June 2013 (UTC)

I disagree with using a qualifier like occasionally. There may be the instance where a page has a primary source three times used for the same reason, but then someone wants to remove 2 of those, because it says occasionally. For natural disasters, most that is available of them is primary news reporting. Secondary sources don't always catch up on the vast amounts of different reports from primary sources, or don't even have those facts discussed. Someone may ask an important question on a wikitalk page, and to find the answer to that, a primary source sometimes has to be used.

The wording to Bob's proposal, while it is clear to understand, it is written blurry. All sources are primary, secondary or tertiary, but it did prioritize them. I think the purpose of it saying articles should be based on secondary and tertiary sources was for notability, and established publications. This suggestion is borderline ok. - Sidelight 12 ^Talk 17:54, 26 June 2013 (UTC)

Please note that the only changes I made to the current version of policy was adding the underlined part, "and primary sources", and deleting the struck out part, "~~, though primary sources are permitted if used carefully~~". --Bob K31416 (talk) 18:33, 26 June 2013 (UTC)

I noted that. It says these three types of sources can be used, which all there is. Its only additional meaning was to prioritize them. I think primary sources were separated from the other two for a reason having to do with established interpretation or a similarly related reason. - Sidelight 12 ^Talk 03:06, 27 June 2013 (UTC)

The changes are minor except for removing the vague term "carefully". It's essentially the same paragraph as the current version of policy but without using the word "carefully".

I'm not sure this addresses your point, since, for example, it wasn't clear to me what your point was in mentioning, "It says these three types of sources can be used, which all there is", and whether whatever you are referring to is a characteristic of the current version in policy too. If you mean that the proposed version implies that all sources fit into these three categories, that is what the current version in policy implies too. Do you disagree with that categorization in the current version of policy? For the rest of your message regarding "prioritize", that's what the current version of policy does too. Do you disagree with that prioritization in the current version of policy? Regarding "primary sources were separated", if you mean they didn't appear in the first sentence in the current version of policy, that was part of the problem since the first sentence of the current version of policy suggested that primary sources should not be used by leaving out mention of them in the first sentence.

Again, the only changes are adding "and primary sources" to the first sentence, and deleting ", though primary sources are permitted if used carefully" from the second sentence. --Bob K31416 (talk) 11:44, 27 June 2013 (UTC)

The proposal said, primary, secondary, and tertiary sources could be used, and that's all that exists. The wording only had value in prioritizing secondary sources over the other two, which is fine.

From the original wording, secondary and tertiary sources had more in common than primary sources, that's why I think they were lumped together. Wikipedia is also a tertiary source so other tertiary sources could be competition to wikipedia, that's why it said articles should rely less on tertiary sources, also they were already complete. It may not matter that primary sources and tertiary sources are on the opposite sides of the spectrum, saying "to a lesser extent" for those two can be used is fine. I noticed what you struck out and added. - Sidelight 12 ^Talk 09:28, 28 June 2013 (UTC)

Thanks. In your first response you wrote, "This suggestion is borderline ok." Do you still have that same opinion? If so, and since you are the only one to comment on my suggested change, it appears that I should wait until there is more support before implementing the change. --Bob K31416 (talk) 11:18, 28 June 2013 (UTC)

It's OK by me. Primary and tertiary sources were separated in the original wording for different reasons. The new proposal removes this, but the proposed definition of how they are used works out. I don't know how important it is, or whether it is important, to keep this reasoning. It may not even be necessary, and my opinion on whether to preserve the old reasoning is not strong. The wording is fine; it just removes the implied reasoning that separated the primary and tertiary.

Based less on tertiary sources (per one reason), based less on primary sources (per a different reason). Primary sources run a higher risk of the editor interpreting it, secondary and tertiary sources are already interpreted. So the wording can be kept as you proposed, and state whatever it was not to interpret on one's own. Secondary sources can also be interpreted by the editor too. To use reliably published of all three types sources (as it is in the proposal), and not to use sources that are of the editor's synthesis, by one definition. Now that I think of it, the proposal seems better. (I may not be back soon to respond) - Sidelight 12 ^Talk 12:16, 28 June 2013 (UTC)

Sidelight, if the secondary sources don't "catch up on" details that someone wants to know, then that's a signal that those particular details are probably unencyclopedic trivia. It is sometimes helpful to read professionally written encyclopedia articles on similar subjects, like this one, to keep some perspective. WhatamIdoing (talk) 06:47, 27 June 2013 (UTC)

Not necessarily. For plant perception articles, it talks about how plants react to stimuli. It was missing that plants could sense. A primary source was the missing link for information to add what allowed the plants to react. The Aurora Borealis had sounds associated with it, and I used a primary source by a university study to mention this in the article. Eventually a secondary source was added, but the primary source had to jumpstart it. The primary source is still the better source; it still explains the phenomena. There are other articles where insight is lacking, and primary studies could give what little evidence there is. There is not always enough effort available towards secondary sources to cover everything. Also there is information on molecules or plants, where there is ten years of research on it, and there is no secondary source to cover this research. From your link, I have an old version of Encyclopædia Britannica that I used to use all the time. If a secondary source lacked for instance (maybe not in this case, but in similar cases) the exact time, epicenter, etc., a primary source would have to be used. - Sidelight 12 ^Talk 09:28, 28 June 2013 (UTC)

If you couldn't find a secondary source that says plants can sense things, then I suggest that you haven't looked very hard. WhatamIdoing (talk) 20:07, 1 July 2013 (UTC)

Statistical operations

I reverted an edit which had the sentence "Summarizations based on statistical methods, however, are original research by synthesis, as they involve the reinterpretation of data." A summary of the data (I believe that is what was meant by the word "Summarizations") is just that, a summary. It can include average (mean), standard deviation, skewness, median and mode. These can all be calculated in a purely mechanical manner. For the record, the expected value is often the mean value. It only becomes an interpretation when I try to explain what these values mean. I believe that the following statement is quite in order: "The average maximum temperature in June MyTown between 1964 and 2013 was 24°C and the standard deviation was 1.5°C." Martinvl (talk) 16:36, 30 June 2013 (UTC)

The problem with the measures average - median - mode is that they are all measures of central tendency that are identical for normally distributed data but NOT for other distributions. In e.g. the Weibull distribution the average has little relevance. Nevertheless by reporting the average it gets meaning in the article. Similar issues exist for SD. In fact I have seen reports where the average gender = 1.49 with an SD =.50 were reported (meaningless) or response times with an average less than 1 SD above zero (implying that about 17% of all response times were negative..... going back in time). Therefore any statistical summary without paying due account of the distribution it was based on makes it likely that other editors will misinterpret those.

For that reason alone I would prefer to err on the safe side and include "summaries based on statistical method" under original research. At least those that do not provide critical reflection on the distribution of the data. Arnoutf (talk) 17:11, 30 June 2013 (UTC)
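Illustrative note on the point about central tendency: a minimal sketch, using a hypothetical right-skewed sample that is not taken from any source in this discussion, of how mean, median and mode can disagree once the data are not symmetric. The sample values and variable names are assumptions made purely for illustration.

```python
import statistics

# Hypothetical right-skewed sample (invented for illustration only).
sample = [1, 1, 2, 2, 2, 3, 4, 5, 9, 21]

mean_value = statistics.mean(sample)      # pulled upward by the long right tail
median_value = statistics.median(sample)  # midpoint of the ordered values
mode_value = statistics.mode(sample)      # most frequent value

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

For symmetric data the three figures would roughly coincide; here they do not, so reporting only one of them already involves a choice.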

Another problem with "The average maximum temperature in June MyTown between 1964 and 2013 was 24°C and the standard deviation was 1.5°C" is the precision, or lack of it. I.e. temperatures, especially extremes and averages, are often given to one decimal point, so 24 seems too imprecise. There are no conventions for standard deviation as it's hardly used but 1.5 seems far too imprecise: if it's accurate to half a degree then the actual value's between 1.25 and 1.75, a range of 40%. Even if it's accurate to 1 decimal place the true value is over a range of over 10%.

Which shows why we need to base such statements on sources: so much goes into doing the calculation, not just what sort of average but the details of the calculation and precision, where numbers are rounded and in what way, etc. If experienced it's easy to make sensible decisions over how to do the calculation yourself but you are making decisions, ones that affect the result, and it is far from purely mechanical, at least if you're doing it properly.--JohnBlackburne^words_deeds 18:07, 1 July 2013 (UTC)

Reply for Martinvl, Arnoutf and JohnBlackburne:

About "Numerical summarizations" or "Treatment of numeric data" (that was not treated at source), based on routine calculations: I added a new section below, for discussion. --Krauss (talk) 18:38, 1 July 2013 (UTC)

About "summarizations based on statistical methods" and the reversion: it involves some expert statistical work, so the expert needs to put that work into a reliable source. Is there a "(statistical) routine calculation" at Wikipedia? Well, what is "routine" or "purely mechanical", based on Wikipedia tradition and history? Are there examples of accepted statistical work by Wikipedians? I think we can adopt tradition as the parameter. --Krauss (talk) 18:38, 1 July 2013 (UTC)
PS: a typical confusion is between the arithmetic mean (valid summarization) and the expected value (mean and standard deviation calculated by a reliable source).

True, arithmetic mean is a calculation which in itself is (more or less) straightforward. That mean is however meaningless to predict expected value (or for anything else) in non-normal distributions indeed ;-) Arnoutf (talk) 18:43, 1 July 2013 (UTC)

Basic descriptive statistics ought to be accepted, so long as all the editors agree that the information is accurate and relevant. Nobody should object to looking at Heights of presidents and presidential candidates of the United States and saying "US Presidents have ranged in height from X to Y", even though range (statistics) is a statistical calculation. Editors aren't supposed to turn their brains off.

If editors at an article don't agree on the accuracy and relevance, then they can have an RFC or discuss it at NORN to resolve the dispute, like any other dispute. WhatamIdoing (talk) 20:05, 1 July 2013 (UTC)

Actually, can we add "Editors aren't supposed to turn their brains off" to the policy? Preferably in large, red letters. That throb with urgency.—S Marshall T/C 20:18, 1 July 2013 (UTC)

I think the words about statistical methods are not intended to prevent mechanical things like calculating averages. They are intended to prevent things like this: (1) applying a statistical test (say a t-test) to determine that US presidents are significantly taller than UK prime ministers. (2) writing that 80% of scholars have some opinion by counting journal articles. Zero^talk 21:53, 1 July 2013 (UTC)

Mmm I think you underestimate how much brains can be turned off. Just today I reviewed a paper submitted to a scientific journal that reported gender (female=1, male=0) with the following summary: Female: Max score=1; Min score=0; Mean=0.51; SD=0.50. If this is the level of statistics that university staff feel OK to submit for publication in a scientific journal, I am worried that the "relevance" of statistics will rapidly become a cause for much heated dispute. So I would be extremely reluctant to allow summary statistics to be reported. It is probably not the calculation that is the problem here, but the interpretation...... (I am not kidding about the example; this kind of mess is seriously submitted to scientific journals. This was far from the only problem, so I advised the editor to reject the paper.) Other examples I have encountered are things like "Response time was on average 5.4 seconds (SD 6.7 seconds)". Euhm, that implies that negative response times occur at less than 1 SD from the average. (OK, these latter numbers are made up, but I have seen such things.) Arnoutf (talk) 18:07, 3 July 2013 (UTC)
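Illustrative note on the two examples above: a rough sketch, under stated assumptions, of why these summaries carry little meaning. For a 0/1 coded variable the standard deviation follows directly from the proportion, so "SD=0.50" adds nothing beyond "Mean=0.51". For the made-up response times, reading the mean and SD as a normal distribution (an assumption introduced here, not something the quoted summary states) would put a sizeable share of values below zero. Variable names are hypothetical.

```python
import math
from statistics import NormalDist

# A 0/1 coded variable with proportion p has mean p and
# population standard deviation sqrt(p * (1 - p)).
p = 0.51
binary_sd = math.sqrt(p * (1 - p))

# Assumption: treat the made-up response times as normally distributed
# with mean 5.4 s and SD 6.7 s, then ask what share falls below zero.
response_times = NormalDist(mu=5.4, sigma=6.7)
share_negative = response_times.cdf(0.0)

print("SD implied by p = 0.51:", round(binary_sd, 4))
print("share of 'negative' response times:", round(share_negative, 3))
```

The second figure is only as meaningful as the normality assumption, which is exactly the interpretive step under discussion.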

There's the question of when it's appropriate to use statistics, and the question of what to do when a source is obviously wrong. I think it's best to treat these two as separate questions.—S Marshall T/C 14:46, 4 July 2013 (UTC)

Proposal to add subsection "Numerical summarizations"

Following the discussions above (#Statistical operations and #Summarizations based on routine calculations), I think that the subsection "Numerical summarizations" of section Routine calculations, or a similar text, can be added. --Krauss (talk) 11:08, 2 July 2013 (UTC)

(TEXT1 OF) Numerical summarizations

Treatment of numeric data is an encyclopedic issue: summarization by sum, average, etc. are necessary expedients, and should not be confused with original research.

Example: totals and subtotals are complements of numeric table presentation. If the source shows (without any summarization) "1+1+1+1", the Wikipedia article can express it with the summarization "1+1+1+1=4". Expressing only the result 4, which the source does not state explicitly, can be a point for discussion.

Summarizations based on statistical methods, however, are original research by synthesis, as they involve the reinterpretation of data. It is common to confuse the arithmetic mean (summarization) with the expected value (mean and standard deviation calculated by a reliable source). In case of doubt (summarization vs. statistical reinterpretation), discuss first.--Krauss (talk) 11:08, 2 July 2013 (UTC)

May I suggest the following (using the word "summaries" rather than "summarizations". "Summarization" is not a UK English word).

I oppose this. "US presidents have ranged in height from 163 to 193 cm" is a "summarization based on statistical methods". The statistical method in question is range (statistics). We do not want to ban this. WhatamIdoing (talk) 19:49, 9 July 2013 (UTC)

(TEXT2 OF) Numerical summaries

The generation of numerical summaries of data using routine techniques such as summation or the calculation of averages, standard deviations and other processes that are standard spreadsheet functions are not "original research". However interpretation of the data using statistical methods is "original research". For example, stating that the average height of a group of 200 people was 180 cm and the standard deviation was 8 cm is not original research, but to make the statement "therefore we can expect 136 people (68%) to have a height of between 172 cm and 188 cm (180 ± 8 cm)" is original research (unless it is being used as an example in an article on how to manipulate statistical data).

In case of doubt (summaries vs. statistical reinterpretation), discuss first.

Martinvl (talk) 16:03, 2 July 2013 (UTC)
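Illustrative note on the height example in TEXT2: a minimal sketch of where the "136 people (68%)" figure comes from. The 68% share only follows if the heights are assumed to be normally distributed; that normal model is the interpretive step, and it is an assumption in this sketch rather than something given by the mean and standard deviation alone. Variable names are hypothetical.

```python
from statistics import NormalDist

group_size = 200

# Assumption: heights follow a normal distribution with the stated
# mean (180 cm) and standard deviation (8 cm).
heights = NormalDist(mu=180, sigma=8)

# Share of the model lying within one standard deviation of the mean.
share_within_one_sd = heights.cdf(188) - heights.cdf(172)

# The figure quoted in the proposal uses the share rounded to 68%.
people_at_68_percent = round(group_size * 0.68)

print("model share within 172-188 cm:", round(share_within_one_sd, 4))
print("people at 68%:", people_at_68_percent)
```

Without the distributional assumption, the mean and standard deviation by themselves do not say how many people fall in that interval.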

(TEXT3 OF) Summarizing numerical data

The generation of numerical summaries of data using routine techniques (with valid routine calculations) such as summation or the calculation of averages, standard deviations and other processes that are standard spreadsheet functions, are not "original research". Example:

quant. A	quant. B	Perc. of A	Diff.	Accum.
20	123	16,26%	0,00%	0
40	234	17,09%	0,83%	0,83%
55	300	18,33%	1,24%	2,07%
115	657	17,23%	1,04%
(without background) Source data
(this background) Calculated by wikipedist
(this background) Summarized by wikipedist
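Illustrative note on the example table: a sketch that recomputes the derived columns from the two source columns. It assumes the third column is quant. A expressed as a percentage of quant. B, which is what the printed figures match. It also shows that the last row's 17,23% corresponds to the mean of the three row percentages, while dividing the column totals gives a slightly different figure, so even this summary involves a choice of method. Variable names are hypothetical.

```python
# Source columns from the example table.
quant_a = [20, 40, 55]
quant_b = [123, 234, 300]

# Derived column: A as a percentage of B (assumed meaning of the third column).
percent = [100 * a / b for a, b in zip(quant_a, quant_b)]

# Derived column: difference between consecutive percentages.
diffs = [later - earlier for earlier, later in zip(percent, percent[1:])]

# Summary row.
total_a = sum(quant_a)
total_b = sum(quant_b)
mean_of_percentages = sum(percent) / len(percent)
mean_diff = sum(diffs) / len(diffs)

# Alternative summary: percentage computed from the column totals.
ratio_of_totals = 100 * total_a / total_b

print([round(x, 2) for x in percent], [round(d, 2) for d in diffs])
print(total_a, total_b, round(mean_of_percentages, 2), round(mean_diff, 2))
print("ratio of totals:", round(ratio_of_totals, 2))
```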

However summarization of the (source) data using statistical methods is original research. Statistical interpretation like expected value creates a new interpretation of truth, so is not "only an encyclopedic synthesis". Below are a few examples where common sense decides whether it is better to avoid the numerical treatment (or to discuss before adding the treatment to the article):

Case	Valid interpretation of source	Looks like original research (need source or discussion)
Source shows data as "`1+1+1+1`"	Wikipedist shows data with the summarization (like adding a translation): "`1+1+1+1=4`"	Wikipedist shows only the summarization: "`4`". A footnote or a comment on the discussion page is recommended when only the result of a (not obvious) summarization is shown.
Arithmetic mean	For data summarization, interpreted as average.	Interpreted as expected value, with mean and standard deviation calculated.
Source shows data as "`0.22 ± 0.01; 0.30 ± 0.03`"	Wikipedist shows some data item as a sample, "`0.22 ± 0.01`", or shows all as an average (in a valid context), "`0.26 ± 0.01`"	Wikipedist rounds or adds decimals: "`0.2 ± 0.01`" or "`0.220`". Or averages without error propagation rules: "`0.26 ± 0.02`"... Or mistakenly uses propagation rules when standard deviation should be used.
Range (statistics)	As in arithmetic, the difference between the largest and smallest values. Same as `MAX(X)-MIN(X)`.	When it has a more complex meaning, using the descriptive statistics interpretation of the concept of range.
Source shows a table or a list with N items	Wikipedist shows both the items and the counted N, or shows only N to summarize the "volume of data" at the source.	Wikipedist uses only N to interpret another measure, for example Relative species abundance.
...there are many other...	...	...

--Krauss (talk) 13:16, 3 July 2013 (UTC), edited 8 July 2013 (UTC)

(TEXT4 OF) Summaries of numerical data

TEXT4-COMMENT: At the risk of being overly inclusive, in my view this covers the main issues.

The generation of numerical summaries of data using routine techniques such as summation or the calculation of averages, standard deviations and other processes that are standard spreadsheet functions are not "original research".

There are two things that must be kept in mind:

Numerical summaries can only be made if they make sense in the context. In practice this means that summaries can only be made:
1. if units of measurement are identical (adding 5 miles to 6 kilometers and arriving at 11 makes no sense)
2. if (social) constructs are operationalized in the same way (e.g. London city is much larger than Athens city, in part since Athens is divided into many independent municipalities, while London is one city; the UK and Greece operationalize cities differently, hence summaries of numbers of inhabitants of cities across the UK and Greece make no sense).
3. if the type of data is appropriate to the chosen operation (e.g. it is possible to calculate average and standard deviation of gender in a population, but the numbers Average gender=0.51 female, SD=0.50 make no sense: the average person is either male or female, not 51% female).
Interpretation of the data using statistical methods is "original research". For example, stating that the average height of a group of 200 people was 180 cm and the standard deviation was 8 cm is not original research, but to make the statement "therefore we can expect 136 people (68%) to have a height of between 172 cm and 188 cm (180 ± 8 cm)" is original research (unless it is being used as an example in an article on how to manipulate statistical data).

In case of doubt (summaries vs. statistical reinterpretation), discuss first.
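Illustrative note on the units condition in TEXT4: a small sketch of the miles-and-kilometers example, showing the meaningless naive sum next to a sum taken after converting to a common unit. The helper name `total_km` is hypothetical; the conversion factor is the international mile, 1.609344 km.

```python
KM_PER_MILE = 1.609344  # international mile, exact by definition


def total_km(miles: float, kilometres: float) -> float:
    """Add a distance in miles to a distance in kilometres, in kilometres."""
    return miles * KM_PER_MILE + kilometres


naive_sum = 5 + 6               # mixes units, so "11" has no meaning
converted_sum = total_km(5, 6)  # both distances expressed in kilometres

print("naive:", naive_sum)
print("converted (km):", round(converted_sum, 3))
```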

TEXT4-COMMENT: Arnoutf (talk) 14:48, 4 July 2013 (UTC)

TEXT4-VOTE: ACCEPTED --Krauss (talk) 15:14, 4 July 2013 (UTC)

Oppose — For the following reasons:

1) Re "The generation of numerical summaries of data using routine techniques such as summation or the calculation of averages, standard deviations and other processes that are standard spreadsheet functions are not "original research". " — I think that unless the result is reasonably obvious to most readers, it should be considered OR. For example, the average of a few numbers would be reasonably obvious to most readers, but not the average of many numbers. I don't think that the standard deviation is reasonably obvious to most readers in any case; in other words, most readers seeing a calculated standard deviation in an article would not have an inkling about whether it is right or wrong. That's why reliable sources are useful so that the reader can see that someone credible has made the calculation, rather than an anonymous contributor to Wikipedia whose credibility is consequently unknown.

2) Re the part: "1. There are two things that must be kept in mind:" — This is an inappropriate digression for this policy page since it instructs (in a questionable way) how to analyze data, rather than how to avoid OR.

3) Re "For example, stating that the average height of a group of 200 people was 180 cm and the standard deviation was 8 cm is not original research" — I'd say that should be considered OR. From the Routine calculations section,

"Basic arithmetic, such as adding numbers, converting units, or calculating a person's age, is allowed provided there is consensus among editors that the calculation is an obvious, correct, and meaningful reflection of the sources."

I think what is meant here is that the result of the calculation is obvious. In the case of 200 people, the average and standard deviation aren't obvious. In the case of a few people, the average would be somewhat obvious but the standard deviation would not.

--Bob K31416 (talk) 17:46, 6 July 2013 (UTC)

Yes, I agree... Please check "TEXT5" below (or "TEXT1", "TEXT3" above), and say if you "Oppose"... I think not, so, we can use TEXT5. You can also edit or create your TEXT-N. --Krauss (talk) 12:21, 8 July 2013 (UTC)

I don't think the other proposals are worthwhile for the reasons I just gave, and/or because of the amount of space they would be using compared to their significance for this policy. You might consider putting your ideas in an essay. See Wikipedia:Wikipedia essays. --Bob K31416 (talk) 15:41, 8 July 2013 (UTC)

Thanks, but an "essay" seems a very hidden thing. Can you help me to coordinate this proposal? There are insufficient votes in the ballot... You and others can "clean/edit" it; I think it does not need more than two paragraphs. TEXT5 is bigger because with an illustration it is simpler to show the point and obtain consensus. --Krauss (talk) 03:20, 9 July 2013 (UTC)

Essays considered useful aren't hidden. Regarding helping you coordinate, I haven't seen anything worthwhile to add to this policy. --Bob K31416 (talk) 13:13, 9 July 2013 (UTC)

Ok, I'll put TEXT5... Please help by not deleting it, and by waiting for someone else to say something or edit there... If they vote against, then I will try an essay. --Krauss (talk) 14:36, 9 July 2013 (UTC)

(TEXT5 OF) Summaries of numerical data

The repeated use of "routine operations" (basic arithmetic), such as summation, "products of sequences", or the calculation of averages, is not original research when used for well-known (and consensual) forms of "numerical synthesis" that the article's reader can interpret as summaries of numerical data. Example:

quant. A	quant. B	Perc. of A	Diff.
20	123	16,3%
40	234	17,1%	0,8%
55	300	18,3%	1,2%
Total: 115	Total: 657	Average: 17,2%	Average: 1,0%
(without background) Source data
(this background) Calculated by wikipedist
(this background) Summarized by wikipedist

The table above illustrates an encyclopedic issue produced with source data and NOTOR Simple calculations. It "translates and synthesizes" the source data, with accuracy and neutral point of view, preserving "the truth" of the source. A "new truth" can be produced by some statistical methods, such as when interpreting an average as an expected value, so in case of doubt (summaries vs. statistical reinterpretation), discuss first.

TEXT5-COMMENT: --Krauss (talk) 12:21, 8 July 2013 (UTC)

The table is complete gibberish. Jc3s5h (talk) 13:18, 8 July 2013 (UTC)

?? Of course, it is an illustration of calculations, not an article. Please be polite, so as not to be ignored. For other readers: there are a lot of misunderstandings here, and the table puts all minds on the same point. --Krauss (talk) 14:50, 9 July 2013 (UTC)

(TEXT5 COMPRESSED)

TEXT-COMMENT: here is a "compressed version" of TEXT5, "because of the amount of space (...) would be using compared to their significance for this policy", as Bob_K31416 pointed out. The new suggestion here is to add only a paragraph, not a new subsection or a table. --Krauss (talk) 15:44, 9 July 2013 (UTC)

Routine calculations do not count as original research. (...)

The recursive use of routine calculations, such as summation, products of sequences, or the calculation of averages, also does not count as original research when interpreted by the article's reader as a summary of numerical data, i.e. when used for well-known (and consensual) forms of "numerical synthesis".
Example: totals and subtotals are complements of numeric table presentation. If the source shows a list (without any summarization) "{1,2,3,1}", the Wikipedia article can express the same list with its summarization "{1,2,3,1} Total 7", if the sum makes sense for the article and for the list's units.

Oppose — I've already addressed types of problems in this proposal in my previous comments,[11] and you said you agreed.[12] --Bob K31416 (talk) 16:41, 9 July 2013 (UTC)

I note that you implemented this proposal 15 minutes before you proposed it.[13] I just deleted it. --Bob K31416 (talk) 17:00, 9 July 2013 (UTC)

About my agreement: I changed the text a lot; please check this compressed version, which reflects my agreement. The main problem was about "statistics interpretation" and the discussions about it, which I removed. I not see any point of opposition, please explain.

PS: I am editing with two browser tabs, so a few minutes make no difference; please put it back so that others can appraise it for a few days. --Krauss (talk) 20:03, 9 July 2013 (UTC)

We don't seem to be communicating well enough to continue this discussion. This Talk page, not the policy page, is the place for displaying proposals. Please do not add any proposals to the policy page without consensus. --Bob K31416 (talk) 20:47, 9 July 2013 (UTC)

Ok, there was a question for you, "I not see any point of opposition, please explain". So, the other question is: how to vote objectively here?!? It is a very simple text here (!); everyone here discusses and comes back to the same place, and nobody is voting on a final text. --Krauss (talk) 13:40, 10 July 2013 (UTC)

Working definition of numerical treatment

Please, if you do not agree about #Summarizations based on routine calculations, show here what you understand by it (valid and not valid):

Routine calculations: ...(if you think not obvious or not consensus here) Your Definition HERE Please...

Summaries of numerical data: ...(if you think not obvious or not consensus here) Your Definition HERE Please...

[ User:Krauss posted the above on 8 July 2013]

The question is whether or not an editor is trying to publish hitherto unpublished research, or whether the editor is genuinely summarising existing information. I do not think it feasible to specify exactly what is and what is not WP:OR. I favour replacing the sentence

"Routine calculations do not count as original research. Basic arithmetic, such as adding numbers, converting units, or calculating a person's age, is allowed provided there is consensus among editors that the calculation is an obvious, correct, and meaningful reflection of the sources."

with

"Routine calculations including but not limited to basic arithmetic, such as adding numbers, converting units, or calculating a person's age, do not count as original research, provided there is consensus among editors that the calculation is an obvious, correct, and meaningful reflection of the sources."

This wording allows any type of summary, provided that the editor concerned is not trying to publish hitherto unpublished research.

Martinvl (talk) 13:42, 8 July 2013 (UTC)

What type of calculations are you trying to include along with basic arithmetic? For example, are you trying to include statistical analysis such as averages, standard deviations, etc. as you proposed in (Text2 OF) Numerical summaries? If so, please see my comments in the section (TEXT4 OF) Summaries of numerical data. --Bob K31416 (talk) 15:17, 8 July 2013 (UTC)

If it is appropriate for the article concerned, yes, but only if there is consensus that it is appropriate. In 99% of articles, it will be inappropriate, but we need to write rules for all 100% of articles, not just 99% of them. Martinvl (talk) 17:26, 9 July 2013 (UTC)

Would the following work for you?

Routine calculations do not count as original research, provided there is consensus among editors that the result of the calculation is obvious, correct, and a meaningful reflection of the sources. Basic arithmetic, such as adding numbers, converting units, or calculating a person's age are some examples of routine calculations.

I incorporated an aspect of your version, "including but not limited to", by using the phrase "are some examples". I kept the number of sentences to two instead of one long sentence. I changed from "the calculation" to "the result of the calculation" to clarify. --Bob K31416 (talk) 20:31, 9 July 2013 (UTC)

And, how about adding this paragraph?

The recursive use of routine calculations, such as summation, products of sequences, or the calculation of averages, also does not count as original research when interpreted by the article's reader as a summary of numerical data, i.e. when used for well-known (and consensual) forms of "numerical synthesis".

It incorporates the basic aspects of "summarizations". --Krauss (talk) 17:23, 10 July 2013 (UTC)

I've already discussed some of the problems with this in previous discussions with you. --Bob K31416 (talk) 22:16, 10 July 2013 (UTC)

Krauss, may I ask what subjects or topics you are used to dealing with? Which calculations are okay depends a lot on the subject matter. WhatamIdoing (talk) 22:51, 10 July 2013 (UTC)

Proposal re introduction to section "Primary, secondary, and tertiary sources"

The introductory paragraph to the section Primary, secondary and tertiary sources currently is

"Wikipedia articles should be based on reliable, published secondary sources and, to a lesser extent, on tertiary sources. Secondary or tertiary sources are needed to establish the topic's notability and to avoid novel interpretations of primary sources, though primary sources are permitted if used carefully. All interpretive claims, analyses, or synthetic claims about primary sources must be referenced to a secondary source, rather than to an original analysis of the primary-source material by Wikipedia editors."

I propose making the following changes indicated by one underlined part and one strikeout part for an addition and a deletion respectively. Also, there is a minor edit of moving the wikilink for primary sources to the preceding sentence.

Wikipedia articles should be based on reliable, published secondary sources and, to a lesser extent, on tertiary sources and primary sources. Secondary or tertiary sources are needed to establish the topic's notability and to avoid novel interpretations of primary sources~~, though primary sources are permitted if used carefully~~. All interpretive claims, analyses, or synthetic claims about primary sources must be referenced to a secondary source, rather than to an original analysis of the primary-source material by Wikipedia editors.

The purpose of the proposed changes is to (1) clarify in the first sentence, rather than later, that primary sources aren't prohibited and (2) remove the term "carefully" which doesn't have a clear meaning, noting that the remaining part of the paragraph summarizes this policy's position on the use of primary sources. The above changes result in the following proposed version.

Wikipedia articles should be based on reliable, published secondary sources and, to a lesser extent, on tertiary sources and primary sources. Secondary or tertiary sources are needed to establish the topic's notability and to avoid novel interpretations of primary sources. All interpretive claims, analyses, or synthetic claims about primary sources must be referenced to a secondary source, rather than to an original analysis of the primary-source material by Wikipedia editors.

Please comment on the above proposal and also indicate Support or Oppose in your comments. Thanks. --Bob K31416 (talk) 12:12, 28 June 2013 (UTC)

support - It removes the implied reasoning that separated primary from tertiary, but that meaning is not lost. It's an improvement to say all three types of sources must be reliably published, and that other types of sources than secondary may be used to a lesser extent. - Sidelight 12 ^Talk 12:21, 28 June 2013 (UTC)
comment – It is better than what it would replace. However, the last sentence has several problems, not the least being that its meaning is unclear. What are "claims about ... sources", and how is that related to drawing conclusions from the content of sources? Also, "must be referenced to a secondary source" is clunky English. Also, referencing claims to a tertiary source is not good? Finally, is it fine to make "interpretive claims, analyses, or synthetic claims" about secondary sources then? Surely the purpose of the rule is to avoid OR, including SYNTH, regardless of where the raw material for the OR comes from. Zero^talk 13:34, 28 June 2013 (UTC)

The proposal does not change anything in the last sentence. Please note that the only changes proposed are the one part indicated by underline and the one part indicated by strikeout. Changes you would like to see for the last sentence are beyond the scope of the present proposal and can be the subject of a future proposal after this proposal is settled. --Bob K31416 (talk) 14:09, 28 June 2013 (UTC)

Tertiary sources are okay, except that citing them is similar to Britannica referencing a competing encyclopedia. This is not exactly the case with Wikipedia, since its goal is different from that of traditional encyclopedias. Wikipedia strives to be different from, and less reliant on, other encyclopedias. Note that this is not the case with all tertiary sources, since some of them are textbooks rather than encyclopedias. Also, Wikipedia:What_SYNTH_is_not#SYNTH is not unpublishably unoriginal only says that original research is not allowed by the wiki editor, but it is allowed by the published source. - Sidelight 12 ^Talk 07:45, 23 July 2013 (UTC)

Transferring consolidated discussion to an essay

In the context of "Numerical summarizations", as suggested by Bob_K31416 at TEXT4, I did my homework, starting an essay: Wikipedia:About Valid Routine Calculations. All here are invited to complete/correct/discuss/etc. the essay... And perhaps return here with a consensus. --Krauss (talk) 21:33, 14 July 2013 (UTC)

I think this is already covered in the essay, wp:what SYNTH is not. - Sidelight 12 ^Talk 06:00, 21 July 2013 (UTC)

Yes, I added the item SYNTH is not numerical summarization, but I do not see on that page, or in any other article, any "in-depth characterization" of the problem discussed and not resolved here...
PS: if you understand that the problem is solved, please explain why this change was made with no explicit consensus, and why the suggested change (adding "The recursive use of routine calculations, such ...") needs a new essay and a lot of "more discussion". --Krauss (talk) 15:14, 22 July 2013 (UTC)

I prefer the previous version. They are almost the same, but the newer wording puts more emphasis on consensus, allowing more variation in what is allowed, rather than plainly stating that routine calculations are allowed. I thought about commenting on that, but found it still to be OK. (I mistakenly thought this edit was made to the new essay.)

I gathered that basic calculations were allowed from the section SYNTH is not ubiquitous. Ok, the new essay does describe it better. The essay What Synth is not is a lifesaver for providing the grounds to allow basic calculations and the new essay among other things. - - Sidelight 12 ^Talk 07:27, 23 July 2013 (UTC)

Propose change to footnote on book reviews

There is currently a footnote (#7) that includes this:

Avoid using book reviews as reliable sources for the topics covered in the book; a book review is intended to be an independent review of the book, the author and related writing issues than be considered a secondary source for the topics covered within the book.

To start with, it isn't English (probably should be "rather than") and I will fix that regardless. However I'm raising it here because academic book reviews are frequently written by reviewers expert in their own right whose words can certainly be taken as reliable. It is perfectly normal for such reviews to contain information on the topic from the reviewer's point of view, not relying on or even necessarily agreeing with the book under review. So I propose this modification:

Avoid using book reviews as reliable sources for the topics covered in the book; a book review is intended to be an independent review of the book, the author and related writing issues, rather than be considered a secondary source for the topics covered within the book. Exceptions to this can arise when the reviewer is an acknowledged expert on the topic.

Comments? Zero^talk 00:48, 29 July 2013 (UTC)

I suspect you're motivated by some situation you encountered. If so, could you share that example?

Regarding the rest of footnote 7, it seems like it should be moved to the article Book review and replaced with a wikilink to that article. --Bob K31416 (talk) 01:44, 29 July 2013 (UTC)

If I was motivated by an example I would resist sharing it, since the discussion would be diverted to arguing the merits of the case, and hard cases make bad law. However there is no example in this case; I was just reading the policy and noticed this issue. In my field of editing (history) it is quite common for book reviewers to be more famous than the book authors, and I don't see why their words should be excluded. Zero^talk 09:43, 29 July 2013 (UTC)

I tend to agree, in fact to the extent that I think the footnote should be deleted. I don't see why book reviews aren't evaluated as sources just like any other RS, without the policy advising editors to avoid them. See Wikipedia:Avoid instruction creep. --Bob K31416 (talk) 12:46, 29 July 2013 (UTC)

I looked into the history of this footnote and it seems that the part about "Avoid using book reviews..." was put in with this edit[14] without mention in the edit summary and without discussion on the talk page. --Bob K31416 (talk) 13:16, 29 July 2013 (UTC)

I think there is a valid point that is being made. Namely, when a review simply reports something from the book, like "The book says that X is true", it would be better that we cite the book for X (after looking at the book!) rather than citing the reviewer as citing the book. But why is this in a page about Original Research? It belongs somewhere else, perhaps in WP:RS. Zero^talk 14:56, 29 July 2013 (UTC)

Our rule is to WP:SAYWHEREYOUGOTIT, even if that means citing the second-hand source. I can imagine book reviews being abused, but the compelling point for me is "why is this in a page about Original Research?" Perhaps it could be offered up at WT:RS for possible inclusion there. WhatamIdoing (talk) 01:47, 6 August 2013 (UTC)

OR as applied to figures

The subject of illustrations does not come up in WP:NOR. Shouldn't there be some guideline provided here?

It is obvious that figures are used in discussions in sources and throughout WP. A WP figure has to avoid copyright restrictions, and so cannot be a straight copy of a published figure in most cases. How far from a published figure can a WP figure stray without becoming OR?

I'd suggest the problem is even larger than this, as a figure often can help to illustrate text, and often a completely new image is necessary. It would seem a reasonable proposal governing OR in figures might be the following:

Figures can be used in WP to illustrate text, and these figures may be completely original providing they clearly demonstrate properly sourced text, text that in itself clearly is not original research. In the event of some question about the felicity of the figure to the text, a Talk-page discussion of the comparisons between the figure and the text should be undertaken, and if necessary the relation of the figure to the text should be made clear in the text, or the figure modified to more clearly represent the text.

Any suggestions? Brews ohare (talk) 13:57, 5 August 2013 (UTC)

Apparently I missed this section of WP:OR that refers to original images. Perhaps some changes in wording are advisable? Brews ohare (talk) 14:21, 5 August 2013 (UTC)

You don't mention synthesis, Brews, which is frequently a problem with figures used in sources, especially when you are attempting to combine a picture on a PowerPoint slide set with one from a different author in an academic textbook. There is also an OR element when you start to draw conclusions in the illustration that are not necessarily there in the original. ----Snowded ^TALK 14:05, 5 August 2013 (UTC)

Snowded: I think it is the correct representation of properly sourced text that is the basic question. It is not a matter of whether an original figure happens to be a synthesis of already published figures, but whether the new figure is an accurate demonstration of the text. Brews ohare (talk) 14:21, 5 August 2013 (UTC)

Then it's very difficult to see how you avoid original research. You are interpreting the text into a diagram, something which is an inevitable simplification in the first place. ----Snowded ^TALK 14:25, 5 August 2013 (UTC)

Snowded: A figure is a translation of words into images. Obviously there is artistic license involved, and there are many ways to do it. That freedom does not become OR so long as the figure does not distort the text it illustrates (provided that text is itself not OR). I fail to see why a figure must be a simplification, and even that is not objectionable if it is not an oversimplification. Brews ohare (talk) 14:53, 5 August 2013 (UTC)

When we use the word "figures"... Are we talking about presenting data (things like graphs, tables and charts)? Or are we talking about images (user created drawings, schematics, maps, etc.)? Presenting data in pictorial form is very tricky to do correctly... it can be done, but the likelihood of introducing OR (either intentionally or inadvertently) is very high. If we say anything about it, we should include a strong caution. Blueboar (talk) 15:16, 5 August 2013 (UTC)

Agree, in the case Brews is thinking of he started by trying to integrate two pictures and is now arguing that he can summarise several sources into a picture. The danger of OR and synthesis is just too high ----Snowded ^TALK 15:19, 5 August 2013 (UTC)

The particular figure leading to this discussion is found here. It is a flow chart showing the relation between three technical terms used in the text of a WP article. It is not about 'data'. Snowded is confused by the factors involved in the genesis of the new diagram, leading him to ignore the purpose of the figure in the WP article, namely, to explain terminology. Brews ohare (talk) 16:36, 5 August 2013 (UTC)

Have you noticed a pattern yet, Brews? Every editor who has the temerity to disagree with you either has not read the material properly or is confused. Maybe it explains the block and subject ban record, all those people who are confused .... ----Snowded ^TALK 21:51, 5 August 2013 (UTC)

The beginning portion of WP:OR referring to images reads as follows:

"Because of copyright laws in a number of countries, there are relatively few images available for use on Wikipedia. Editors are therefore encouraged to upload their own images, releasing them under the GFDL, CC-BY-SA, or other free licenses. Original images created by a Wikipedian are not considered original research, so long as they do not illustrate or introduce unpublished ideas or arguments, the core reason behind the NOR policy."

My understanding of this wording is that the exemption of a figure from OR is granted provided the figure truly depicts the content of the WP text it illustrates, and that text has been deemed to be not OR. Creative spark can be found in many places, and artistic license is fine, as long as fidelity to the text is maintained.

It would seem some additional language is needed to get the italicized portion of the policy across. For example, editor Snowded is found to apply a different criterion, namely acceptance based upon the genesis of a figure, rather than its ability to illustrate the text. Snowded says that, because of the way a figure was formed ("by trying to integrate two pictures"), the attempt is being made to "summarize several sources into a picture. The danger of OR and synthesis is just too high..."

This focus upon how a figure happens to be arrived at has nothing to do with whether its final form fits the WP text it attempts to illustrate. Assessment of the figure is based upon its congruence with the text, not upon how it came to be.

Some rewording could emphasize that there is no supposition that a figure in WP must have some connection to a published figure. The only criterion is that it represent faithfully the content of WP text known to be not OR itself. Brews ohare (talk) 23:52, 5 August 2013 (UTC)

From the section Original images of this policy,

"Original images created by a Wikipedian are not considered original research, so long as they do not illustrate or introduce unpublished ideas or arguments, the core reason behind the NOR policy."

What are the unpublished ideas or arguments that are illustrated or introduced by the image in question? --Bob K31416 (talk) 00:27, 6 August 2013 (UTC)

From a very brief look, Snowded's primary concerns is that it combines elements from two unrelated sources.

Snowded, if you'd like an example of translating words into a figure without introducing even a small risk of OR, look at File:Autorecessive.svg. It is even possible to combine several sources to produce one figure. For example, consider a pie chart that shows the size of one US state (given in source A) as a proportion of the size of the entire US (given in source B).

But in this particular instance, I think that the question should be handled at the talk page's RFC or, failing that, at NORN. I don't think we benefit from changing our policy over this. WhatamIdoing (talk) 01:59, 6 August 2013 (UTC)

Re "From a very brief look, Snowded's primary concerns is that it combines elements from two unrelated sources." — This case might bring up a point re Synth vs allowed combining that might need clarification: combining elements from two unrelated sources is Synth only if there is a new conclusion that doesn't explicitly appear in any of the sources. If this were clearer, perhaps Snowded wouldn't be concerned, if this is the cause of his concern. --Bob K31416 (talk) 02:55, 6 August 2013 (UTC)

Bob: I think you have exactly the point that needs more attention. Brews ohare (talk) 05:43, 6 August 2013 (UTC)

File:Autorecessive.svg is a good example of something which can be verified with little ambiguity against the text. In the case concerned it is difficult to do so, as the use of language etc. differs in the material, and to my mind it's synthesis to combine them. That said, Brews originally argued a combination of two documents, but then shifted to saying it summarised the text, so the ground is shifting. It does however belong as a conversation on the talk page of the article concerned. This is the second time now that Brews has abstracted a content dispute on one page into a request for a change of policy. ----Snowded ^TALK 06:28, 6 August 2013 (UTC)

It would be helpful if you could specify the new conclusion that is illustrated by the image. --Bob K31416 (talk) 13:22, 6 August 2013 (UTC)

I already did that Bob, on the talk page of the article here ----Snowded ^TALK 13:54, 6 August 2013 (UTC)

Snowded: Your linked criticism of a WP figure is based upon its difference from two published figures within two published sources concerned with their own agendas. Those published figures inspired the form of the WP figure, but their role in their particular sources does not concern the evaluation of the WP figure, which should be based upon its own context within the WP article. In short, does it reflect the WP text it illustrates?

As said by WP:OR in its discussion of original figures, "Original images created by a Wikipedian are not considered original research, so long as they do not illustrate or introduce unpublished ideas or arguments."

Snowded, your proposed process labeling a WP figure as 'synthesis' and OR is not according to present policy. Snowded, your incorrect understanding of this policy simply emphasizes the need for WP:OR to specifically state beyond all doubt and misinterpretation, that a WP figure is to be judged upon its accuracy in conveying the WP text it is meant to illustrate. Brews ohare (talk) 14:19, 6 August 2013 (UTC)

As I said, Brews, you are shifting your ground. You started referencing those two pictures, then you shifted to a summary of the sources. My criticism applies to both. You consistently don't get synthesis (and seem not to have learnt over 4 RfCs); the fact that you can find some of the same or familiar words does not entitle you just to sweep them all up into a picture of your own creation. Of course it can only be my failure to understand policy that leads me to disagree with you, or maybe I didn't read the text this time? Never sure what sin editors who disagree with you have committed, so it's difficult to keep track. ----Snowded ^TALK 14:28, 6 August 2013 (UTC)

The ground is simply this: even if nothing like a particular WP figure exists anywhere else at all, it is acceptable on WP so long as it faithfully depicts text that is not OR. Snowded, your attempt to confuse this discussion with extraneous other matters is irrelevant. Brews ohare (talk) 14:40, 6 August 2013 (UTC)

This comment by Snowded indicates a complete failure to understand Wikipedia:No_original_research#Original_images so far as I interpret this policy as I have outlined above. Either Snowded is out to lunch or I am. In either case, the policy needs to be clarified. Brews ohare (talk) 15:09, 6 August 2013 (UTC).

Brews, can you point to a single diff where you have ever accepted that you were wrong on a subject? Or even that another editor might have a point? So far it seems that other editors don't read the material, don't understand the material, don't understand policy, reference extraneous material (your extensive and unrepentant history of blocks and topic bans), are confusing the subject etc. etc. You have opened up an RfC, let it run for God's sake. ----Snowded ^TALK 15:26, 6 August 2013 (UTC)

Again, this is a content dispute involving a single article. I have no particular opinion on the merits of the image, but Snowded's position is neither unreasonable nor unusual. You need to take this dispute back to an appropriate forum. There is no good reason to change our policy here. WhatamIdoing (talk) 15:46, 6 August 2013 (UTC)

WhatamIdoing: The issue is general, although Snowded wishes to embroil it in specifics regarding this RfC. I have formulated the general issue in the following thread. Brews ohare (talk) 15:54, 6 August 2013 (UTC)

Regarding Snowded's response to my last message, it looks like the link he gave attempts to show that there is a new conclusion in the image. I couldn't tell from a quick look at it whether that is correct or not, but it seems like a sufficiently reasonable objection to discuss. I also looked at the discussion following the link's message of 14:02, 5 August 2013, in the section Removal of figure.

In any case, it appears that Snowded understands the basic idea of the policy regarding an image violating NOR if it illustrates a new conclusion. To settle this article-specific issue, which may not be easy, would seem to require someone who is interested in getting into the details of the specific subject, and interested editors might be invited to participate on the article's talk page, e.g. with an RfC.

A discussion between only the two editors doesn't seem to be making progress towards agreement, and seems pointless. It may be that there are no other editors who wish to get involved. For situations like this in the future, the two editors might try to reach some general understanding about what to do when they disagree on an issue and no other editors are interested in getting involved. --Bob K31416 (talk) 18:35, 6 August 2013 (UTC)

Sound advice, but if you look at Brews' response to two other editors below, then you can see that they "are wrong" :-) Brews has got to learn to work with other editors and accept that from time to time he may not gain agreement and has to move on. If not then I think we are going to end up with yet another RfC on Brews ----Snowded ^TALK 19:01, 6 August 2013 (UTC)

Hi Bob: How Snowded thinks is a bit of a mystery, as he will never say specifically what is wrong with that particular figure, but it doesn't affect the need for a clearer statement of general policy, insofar as the comments by WhatamIdoing indicate some confusion over what policy is, and Snowded has explicitly contradicted policy by saying figures are subject to sourcing requirements just like text, not just a requirement for accurate illustration of accepted text. Brews ohare (talk) 19:24, 6 August 2013 (UTC)

Do you ever bother to read what people say, Brews? I said illustrations were subject to the same evidence rules as anything else. ----Snowded ^TALK 19:32, 6 August 2013 (UTC)

Snowded: That is exactly what I said you said, and it is not correct. WP images can be completely original and do not need to be sourced if they portray accepted text accurately. Brews ohare (talk) 21:39, 6 August 2013 (UTC)

The rules of Wikipedia, Brews, do not say that everything has to be quoted (although you seem to take that view on many articles), but they do say synthesis and original research are not allowed. If you spend a little time and check out my response on the File:Autorecessive.svg example above, it's pretty clear. So yes, you can create a picture, but the material in that must not be original research or synthesis. I have explained this on the talk page. Just having the same words in different sources is not enough; you have to show that the way your diagram puts those words together is not adding something not present in the original. As I explained here, that is what I think you have done. This should really all be moved to the talk page of the article; you are currently in a minority of one (again) in wanting a policy change ----Snowded ^TALK 21:47, 6 August 2013 (UTC)

Snowded: Of course figures cannot make points that are synthesis or original research. However, the point that you cannot accept is that if the figure faithfully illustrates material in a text that does not exhibit these traits, then the figure is deemed not to exhibit these traits either. Therefore, a discussion of whether a figure has acceptable content first is directed at the text it illustrates, not at the figure, and with that discussion of the text out of the way, the criticism leveled at the figure itself is whether it accurately portrays the text. Brews ohare (talk) 01:40, 7 August 2013 (UTC)

Brews, I worry about you from time to time. If the figure does not exhibit OR or Synthesis then it is eligible; it might still be ruled out on grounds of taste, usefulness or whatever. Please deal with the objections people raise, rather than your attempts to restate them, which in my experience have always been perversions of the original ----Snowded ^TALK 03:03, 7 August 2013 (UTC)

Snowded: The objections you have raised historically against figures are the policies concerning OR and SYN. However, it is text that is subject to these policy considerations. Figures illustrating such text are not subject to these policies. Instead these figures are subject to fidelity to the texts they illuminate. Naturally, if the figure is faithful to the text and the text is free from OR and SYN, then the figure is acceptable. Do you agree? Brews ohare (talk) 05:23, 7 August 2013 (UTC)

But it isn't Brews and the figure is subject to those policies like everything else. I'm giving up trying to explain this to you. It belongs on the talk page of the article anyway ----Snowded ^TALK 05:50, 7 August 2013 (UTC)

Snowded: In my experience you have not tried to explain anything. For example, in the present case, an explanation could take the form of suggesting why the proposed criterion of faithfulness of a figure to the text it explains wouldn't work, or suggesting why some additional requirements might be necessary, or outlining how OR and SYN are to be applied to a figure when a figure is entirely original work as allowed by Wikipedia:No_original_research#Original_images. Instead, we have simply the bald reassertion that OR and SYN are required of figures just like "anything else". In short, no thought. Brews ohare (talk) 13:31, 7 August 2013 (UTC)

WP:AGF, Brews, not to mention WP:NPA. Specific reasons were given here; you just don't agree. You really should not define 'thought' as 'mental processes that result in the belief that Brews is right'; it can (and has) taken you to some bad places over the years. ----Snowded ^TALK 13:37, 7 August 2013 (UTC)

Goodbye. Brews ohare (talk) 14:07, 7 August 2013 (UTC)