
Talk:Bayesian probability/Archive 2


Gillies on only fair betting quotient

I have a suggestion: first of all, could Logicus please provide a quote from Gillies about the betting quotient and universal hypotheses. (Sorry if you already did. This page is rather long.) The quote is probably just for us to look at on this talk page. Then, my suggestion is that we have the article say something like "Gillies states that ...", so that it's attributed to Gillies, not to Wikipedia. I still have a number of problems with the statement [1] about the universal hypothesis. How does one win a bet -- by convincing a human judge of the truth of a proposition? Or by it being objectively true? Or what? and, for whom is it fair? It seems to me that it's unfair for the other player, who has to put some money down but may never get it back. Those are on top of the concerns I've raised already, which I'm not convinced have been adequately addressed. --Coppertwig 23:10, 31 August 2007 (UTC)

Logicus to Coppertwig: I don't understand your problem here. There is no attribution to Wikipedia; rather, Wikipedia is just reporting a recognised problem in the literature, and gives references for such. And your own original-research views or problems with this problem are simply not relevant here. There are far more dubious, wholly unsourced claims in this article to pick on than Logicus's genuine additions. But I will look for a quote for you. By the way, there is also the addition about the problem of fallibilism to come. --Logicus 14:53, 1 September 2007 (UTC)

I appreciate your taking the time to look for a quote for me. If you wish to challenge unsourced claims, I believe the usual method is to put {{fact}} after them in the article, which makes a footnote like this [citation needed], and then wait a period of time -- I think several weeks at least is usually expected -- and then delete them if nobody has provided sources. --Coppertwig 17:21, 1 September 2007 (UTC)
Coppertwig, if you really must put something from Gillies in the article, I suggest the following footnote: "e.g. see Gillies 2000, p55: "My own view is that betting does give a reasonable measure of the strength of a belief in many cases, but not in all. In particular, betting cannot be used to measure the strength of someone's belief in a universal scientific law or theory." " as a footnote to the very first sentence of my addition.--Logicus 18:09, 4 September 2007 (UTC)

Logicus, why is Gillies' personal view on betting odds, cited from a 2000 publication, relevant to the history of Bayesian probability? If this objection is so critical to Bayesianism's history, surely there would be some citation in an older publication by a more prominent author. -- 158.83.15.85 14:33, 18 September 2007 (UTC)

Logicus to 158.83.15.85: Thank you for this anonymous mistaken comment. Please note that contrary to your assumption, I have never claimed Gillies' views on betting odds are relevant to "the history of Bayesian probability" nor to "Bayesianism's history", nor that his objection is critical to Bayesianism's history as you variously suggest. The point at issue here is neither about 'Bayesian probability' nor about 'Bayesianism', nor indeed about their histories. Rather it is ONLY about the Bayesian PHILOSOPHY OF SCIENCE, that is, a specific APPLICATION of Bayesian epistemic probability to the specific domain of scientists' beliefs and reasoning about scientific propositions, an application listed amongst other such specific applications, such as 'e-mail spam filtering', in the section of the article called 'Applications'. As I understand it, this application to philosophy of science is a relatively novel application of Bayesianism that only really took off in the 1990s. SO I REPEAT, contrary to what you and some other Wiki editors such as Coppertwig, Jefferys, BenE etc sometimes mistakenly presume, the subject at issue here is not the much older topic of 'Bayesian probability', that is, a specific interpretation of the meaning of 'probability' in the probability calculus to mean 'strength of belief that a proposition is true', but rather only the specific application of that general interpretation of probability and the probability calculus to the domain of scientists' beliefs about the propositions of science, to try to explain such things as their acceptance and rejection of them.
As for Gillies, his view that the probability of all universal laws is zero is relevant to Bayesian PHILOSOPHY OF SCIENCE at least because (i) he is a professional academic philosopher of science (ii) he was a Cambridge double first maths wrangler, (iii) was a PhD student in the philosophy of probability of one of the most brilliant philosophers of science and of maths of the 20th century, Imre Lakatos (who also incidentally maintained the probability of all scientific laws is zero), (iv) is an ex President of the British Society for the Philosophy of Science and (v) part of his 2000 book on 'Philosophical theories of probability' does deal with Bayesian philosophy of science, which only took off in the previous decade.
As for your surmise, when appropriately re-interpreted, that if this objection is critical to Bayesian philosophy of science then "surely there would be some citation in an older publication by a more prominent author.", it is indeed correct, as you might have discovered yourself had you bothered to read the article's footnote references for this objection and the literature listed in the article's References, and done some thinking before putting pen to paper, or rather fingers to keyboard. For the saleswise more prominent authors Howson & Urbach discussed it on pages 72 and 263-4 of their 1989 'Scientific Reasoning: The Bayesian Approach', listed in the article's References, as mentioned in the article's footnote. Gillies' specific views on this objection were put into the article simply because it was specifically his views Coppertwig requested I provide, as you will see from the heading of this particular Talk section, although I have no idea why Coppertwig picked on Gillies. And nor, I suspect, does Coppertwig.
If you wish to trawl through the literature in the article's References for further earlier citations of this objection, please feel free to do so. But please note that whether or not particular authors agree or disagree about the validity of this objection is irrelevant to the issue of correcting this article's highly biased pro-Bayesian viewpoint, which mentions hardly any of the many problems of Bayesianism and Bayesian philosophy of science, by at least mentioning some of them, and thus giving it a somewhat more NPOV. --Logicus 14:46, 23 September 2007 (UTC)
I can't easily look through the literature. For example, I checked the local public library for the Gillies book and it doesn't have it. However, I have several requests. I think you misunderstood an earlier request I made, and that request still stands. I was not asking for a quote from Gillies for the purpose of inserting the quote into the article. Rather, because you want to insert into the article a statement about the only fair betting quotient in certain circumstances being zero, and since I can't easily check the reference you attached to the statement, I asked you to present a quote on this talk page as a substitute for me looking in the book myself. Different people interpret things differently, so I wanted to check that whatever in that book you're interpreting as supporting that statement would also be interpreted by myself (and others) as supporting that statement. Although you've provided a quote from Gillies, it does not, in my opinion, indicate that Gillies believes that the only fair betting quotient in some particular situation is zero; for example, it does not include the words "fair" or "zero" or synonyms of them in my opinion.
I would like to ask you, Logicus, to do six things: each of the three following requests applied to each of the two following statements you want to insert into the article: the statement about the only fair betting quotient in certain circumstances being zero, and the statement about the probability of scientific laws being zero. For each of these two, would you please:
  1. Since you presumably have the books at hand and I may not easily be able to access them, would you please provide on this talk page for the convenience of myself and other editors here a quote from the book that makes the statement so that we can verify that, in our opinion, the statement made in the book is essentially the same as the statement being made here. (I'm not proposing that the quote be included in the article. Possibly I or someone else might later propose including the quote in the article, but that is not the purpose of this request.)
  2. Please explain the relevance of the statement to "Bayesian probability", the topic of this article.
  3. When inserting the statement in the article, rather than asserting the statement, assert that a certain book has asserted it, perhaps like this: "Gillies (2000) states that ...".
Thank you for considering my requests. Of course you don't have to do them, but doing them successfully may lessen my opposition to the insertion of those statements.
Since Gillies is not Bayes, I wonder what the relevance of Gillies' opinion is here. Similarly for Popper. Perhaps statements by these people can only be considered relevant to this article if they mention "Bayes" or "Bayesian" in the context. --Coppertwig 15:32, 23 September 2007 (UTC)
I'm not convinced that the idea about the only fair betting quotient being zero is a previously-published idea, so I've edited it. Also, it may only be Gillies' opinion that there is a problem, so this page should not state that there is a problem -- maybe state that Gillies says there is a problem, or (as in my edit) that there may be a problem. Perhaps some of the sentences that followed it also need to be deleted or modified for similar reasons. --Coppertwig 01:14, 1 October 2007 (UTC)
I'm with you here.--BenE 14:16, 1 October 2007 (UTC)
Logicus to Coppertwig of 1 October: Would you please restore the original text before your edit of 1 October, and instead post your proposed change and its justification here on the Talk page first for critical discussion before any implementation. Also note your edit introduces a fatal omission. Let us see if you can discover what it is by yourself.
Also note that your view that the claim that the only fair betting quotient is zero was not published before Gillies is mistaken. It was also in Howson & Urbach 1989, the basic teaching text in 'Bayesian' philosophy of science listed in the article's references, as I pointed out. Please pay attention and stop intervening in issues in a literature and subject, philosophy of science, you are patently unfamiliar with and not competent in. Your personal problems of lack of access to the basic literature are surely sufficient to debar you from commenting. Why should anybody suffer the burden of convincing you of anything ? Are you an employed editor of Wikipedia ?
However, I should say I am not wholly opposed to the spirit of your proposal of only saying some people say there is a problem, and will consider a modification. --Logicus 12:43, 5 October 2007 (UTC)
As it seems that Gillies is not a Bayesian, and that you are only quoting a general encyclopaedia, this literature has little authority on the subject. I don't have the encyclopaedia here, but one of the quotes you put on the page seems patently false:
See p50-1, Gillies 2000 "The subjective theory of probability was discovered independently and at about the same time by Frank Ramsey in Cambridge and Bruno de Finetti in Italy."
People were debating the 'subjective' theory well before de Finetti; see here. This doesn't give much credibility to Gillies. —Preceding unsigned comment added by BenE (talkcontribs) 14:55, 5 October 2007 (UTC)

I removed the following, which is in any case too specific. It might go on a page about De Finetti since it only applies to his specific flavor of Bayesianism.

"This problem of the Bayesian philosophy of probability becomes a fundamental problem for the Bayesian philosophy of science that scientific reasoning is subjective Bayesian probabilist, which thereby seeks to reduce scientific method to gambling, but some regard it as solvable.[1] But it is also noteworthy that by 1981 De Finetti himself came to reject the betting conception of probability.[2]"--BenE 15:45, 5 October 2007 (UTC)

Logicus to Coppertwig of 1 October: How about the following quote as evidence of recognition in the literature that the positive undecidability of universal hypotheses poses a fundamental problem for the degree of belief as betting-quotients interpretation of subjective probability ?

"...critics [of the standard Dutch Book argument] have not been slow to point out that the postulate that degrees of belief entail willingness to bet at the odds based on them is vulnerable to some telling objections. One is that there are hypotheses for which the wise choice of odds bears no relation to your real degree of belief: if 'h' is an unrestricted universal hypothesis over an infinite domain, for example, then while it may in certain circumstances be possible to falsify 'h', it is not possible to verify it. Thus the only sensible practical betting quotient to nominate on 'h' is 0; for you could never gain anything if your betting quotient was positive and 'h' was true, whilst you would lose if 'h' turned out to be false. Yet you might well believe that 'h' stands a non-zero chance of being true. " [p90. Howson & Urbach 1993]

Please don't come back with the standard positivist rap about omniscient oracles as supposedly solving this problem, which is logically irrelevant to the issue of whether it is recognised in the literature as a fundamental problem requiring solution, whether or not you personally believe such alleged solutions are valid or invalid.--Logicus 19:01, 13 October 2007 (UTC)
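For readers following the betting argument quoted above, here is a minimal numeric sketch in Python of the point Howson & Urbach make (the function name, stake and betting quotient are purely illustrative and not from any of the cited sources):

```python
# Sketch of the quoted point: a bet ON an unrestricted universal hypothesis h
# can be settled against the bettor (h falsified) but never in their favour
# (h verified), so any positive betting quotient can only lose or break even.

def payoff_to_believer(betting_quotient, stake, outcome):
    """Payoff to the person betting ON h at the given betting quotient.

    outcome: 'falsified' if a counterexample to h is observed,
             'undecided' if no counterexample has (yet) been observed.
    A winning settlement (h verified) can never occur for such an h.
    """
    if outcome == 'falsified':
        return -betting_quotient * stake   # the stake paid in is lost
    return 0.0                             # the bet never settles in h's favour

q, stake = 0.3, 100.0
print(payoff_to_believer(q, stake, 'falsified'))   # -30.0: a possible loss
print(payoff_to_believer(q, stake, 'undecided'))   #   0.0: never a gain
# With any q > 0 the believer can only lose or break even, so the only
# sensible practical betting quotient is 0, even if their actual degree
# of belief in h is positive.
```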

The problem of fallibilist philosophy of science for epistemic Bayesian probabilist philosophy of science

The following is proposed as an addition to the second paragraph of the ‘Applications’ section of the article. Its relative length is apparently required to overcome the difficulty some Wikipedia editors have in understanding the point at issue and in meeting their objections.

However a fundamental problem for all probabilist philosophy of science is posed by radical fallibilist philosophy of science which maintains all scientific laws are false and will be refuted and replaced by hopefully better false laws that will in turn be refuted and revised again and so on ad infinitum in a potentially endless series of false laws.F1 For insofar as scientists believe this radical fallibilist philosophy, as it seems most do nowadays F2, then according to the canons of the subjectivist Bayesian method according to which probabilities are assigned to propositions in proportion to strength of belief in their truth, they must therefore assign zero prior probability to all scientific laws since they believe them to be false.F3 But this would render probabilist epistemology practically inoperable, since by Bayes' Theorem all evidentially posterior probabilities must therefore also be zero, thus putting all laws on an epistemic par and so eliminating any way of choosing between them epistemically within probabilist epistemology.F4 Thus philosophers of science who maintain scientific reasoning is consistently probabilist must deny most scientists are radical fallibilists, or at the very least show some scientists believe their theories are true, or at least not definitely false, in order for their probabilist theory of scientific reasoning to have any valid domain whatever.F5

F1 [As Duhem expressed the key tenet of this philosophy: "Thus, the struggle between reality and the laws of physics will go on indefinitely: to every law that physics may formulate, reality will sooner or later oppose a rude refutation in the form of a fact, but, indefatigable, physics will improve, modify, and complicate the refuted law in order to replace it with a more comprehensive law in which the exception raised by the experiment will have found its rule in turn." p177 Duhem's The Aim and Structure of Physical Theory, Atheneum 1962]

F2 [Even Bayesian statistician George Box has admitted: "All models are wrong; but some models are useful." (Bill Jefferys, would you please kindly provide the reference here for this Box quotation you gave ?)]

F3 [This is because a hypothesis that is refuted and thus falsified must be assigned zero probability in Bayesian epistemic probability theory: 'If a hypothesis h entails a consequence e, then P(h / ~ e) = 0. Interpreted in the Bayesian fashion, this means that h is maximally disconfirmed when it is refuted. Moreover...once a theory is refuted, no further evidence can ever confirm it, unless the refuting evidence or some portion of the background assumptions is revoked.' [p119, Howson & Urbach 1993] ]

F4 [Hence this problem is apparently fatal to the Bayesian theory of scientific method. For as Howson & Urbach admit, if it were correct that the prior probability of all unrestricted universal laws must be zero, "then that would be the end of our enterprise in this book" [p391 Howson & Urbach 1993], which is to demonstrate that scientific reasoning, and most especially its grounds for the acceptance and rejection of hypotheses, is subjective Bayesian probabilist reasoning. [p1] ]

F5 [For an example of some probabilist philosophers of science who do deny all scientists are radical fallibilists, see the desperate appeal to Einstein's (ironic?) dogmatic view on the truth of his GTR by Howson & Urbach on p394 of their 1993 Scientific Reasoning. But of course, one swallow does not a summer make.]
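A minimal arithmetic sketch of the Bayes'-theorem point made in footnotes F3 and F4 above (the numbers are hypothetical and purely illustrative):

```python
# Bayes' theorem: P(h|e) = P(e|h) * P(h) / P(e).
# If the prior P(h) is 0 (the law is believed to be false), the posterior
# is 0 whatever the evidence, so no data can ever raise it.

def posterior(prior_h, lik_e_given_h, lik_e_given_not_h):
    p_e = lik_e_given_h * prior_h + lik_e_given_not_h * (1 - prior_h)
    return lik_e_given_h * prior_h / p_e

print(posterior(0.0, 0.9, 0.1))   # 0.0    -- a zero prior is never revived
print(posterior(0.2, 0.9, 0.1))   # ~0.692 -- a positive prior can be raised
```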

The main stumbling block on the part of some Wikipedia editors such as Jefferys, Johnston, BenE, Coppertwig etc in understanding that scientists' belief in radical fallibilist philosophy of science is fatal to probabilist philosophy of science seems to be their refusal to accept that according to the Bayesian philosophy of science literature, and thus also in this article, Bayesian epistemic probability interprets 'probability' as 'strength of belief that a proposition is TRUE', whereby if a proposition is believed to be false it must therefore be assigned probability zero. For they do not contest, and indeed apparently agree, that scientists believe all scientific laws are false. But they seek to avoid the conclusion that they must therefore assign them probability zero by doing original research and illegitimately redefining 'probability' as 'strength of belief that a proposition is USEFUL for making novel predictions'. But since this is a non-Bayesian interpretation of probability, if they are right they are in effect demonstrating that scientific reasoning is not Bayesian probabilist.

LOGICUS 16 September 2007 —Preceding unsigned comment added by Logicus (talkcontribs) 14:45, 16 September 2007 (UTC)

I think the idea that all scientific laws have probability zero counts as "original research" under WP:NOR. I think that idea is not stated by any of the given sources but is a conclusion reached by Logicus, and furthermore that it is not generally accepted by scientists. Therefore, it should not be stated in the article -- unless a source can be found that states it, and then at most it could be mentioned in a quote or indirect speech, as in "so-and-so says that all such probabilities are zero," not "All such probabilities are zero" as if that's what Wikipedia is asserting. Probabilities are manipulated within mathematical frameworks in which certain sets of scientific laws are "assumed" to be true. Besides, the edit is too long and I disagree with the premise of the argument for including such a long quote. --Coppertwig 17:18, 16 September 2007 (UTC)
Coppertwig, please stop giving me your baloney! Do be a good fellow and go and read the literature, including the sources I give, where you should discover what you say is nonsense ! It is most definitely not Wiki-original research, whereas what you claim is. For example, it is well known that Popper maintained the probability of all laws must be zero, whether or not he was right or wrong. Please stop lecturing me on subjects about which you are either clearly ignorant or logically confused. Best wishes.

Logicus 18:14, 17 September 2007 (UTC)

The problems of Bayesian Philosophy of Science

The problems of Bayesian Philosophy of Science as distinct from those of the Bayesian Philosophy of Probability

Further to my comments of 11 October above in the 'What is probability' section, they do not really belong to this section on the concepts of probability. Maybe some confusion has arisen because BenE's critical comments of 20 September in this section commenced with his assertion that he held the same views as Bill Jefferys. But Jefferys' views and my debate with him concerned the philosophy of science, that is, the nature of scientific reasoning, and whether it is Bayesian probabilist or not. Thus it was reasonable to interpret BenE's statements as about the same issue, rather than about the philosophy of probability, which might be the issue he actually had in mind, even if unclear from his comments. As I have pointed out before to try and clarify this crucial distinction, one may hold a Bayesian philosophy of probability, but be a vehement anti-probabilist and thus anti-Bayesian in the philosophy of science, regarding it as utterly absurd that scientists' beliefs in the truth of theories obey the probability calculus or are even logically consistent.

Anyway, my point here is that the above discussion does not really belong to this section, but rather to a section on the problems of Bayesian philosophy of science, that is, the thesis that scientific reasoning is probabilist and Bayesian, not to be confused with theories about what is the best interpretation of the notion 'probability'. And so I copy it to another section devoted specifically to discussing the problems of Bayesian philosophy of science. This, for example, is the appropriate place to discuss whether the belief that all scientific laws are false, whereby they must be assigned probability zero when 'probability' is defined as 'strength of belief a proposition is true', is recognised as posing a fundamental problem for Bayesian and probabilist theories of scientific reasoning. Or what episodes in the history of science are recognised as being successfully accounted for by a Bayesian probabilist theory of scientific reasoning, such as 'the Copernican revolution', 'the anti-Cartesian Newtonian revolution', 'the Einsteinian revolution'.--Logicus 16:15, 12 October 2007 (UTC)

BenE's further comments

I'm going to add, since you keep calling me a fundamentalist for believing in Jaynes' theories, that I am far from the only one with these views.

There is an influential group of Bayesians at the Future of Humanity Institute, which is part of the Faculty of Philosophy of Oxford University, which is ranked by the Philosophical Gourmet Report, recently described as "the most important ranking of Graduate Programs in Philosophy in the English speaking world." They have a blog which frequently talks about Bayesianism as the probability theory AND as the philosophy of science. One of their contributors, Eliezer Yudkowsky, wrote here:

"Previously, the most popular philosophy of science was probably Karl Popper's falsificationism - this is the old philosophy that the Bayesian revolution is currently dethroning. Karl Popper's idea that theories can be definitely falsified, but never definitely confirmed, is yet another special case of the Bayesian rules; if p(X|A) ~ 1 - if the theory makes a definite prediction - then observing ~X very strongly falsifies A. On the other hand, if p(X|A) ~ 1, and we observe X, this doesn't definitely confirm the theory; there might be some other condition B such that p(X|B) ~ 1, in which case observing X doesn't favor A over B. For observing X to definitely confirm A, we would have to know, not that p(X|A) ~ 1, but that p(X|~A) ~ 0, which is something that we can't know because we can't range over all possible alternative explanations. For example, when Einstein's theory of General Relativity toppled Newton's incredibly well-confirmed theory of gravity, it turned out that all of Newton's predictions were just a special case of Einstein's predictions."

Another good article of his extolling Bayesianism can be found here

I may be young and naive and I may be suffering from intellectual "growing pains" but at least I am up to date with recent developments. And I doubt Oxford's Faculty of Philosophy is considered a fundamentalist group.--BenE 00:18, 13 October 2007 (UTC)
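A small numeric illustration of the likelihood-ratio reasoning in the quoted passage above, expressed as posterior odds (all numbers are hypothetical):

```python
# With P(X|A) close to 1, observing ~X drives the odds on A towards 0
# (falsification), while observing X only favours A to the extent that
# P(X|~A) is small -- echoing the quoted point about rival explanations.

def posterior_odds(prior_odds, likelihood_ratio):
    return prior_odds * likelihood_ratio

p_x_given_a, p_x_given_not_a = 0.99, 0.5
prior_odds = 1.0   # initially indifferent between A and not-A

# Observing ~X: ratio (1-0.99)/(1-0.5) = 0.02 -> A is nearly ruled out
print(posterior_odds(prior_odds, (1 - p_x_given_a) / (1 - p_x_given_not_a)))

# Observing X: ratio 0.99/0.5 = 1.98 -> only modest support for A, since
# some rival B with P(X|B) ~ 1 would fit the observation just as well
print(posterior_odds(prior_odds, p_x_given_a / p_x_given_not_a))
```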

Logicus continues to misunderstand my position. I am not talking about philosophy of science, but what scientists actually do. For some reason, many philosophers of science have the view that their ruminations about philosophy have something to do with what scientists actually do. This is generally false, since most philosophers of science have never done science and therefore have no notion of what scientists actually do.
What scientists actually do is to construct models, compare with data, and try to find models that explain the available data well and predict future data well. A Bayesian analysis of a specifically identified set of models is a good way to choose between models that have been identified (even if we do not believe that any of the models in this restricted set is "ultimate truth"), and model-fitting criteria at any particular time will guide us in our quest to invent models that do a better job.
Such a specifically identified set of models does not have the defect that Logicus claims to be a problem, that is, that you have to put zero prior probability on each. Indeed, once you restrict yourself to a specific set of models, you are required to set priors that add to unity, so not all can have zero prior probability. Whether the "ontologically true" model is within that set is not relevant for model comparison.
I provided a joke a while ago about the Dean who, when approached by the Physics department chair about an expensive piece of equipment, complained that the mathematicians needed only pencil, paper and a wastebasket, and that the philosophers needed only pencil and paper. Logicus answered with a lame retort supposed to show the superiority of philosophers, that unfortunately completely missed the point, to wit:
Of course the philosophers' joke on the joke is that the dean was herself a philosopher and correctly believed philosophers are infalible [sic], so don't need wastebaskets.
Logicus' recent comments continue to prove that he has no clue about what scientists actually do or the way they actually think. I think, also, that it would be useful for anyone following this discussion to read Logicus' recent [rant] (which he later removed, but which, thanks to the WikiPedia gods, is preserved in perpetuity). Bill Jefferys 00:59, 13 October 2007 (UTC)

Logicus to Bill: Bill you seem to have a literacy problem here. I never removed my contribution of 11 October you link up to here. It is still there as above for the literate to see and be enlightened from the Bayesian positivist nightmare. Thanks for advertising it though. Also thanks for regaling us with your philosophy of science yet again. The question you have to answer is this: IF 'probability' means 'strength of belief a proposition is true' and somebody believes a proposition is false, what probability should they assign a false proposition other than zero ? In answering this question you must set aside the fact that you yourself reject this conception of probability on which this article is based, but have failed to propose an alternative conception that agrees with the literature referenced at the end of the article. Also remember that, just as fish have no good ideas about hydrodynamics, most scientists haven't a clue about what they actually do because their heads are usually filled with some ideological philosophy of science view about what they do, i.e. they suffer from false consciousness. The task of critical philosophy of science is to analyse what they actually do rather than what they say they do. --Logicus 19:26, 13 October 2007 (UTC)

If you will click on the [rant], you will find on the left hand side of the page a lot of stuff that you wrote, in red, that does not appear on the right hand side. This is a "diff", which shows what you deleted and what you added. I think you need your reading glasses checked.
As for what you claim to be the task of critical philosophy of science, you may think anything you wish about what it thinks it does. This does not mean that it does it, and I think you will have a hard time showing that this is what scientists actually do.
And, as to your claim that I have proposed no alternative conception that agrees with the literature referenced at the end of the article, I have provided citations that support my so-called "alternative" conception about the real role of models in scientific inference. Bill Jefferys 23:36, 13 October 2007 (UTC)

Logicus to Jefferys: If anybody who is literate clicks on the 'rant', providing they are wearing any reading glasses they may need, they should find 'the stuff I wrote in red on the left hand side of that page' also appears on this page above in my 11 October contribution, and hence that I never removed it, contrary to Jefferys' bizarre claim that I did remove it and insinuation that one has to go to this link to find it. And there they should also see that it is not a rant, but helpful advice to Jefferys' fellow 'Bayesian fundamentalist' philosopher of science BenE about the existence of non-probabilist philosophies of science.

But what the literate reader will not find anywhere from Jefferys is a simple concise definition of his conception of 'probability', nor an answer to the key question of what probability a subjective Bayesian probabilist should assign to propositions they believe to be false. His problem here is that of avoiding the appalling unthinkable conclusion that scientific reasoning is not Bayesian probabilist. For if scientists proceed as he claimed they do on 13 October and 15 August, and thus assign non-zero positive prior probabilities to propositions they actually believe to be false, then on the normal view of it they cannot be subjective Bayesians for whom 'probability' means 'strength of belief a proposition is true' whereby a proposition believed to be false is assigned probability zero, that is, no strength of belief it is true simply because it is believed to be false. Thus scientists are not subjective Bayesians on Jefferys' view of scientific practice. Precisely my point. QED.

What Jefferys fails to grasp in this key issue of whether scientific practice is subjective Bayesian probabilist or not is that in this instance it is not his account of scientific practice that is being challenged, but rather whether that practice as he represents it is a subjective Bayesian practice or not on the standard definition of Bayesian probability this article is based upon, that is, 'strength of belief a proposition is true'. And if scientists assign non-zero positive priors to hypotheses they believe to be false as Jefferys claims they do, then clearly it is not. Thus to establish his thesis that scientific practice is Bayesian probabilist, Jefferys must reject this article's definition of that conception of probability and replace it with an alternative definition, such as his own declared conception of it as 'strength of belief that a proposition is likely to be useful for predicting novel facts'. But of course for many reasons he cannot, including the fact that this does not square with even the pro-Bayesian literature, and at least the problem of yet again avoiding the conclusion that scientific reasoning is not probabilist at least because scientists breach Axiom 2 of that calculus in not assigning probability 1 to tautologies, or else explaining why completely useless propositions for predicting novel facts such as 'The Moon is the Moon' are not given probability zero. Thus the very learned self-alleged Emeritus Professor Jefferys remains impaled on the horns of the dilemma constituted on the one horn by his equivocation between an instrumentalist idealist philosophy of science that scientific hypotheses are neither true nor false but instruments of prediction and a contrary radical fallibilist realist view that they are all false but possibly useful instruments of prediction, versus his fundamentalist belief on the other horn that scientific reasoning is Bayesian probabilist. Hardly surprising that in this unenviable situation he simply protests that he really does have a coherent and empirically adequate Bayesian philosophy of science that Logicus has simply not understood, but demurs from presenting it on these pages, claiming he can only present it to Logicus by personal e-mail outside Wikipedia, rather than on the Wikipedia Talk page where anybody can see it. Or else he suddenly becomes quasi-Emeritus and makes excuses that he must rush off to prepare his lessons for the new semester instead of answering Logicus's challenges, put as follows:

Logicus to Jefferys 18 September: On another point, since you yourself at least agree with the radical fallibilist philosophy of science that all scientific laws are false according to your 15 August testimony on this Talk page, then what probability would you assign a scientific law if you assigned probabilities to propositions according to strength of belief in their TRUTH, as subjective Bayesian epistemic probabilists do according to the literature ? (I appreciate you do not accept the subjective Bayesian epistemic interpretation of probability, but have your own utilitarian pragmatic interpretation of it as 'likely usefulness of a hypothesis in making novel predictions', but just imagine you did accept it. What probability would you assign a hypothesis you believe to be definitely false ?)

I would also be grateful to know why you assign tautologies probability 1, and thus why you evaluate them as maximally useful for making novel predictions. For instance, how is 'The Moon is the Moon' useful for making novel astronomical predictions ?

Jefferys to Logicus 19 September: Since the semester started, I have been rather busy, and expect to be so for quite some time, so will answer only the question about the Box quotation, and make one more comment. The other questions will have to wait, weeks probably. I can only say that you profoundly misunderstand my position. I still invite you to contact me directly. Believe me, I do have coherent and justifiable reasons to write what I did.

But the learned now quasi Emeritus Professor manages one last comment by way of trying to teach Logicus his rotten Bayesian statistics before rushing off to spread his gospel wider:

I'll make one more comment. The reason why you have to consider more than one hypothesis is that if only one hypothesis is possible (that is, the universe of hypotheses under consideration consists of that one hypothesis alone), its prior and posterior probabilities are by definition 1. This is well-known, and any decent book on Bayesian statistics should set you straight on this point. Try Jim Berger's book.

But this is of course illogical nonsense. There is no reason in the probability calculus nor in Bayesian probability why a lone hypothesis on any subject matter should be believed to be certainly true or a tautology, and indeed it may be believed to be false because it is believed to have a counterexample somewhere sometime, and thus assigned prior probability zero. Consider the case of there being only one theory of the Moon's constitution, namely that it is made of green cheese. What definition of what concept means it must have prior and posterior probability 1 Jefferys notably does not reveal. --Logicus 18:10, 15 October 2007 (UTC)

BenE to Logicus: As I have written again and again, Bayesian probability theory is interpreted as a degree of belief that is calculated through Bayes theorem. Hence the name Bayesian. Your statement that a scientist must assign 0 to scientific theories is nonsensical in this context, as it is not possible to arrive at this probability value through Bayes theorem (unless you pull this zero value out of your ass). We have no clue what the absolute prior is for a scientific theory, for two reasons: first, we don't have any idea of the size of the hypothesis space, and second, we don't even know that the theories are mutually exclusive (e.g. the theory that the human body temperature is between 95-105 degrees does not exclude the theory that it is between 90-110). The only thing we can do is assign a maximum entropy prior between alternative theories and thus represent our state of ignorance. This is equivalent to setting the priors to be equal. When we calculate the odds ratio, the prior terms vanish in the equation, as P(H1)/P(H2)=1, and we never actually need to assign a number to these priors! By following Bayes theorem and the MaxEnt principle we arrive at a ratio based solely on the data, and that is therefore based on how the theory fits the data: how well it predicts the data.

Choosing theories based on 'strength of belief that a proposition is likely to be useful for predicting novel facts' is not a supposition but a result of the Bayesian approach. It's a result of applying Bayes theorem with a maximum entropy (equal) prior.

I provided many citations supporting this. However, why do you even make a fuss? Even if the people posting on this discussion page and the guys in Oxford's philosophy department think it is the best candidate for a theory of science, the article in its present form doesn't even mention the word science once! Nothing in the article should bother you. --BenE 19:50, 15 October 2007 (UTC)
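A minimal sketch of the odds-ratio point BenE makes above: with equal (maximum-entropy) priors over the candidate theories, the prior ratio cancels and the posterior odds reduce to the Bayes factor. All numbers are hypothetical.

```python
# Posterior odds = (P(H1)/P(H2)) * (P(D|H1)/P(D|H2)).
# With equal priors, P(H1)/P(H2) = 1 and only the Bayes factor
# (how well each theory predicts the data) remains.

def posterior_odds(prior_h1, prior_h2, lik_d_given_h1, lik_d_given_h2):
    return (prior_h1 / prior_h2) * (lik_d_given_h1 / lik_d_given_h2)

print(posterior_odds(0.5, 0.5, 0.8, 0.2))   # 4.0: equal priors, factor alone
print(posterior_odds(0.1, 0.1, 0.8, 0.2))   # 4.0: the actual prior value never
                                            # needs to be fixed, only that the
                                            # two priors are equal
```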


I apologize to Logicus, for he is correct, he did not remove the material I thought he had removed. I was misled by the [diff page], which showed the material deleted on the left (in red, with minus sign indicating deletion) but not restored on the right. There is evidently something that I do not understand about the way the diff is presenting things. But when I searched the version that Logicus edited for the material, it was there. I am sorry.
I also apologize for using intemperate language. I hope that all of us will change our ways and try to be more respectful of the views of our fellow editors, even when we don't agree with them. Bill Jefferys 22:48, 16 October 2007 (UTC)
Logicus to Bill: Well thanks for that. I also find diff confusing, and often cannot find history of edits obviously made.--Logicus 18:15, 17 October 2007 (UTC)
BenE stated "As I have written again and again, bayesian probability theory is interpreted as a degree of belief that is calculated through Bayes theorem." This is not correct. Bayes theory may or may not use a subjective "degree of belief" as an input (it can also use objective inputs) but the degree of belief is not the result of Bayes theorem.ERosa (talk) 07:19, 10 February 2008 (UTC)

Logicus endorses ERosa

The complaint of ERosa of 10 Feb, reproduced below, also with some insertions added in square brackets, that there is nothing inherently subjectivist about Bayesian probability, whereby this article's definition of Bayesian probability is mistaken and the article is systematically confusing in its untenable conflation of Bayesian probability with subjective epistemic probability, which defines the 'probability' of a proposition as subjective strength of belief that the proposition is true, is surely entirely justified. The truth of the matter seems to be that there is no such thing as distinctly Bayesian probability in the sense of using Bayes' Theorem, since the conditional probability calculus entails Bayes' Theorem and it seems all statisticians use it, as ERosa testifies. So non-Bayesian probability can only mean the unconditional or absolute probability calculus.

Thus the article remains the ultimately confused nonsense it has been for some time on this basic issue. The way forward is surely to retitle it as what it is essentially about by virtue of its opening definition, namely 'Subjective Epistemic Probability'. In epistemic probability, probability is about the TRUTH of propositions, or degree of certainty in their TRUTH, as opposed to being about properties of extra-linguistic events, and also as opposed to being about the usefulness of propositions or degree of certainty in their usefulness rather than their truth, such as their usefulness for making predictions, for example. In the subjectivist variant of epistemic probability the probability of a proposition is the subjective strength of belief in its truth, measured on a scale from 0 to 1. Thus the article's current definition of 'Bayesian probability'

“Bayesian probability interprets the concept of probability in the probability calculus as the degree of strength of belief in the truth of a proposition”

is in fact rather a definition of subjective epistemic probability, and so should instead say

“Subjective epistemic probability interprets the concept of probability in the probability calculus as the degree of strength of belief in the truth of a proposition”

This would then enable the elimination of all the pedagogically confusing stuff that there are Bayesian probabilists who reject Bayesian probability as defined here because they reject subjective epistemic probability.

I have today edited the opening definition of ‘Bayesian probability’ pro tem just to clarify the concept of subjective epistemic probability that then follows as being concerned with belief in the TRUTH of propositions, but in fact, like ERosa, I deny this conflation of Bayesian probability with subjective epistemic probability.

I propose this article now be retitled ‘Subjective epistemic probability’ and then purged of its current confusing content about ‘objectivist Bayesian probability’, and an article on Bayesian probability be started briefly explaining there is really no such thing except for the whole conditional probability calculus, the only non-Bayesian probability being the absolute or unconditional probability calculus.


ERosa's Complaint

ERosa: Actually, if the title of the article was "Subjectivist probability" and not "Bayesian probability" it would cause much less confusion. Again, there is nothing inherently subjectivist about Bayes theorem. The only possible tie is that, since Bayes theorem allows the use of prior knowledge - whether subjective or based on observed frequencies - it has sometimes been mistaken as based *only* on subjective probabilities. ERosa (talk) 07:09, 10 February 2008 (UTC)

Logicus to ERosa: Amen to that ! But the remaining problem is what then is the differentiating specificity of 'Bayesian probability', whereby on the one hand it is not necessarily subjectivist as you say, but on the other hand does not include all conditional probability ? The article's current definition of Bayesian probability is in fact rather a definition of subjective probability. Defining Bayesian probability is a difficult business and indeed possibly impossible if it is only a historically mistaken pseudo-category. Maybe more later.... --80.6.94.131 (talk) 15:26, 24 February 2008 (UTC) --Logicus (talk) 15:29, 24 February 2008 (UTC)

ERosa: I'm not sure I can fully construct the meaning of that first sentence. Are you saying that Bayes Theorem pertains to any conditional probability? If so, then yes. [Logicus 4 March: Yes!] It is another example of an unfortunate naming convention confusing a lot of people. [Logicus 4 March: I agree] The subjectivist philosophy of probabilistic knowledge has little to do with Bayes Theorem, which is derived mathematically from fundamental axioms of probability theory - the same rules any frequentist would feel subject to. But I don't think it's really that difficult to define. There is really nothing in statistics called a "Bayesian probability". There is Bayes Theorem, which is used to compute a conditional probability, but a conditional probability isn't uniquely Bayesian. It makes no more sense than to talk about a gram as a "digital gram" because it was weighed on a digital scale. Anytime the term "Bayesian probability" is used they really mean "subjectivist view of probability". Bayes theorem gives exactly the right answer for p(x|y) when you give it p(x), p(y) and p(y|x). Any "frequentist" would use the same formula to compute p(x|y). A subjective probability can be used as one input to Bayes Theorem, but like every other formula in math or science, the formula doesn't care HOW we come up with the numbers. It just gives us the answer with the numbers we give it. ERosa (talk) 02:44, 25 February 2008 (UTC) [Logicus 4 March: Right On] --Logicus (talk) 20:25, 4 March 2008 (UTC)

History needs updating

If someone wants to edit the history section, there is a great starting point for their research here. Currently, apart from the mention of Bayes himself, the history starts way too late (1930!). --BenE 01:52, 21 September 2007 (UTC)

I've read that paper and the Bayesian movement really took off as late as the 1960s. iNic (talk) 00:42, 4 March 2008 (UTC)

Painting a picture of too much conflict?

This article seems very POV to me in that it seems to paint a picture of the Bayesian view of probability being more controversial than it actually is, and also that there is more conflict (see the use of the word "antagonism") between the two "schools of thought" than there actually is. Although people talk about the "schools of thought", I think that the evidence that these schools are as all-encompassing as claimed is shaky at best.

For example, compare the books by Casella & Lehmann (Theory of Point Estimation) which arguably takes a heavily frequentist perspective...and the J.O. Berger (Statistical Decision Theory and Bayesian Analysis) book, which is arguably written from a (very strongly) Bayesian perspective. The Berger book, which is about as "rabid" a Bayesian text as you can find, still embraces the frequentist interpretation of probability as one way of looking at things and fully lays out how to use such an interpretation. Similarly, the Lehmann book is very heavily slanted towards the frequentist perspective but it offers over a full chapter dedicated to Bayesian methods and discusses the philosophical aspects of the Bayesian interpretation of probability at great length.

I think that this article needs to be seriously rewritten to reflect the real state of things. All the modern texts talk about the "controversy" between Bayesian and frequentist methods as something that is more or less historical. People quibble about use of this technique and that, how it is applied, when it is appropriate, but most people agree that each interpretation has a certain domain where it is useful and another where it is not, and furthermore, there is a general consensus that both interpretations can be combined in a given problem for both philosophical and practical reasons. Do people agree with that? Cazort 23:37, 3 December 2007 (UTC)

Well both yes and no. If you read just some random section of this talk page you will probably find some strong feelings with heated discussions. If you read philosophy papers about this you will find some strong feelings too. However, if you read mathematical books in probability theory and statistical methods you will find only dry expositions, as math books are in general the wrong forum for debates. (But there are some exceptions here too.) So it all depends on where you look if you will find heated debates or not.
I think it's good that the article stresses the differences in view that exist, and the ongoing debate. This is the kind of information readers new to the subject/concepts want to know. What could be made a bit more clear, I think, is that there aren't only two views but many. The debate is historical in the sense that it's an old debate—it can easily be traced back to the old debate between materialism and idealism—not in the sense that it's over. iNic (talk) 23:26, 17 December 2007 (UTC)
No, the article simply perpetuated a point of confusion about the debate. Bayes theorem and Bayesian methods are routinely used by statisticians. Bayes theorem itself is a matter of mathematical proof and is not just subjective. But this article seemed to continue with a common misconception that confuses the philosophical debate about the nature of uncertainty with the methods statisticians actually use. I'm a statistician and I use both methods. When you have prior knowledge then a Bayesian analysis will actually give a better result if you track it over time and compare it to methods that ignore prior knowledge. Often, I've used Bayesian analysis when the prior knowledge was actually derived from other standard sampling methods, not subjective estimates. I've made the changes and, unlike the previous version, inserted specific citations for the claims. Hopefully, the first year stats students who have written the material previously will provide better citations if they want to refute what I just wrote.
Actually, if the title of the article was "Subjectivist probability" and not "Bayesian probability" it would cause much less confusion. Again, there is nothing inherently subjectivist about Bayes theorem. The only possible tie is that, since Bayes theorem allows the use of prior knowledge - whether subjective or based on observed frequencies - it has sometimes been mistaken as based *only* on subjective probabilities. ERosa (talk) 07:09, 10 February 2008 (UTC)
Logicus to ERosa: Amen to that ! But the remaining problem is what then is the differentiating specificity of 'Bayesian probability', whereby on the one hand it is not necessarily subjectivist as you say, but on the other hand does not include all conditional probability ? The article's current definition of Bayesian probability is in fact rather a definition of subjective probability. Defining Bayesian probability is a difficult business and indeed possibly impossible if it is only a historically mistaken pseudo-category. Maybe more later.... --80.6.94.131 (talk) 15:26, 24 February 2008 (UTC) --Logicus (talk) 15:29, 24 February 2008 (UTC)
I'm not sure I can fully construct the meaning of that first sentence. Are you saying that Bayes Theorem pertains to any conditional probability? If so, then yes. It is another example of an unfortunate naming convention confusing a lot of people. The subjectivist philosophy of probabilistic knowledge has little to do with Bayes Theorem, which is derived mathematically from fundamental axioms of probability theory - the same rules any frequentist would feel subject to. But I don't think it's really that difficult to define. There is really nothing in statistics called a "Bayesian probability". There is Bayes Theorem, which is used to compute a conditional probability, but a conditional probability isn't uniquely Bayesian. It makes no more sense than to talk about a gram as a "digital gram" because it was weighed on a digital scale. Anytime the term "Bayesian probability" is used they really mean "subjectivist view of probability". Bayes theorem gives exactly the right answer for p(x|y) when you give it p(x), p(y) and p(y|x). Any "frequentist" would use the same formula to compute p(x|y). A subjective probability can be used as one input to Bayes Theorem, but like every other formula in math or science, the formula doesn't care HOW we come up with the numbers. It just gives us the answer with the numbers we give it. ERosa (talk) 02:44, 25 February 2008 (UTC)
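A minimal sketch of ERosa's point that Bayes' theorem simply returns p(x|y) from whatever inputs it is handed, whether those came from observed frequencies or from subjective judgement (the function name and numbers are hypothetical):

```python
def bayes(p_x, p_y_given_x, p_y_given_not_x):
    """P(x|y) via Bayes' theorem; the formula does not care how the
    three inputs were obtained (sampled frequencies or degrees of belief)."""
    p_y = p_y_given_x * p_x + p_y_given_not_x * (1 - p_x)
    return p_y_given_x * p_x / p_y

print(bayes(0.01, 0.95, 0.05))   # ~0.161, identical whichever "school"
                                 # supplied the three input numbers
```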
The acid test of whether or not one's a Bayesian is not (and never has been) whether or not one believes in Bayes theorem. Everyone does. It's a theorem.
Rather, the acid test is whether or not one believes it can ever be meaningful to talk about a probability P(x), if X is an event which has already happened, but about which you do not know the outcome. To a Bayesian, this is not only meaningful, it should be the central quantity of inference. To a Frequentist, it is not meaningful, and one should only talk about estimators, confidence limits and so forth; and discuss questions like "unbiassedness", which to a Bayesian can seem wholly misleading.
That's been the meaning of "Bayesian" since the word was coined, in the 1950s.
European universities tend to allow their lecturers more flexibility; but I have it on authority that, at least until very recently, there were still U.S. colleges where a lecturer would find themselves barred from teaching the course again, if they ever talked about P(x) in a first year statistics course, where X was to represent an event which had already happened. Jheald (talk) 10:00, 25 February 2008 (UTC)
Your claim that lecturers were barred from talking about P(x) makes no sense. Every one of the six undergraduate stats text books on my book shelves starts with chapters that use the term "P(x)". Are you saying using P(x) for "probability of x" is somehow uniquely Bayesian? The same term is used throughout statistics whether the author is "frequentist" or not. I agree with ERosa that this entire article confuses Bayesian with subjectivist. As you both pointed out, Bayes Theorem is mathematically proven. But the problem is that the word "Bayesian" has come to mean two very different things. Over the last couple of decades, statisticians are using Bayes Theorem more often (even though Bayes Theorem is much older than that) to properly incorporate prior known constraints on potential values. We have to separate the philosophical argument, which I think is better labeled frequentist vs. subjectivist, from Bayes entirely. And I think it mischaracterizes the frequentist view that p(x) must relate only to a past event. The frequentist view is that p(x) only has meaning as the frequency of x over a large number of trials. But I like the argument presented earlier that "degrees of belief" can also be tested by frequentist means. If, of all the times a person says they are 80% confident, they are right 80% of the time, then you have confirmed their "degree of belief" with the frequency of being right. So, even in a purely philosophical sense, I see no real conflict, much less within the pragmatic use of statistics. ChicagoEcon (talk) 15:03, 25 February 2008 (UTC)
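A small simulation of the calibration point made above: if someone's "80% confident" statements turn out right about 80% of the time over many trials, their subjective degree of belief has been checked by frequentist means. The setup below is hypothetical.

```python
import random

random.seed(0)
stated_confidence = 0.8
trials = 10_000

# Simulate a well-calibrated forecaster: each "80% confident" claim is
# in fact true with probability 0.8.
hits = sum(random.random() < stated_confidence for _ in range(trials))
print(hits / trials)   # ~0.8: the observed frequency of being right
                       # matches the stated degree of belief
```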
No, my claim is that a Bayesian will feel free to discuss the probability of x, where X is an event which has already taken place. A frequentist would resist this; and would resist talking about probability even of events in the future, if they could not be related to a frequency over a large number of trials.
"Bayesian" has been used in this sense, ie usages of probability not related to frequency over a large number of trials, ever since the word was coined, in the 1950s. An out-and-out frequentist may use Bayes theorem; but they are unlikely to describe either themselves, or their calculation, as "Bayesian". Jheald (talk) 15:42, 25 February 2008 (UTC)
In that case no statisticians are frequentists, since all statistical estimators of means of populations are actually the P(a<x<b) where a and b are the bounds on some interval. This P(x) means that if we continued to sample the population until we got every possible member of the population, then the actual population mean has the stated chance of falling within those bounds. But, since it would be absurd that no statisticians are frequentists, and since that would be the logical conclusion from your claim, I would say that your initial characterization of frequentists resisting using P(x) is wrong. A frequentist uses P(x) but holds the position that the only meaning is what it means for the frequency of occurrence over a large number of trials. A subjectivist would say it has another meaning - that it can mean degree of belief. I say frequentists' analysis of degrees of belief also shows it meets the frequentist criterion (a large number of trials of degrees of belief statements can be observed with frequentists' observations). Both groups use P(x) and neither resists using it in any sense. ERosa (talk) 20:30, 25 February 2008 (UTC)
I think if you look closer, you will find that that is not the case. A frequentist statistician will not make assertions about the probabilities of a parameter of a distribution. Rather, they will make assertions about the probability of an estimator θ̂, and how often it might or might not be an amount different from θ if hypothetically a large number of similar trials were to be carried out.
A Bayesian will feel free to discuss the probability of itself. But for a by-the-book frequentist is a fixed parameter, not a random variable; so not something about which they can ever talk about a probability distribution. Jheald (talk) 21:23, 25 February 2008 (UTC)
Then, as you define it, I've never met a frequentist statistician. And that would be something, since I'm a statistician who has worked, among other places, with the Census Bureau (where there are over 1000 professional statisticians). I've also worked with the statisticians at the EPA and with many academic researchers. My contact list has over 100 people with advanced degrees in statistics. And I've never met anyone who does not talk in terms of the probability of a parameter falling within stated bounds. It's simply the normal language among every statistician I know. And, believe me, I've "looked closely". Now, you should compute the odds that, even with a somewhat biased sample, I would by chance have never met a frequentist statistician if there are any more than a tiny minority. (Hint: You can use Bayes Theorem) ERosa (talk) 15:49, 27 February 2008 (UTC)
That's interesting. So are these people actually calculating probabilities P(θ|data) ? Or are they calculating confidence intervals and then misrepresenting the meaning of their calculation ? Jheald (talk) 16:35, 27 February 2008 (UTC)
If you understand what "confidence interval" means, you know that the "confidence" that the "interval" a to b contains the population parameter x is P(a<x<b|data). It's not a misrepresentation. They are saying that there is a 90% chance that, if we continued to sample the entire population, we would find the mean to be within the 90% CI. In fact, simulations of samples of populations will actually prove that. Where are you learning what you have "learned" about statistics? I honestly can't think of a single professor of stats, text, or PhD researcher who makes these fundamental errors you seem to be making. Can you provide a citation? ERosa (talk) 19:12, 27 February 2008 (UTC)
If you really want P(θ|data) you do it the Bayesian way: you start with a prior P(θ|I), and update it according to Bayes theorem. Confidence interval calculations don't do that: they calculate P(interval|θ), without any consideration of the priors on θ. As a result, there are cases where frequentist methods can report very high "confidences" in parameter ranges which may nevertheless still actually have rather low probability. Jheald (talk) 20:22, 27 February 2008 (UTC)
You didn't answer my question about a source. You seemed surprised that calculating a CI is actually calculating a particular P(X) (in this case, P(a<x<b), where x is a population parameter). Why would you think this is a misrepresentation, and what is your source? ERosa (talk) 23:28, 27 February 2008 (UTC)
But of course it is not calculating a P(X). The notation P(a<θ<b) is misleading, because θ is not a random variable. The interval is the random quantity, and it is constructed so that P(a < θ < b | θ) equals the stated confidence level, whatever the fixed value of θ. It's a very odd calculation, when you actually write it out properly; but it has nothing to do with getting a probability distribution for θ. Jheald (talk) 00:51, 28 February 2008 (UTC)
But of course it IS, and you are seriously misled. Again, provide a citation for your claim. The notation P(a<x<b) is quite standard, and what is, in fact, random is the estimate of x relative to the true population mean of x. A large number of simulations of samples from known populations shows that the 90% CI contains the known population mean 90% of the time. By the way, a colleague of mine once wrote for the Journal of Statistics Education, which talks a lot about bizarre misconceptions about statistics. I think you will make an excellent subject. 74.93.87.210 (talk) 04:49, 28 February 2008 (UTC) Forgot to sign in. ERosa (talk) 04:58, 28 February 2008 (UTC)
By the way, the wikipedia article on confidence intervals seems to use notation entirely consistent with what I'm saying and contrary to what you say. You should also set out to "correct" that error. And all the errors in every stats text I pick up. You have a lot of work to do.ERosa (talk) 04:58, 28 February 2008 (UTC)
P(a<θ<b|data), calculated using Bayes theorem, is called a Bayesian credible interval. It coincides with a frequentist confidence interval only if the prior probability P(θ|I) is uniform. Otherwise, as you can verify for yourself, the calculations are different. And it's a well-known fact that if you bet against a Bayesian who has an accurate prior, you will tend to lose. Jheald (talk) 09:51, 28 February 2008 (UTC)
By the way, with regard to the Wikipedia article on confidence intervals, note the confidence intervals#definition is in terms of
P(a(X) < θ < b(X) | θ),
i.e. probabilities of the interval given theta.
Note also the section Meaning and Interpretation:
"It is very tempting to misunderstand this statement in the following way... The misunderstanding is the conclusion that P(a < θ < b | data) equals the confidence level, so that after the data has been observed, a conditional probability distribution of θ, given the data, is inferred... This conclusion does not follow from the laws of probability because θ is not a "random variable"; i.e., no probability distribution has been assigned to it."
(emphasis added). Jheald (talk) 10:01, 28 February 2008 (UTC)
Wow, this debate has generated a lot of text! Actually, I think the entry in the confidence interval article needs to be corrected if it means that a 90% confidence interval doesn't have a 90% *probability* of containing the true value. And those who insist on the distinction between "credible interval" and "confidence interval" make the same mistake Jheald makes, since the distinction has no bearing on observed outcomes. I believe what ERosa was referring to earlier was the fact that if you take, say, 30 samples from a large population where you already know the mean, compute the 90% confidence interval, and repeat this thousands of times, you will find that 90% of the time the known population mean actually fell between the upper and lower bounds of the 90% confidence interval. This claim is experimentally verifiable. Neither the math nor experimental observations contradict ERosa. This is another example of how people have some strange ideas about probability theory. Hubbardaie (talk) 13:43, 28 February 2008 (UTC)
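(To make the repeated-sampling claim above concrete, here is a minimal simulation sketch: draw samples of 30 from a population with a known mean, compute the 90% t-based confidence interval each time, and count how often the interval contains the known mean. The population parameters and sample size are arbitrary choices for illustration.)

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_mean, true_sd, n, level = 100.0, 15.0, 30, 0.90
trials, hits = 10_000, 0

for _ in range(trials):
    sample = rng.normal(true_mean, true_sd, size=n)
    m = sample.mean()
    se = sample.std(ddof=1) / np.sqrt(n)
    lo, hi = stats.t.interval(level, n - 1, loc=m, scale=se)  # 90% CI for the mean
    hits += lo < true_mean < hi

print(hits / trials)  # coverage comes out close to 0.90
```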
I noticed that the section of the confidence interval article that Jheald cites had no citations for its arguments (much like Jheald's arguments in here). So I added fact flags. When I get a chance I will rewrite that fundamentally flawed section. This is the problem when people who barely understand the concepts try to get philosophical.Hubbardaie (talk) 14:18, 28 February 2008 (UTC)

Arbitrary section break (Confidence limits)

Here's a concrete example of the problems you can get into with confidence limits.
Suppose you have a particle undergoing diffusion in a one degree of freedom space, so the probability distribution for its position x at time t is given by
P(x|t) = (1/√(2πt)) exp(-x²/2t).
Now suppose you observe the position of the particle, and you want to know how much time has elapsed.
It's easy to show that
t̂ = x²
gives an unbiased estimator for t, since
E[x² | t] = t.
We can duly construct confidence limits, by considering for any given t what spread of values we would be likely (if we ran the experiment a million times) to see for t̂.
So for example for t=1 we get a probability distribution of
P(t̂ | t=1) = (1/√(2πt̂)) exp(-t̂/2),
from which we can calculate lower and upper confidence limits -a and b, such that:
P(t̂ - t < -a | t=1) = 0.025 and P(t̂ - t > b | t=1) = 0.025.
Having created such a table for each t, suppose we now observe x = x₀. We then calculate t̂ = x₀², and report that we can state with 95% confidence that t̂ - b < t < t̂ + a, or that the "95% confidence range" is [t̂ - b, t̂ + a].
But does that give a 95% probability range for the likely value of t given x? No, it does not; because we have calculated no such thing.


The difference becomes perhaps clearest if we think what answer the method above gives, if the data came in that x = 0.
That gives t̂ = 0. Now when t=0, the probability distribution for x is a delta-function at zero, as is the distribution for t̂. So a and b are both zero, and so we must report a 100% confidence range, t = 0.
Does that give a 100% probability range for the likely value of t given x? No, because we have made a calculation of no such quantity. The particle might actually have returned to x=0 at any time. The likelihood function, given x=0, is actually P(x=0|t) = 1/√(2πt), which is non-zero for every t > 0.
Conclusion: confidence intervals are not probability intervals for θ given the data. Jheald (talk) 15:54, 28 February 2008 (UTC)

Certainly confidence intervals are not probability intervals. Here's a simple example: two independent observations are uniformly distributed on the interval from θ − 1/2 to θ + 1/2. Call the larger of the two observations max and the smaller min. Then the interval from min to max is a 50% confidence interval for θ since P(min < θ < max) = 1/2. But if you observe min = 10.01 and max = 10.02, it would be absurd to say that P(10.01 < θ < 10.02) = 1/2; in fact, by any reasonable standard it would be highly improbable that 10.01 < θ < 10.02 unless you had other information in addition to that given above (e.g. if you happened to know the actual value of θ). And if you observed min = 10.01 and max = 10.99, then it would be similarly absurd to say that P(10.01 < θ < 10.99) = 1/2; again, it would be highly improbable that θ is not in that interval. Michael Hardy (talk) 20:54, 28 February 2008 (UTC)
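(A quick simulation sketch of the two-uniform-observations example above, with an arbitrary true value θ = 10.5 chosen for illustration: overall the [min, max] interval covers θ about half the time, but conditioning on the observed width changes the picture dramatically, which is exactly the point being made.)

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 10.5  # arbitrary fixed true value, "unknown" to the analyst
u = rng.uniform(theta - 0.5, theta + 0.5, size=(200_000, 2))
lo, hi = u.min(axis=1), u.max(axis=1)
covered = (lo < theta) & (theta < hi)
width = hi - lo

print(covered.mean())                 # ~0.50: the 50% confidence property
print(covered[width < 0.02].mean())   # ~0.01: very short intervals almost never cover theta
print(covered[width > 0.90].mean())   # 1.0: a width greater than 1/2 forces coverage
```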

I think where Hardy and Jheald are differing with Hubbardaie and myself is in two ways. First, as I've said before, Jheald need only repeat this process in a large number of trials to show that the CI will capture the mean exactly as often as the CI would indicate. In other words, if the 95% CI is a to b, and we compute a large number of intervals a to b based on separate random samples, we will find that the known mean falls within 95% of the computed intervals. Second, Hardy is calling the result absurd because he is taking prior knowledge into account about the distribution. But, again, if this sampling is repeated a large number of times, he will find that only 5% of the computed 95% CIs will fail to contain the answer. If we move away from the anecdotal to the aggregate (where the aggregate is the set of all CIs ever properly computed on any measurement) we find that P(X within interval of Y confidence)=Y. ERosa (talk) 21:40, 28 February 2008 (UTC)

I did not call it "absurd" because of prior knowledge; I said it's absurd UNLESS you have prior knowledge. It is true that in 50% of cases this 50% confidence interval contains the parameter, but in one of my cases the data themselves strongly indicate that this is one of the OTHER 50%, and in the other one of my cases, the data strongly indicate that this is one of the 50% where θ is covered, so one's degree of confidence in the result would reasonably be far higher than 50%. Michael Hardy (talk) 16:13, 29 February 2008 (UTC)
Also, Jheald commits a non-sequitur and begs the question. He shows a calculation for a CI and up to the point of that answer, he is doing fine. But then he asked "But does that give a 95% probability range for the likely value of t given x?" and then states "No, it does not; because we have calculated no such thing". You correctly compute a confidence interval, but then make an unfounded leap to what it means or doesn't mean. You have not actually proved that critical step and your claim that you have not computed that is simply repeating the disputed point (i.e. begging the question).ERosa (talk) 21:44, 28 February 2008 (UTC)
Well, the 95% CI calculation is different from what a calculation of a 95% probability range for the likely value of t given x would look like. But rather than labour the point, surely the coup de grâce is what follows?
If you observe x=0 in the example I've given above, the CI calculation gives you a 100% confidence interval for t=0.
But the likelihood P(x=0|t) ∝ 1/√t is non-zero for every t > 0.
So there is the key ingredient for the probability of t given x (give or take whatever prior you want to combine it with), and it is not concentrated as a delta-function at zero. Jheald (talk) 13:02, 29 February 2008 (UTC)
But now in your new response I see you are backing off from your original claim that given a particular set of data x, the CI will accurately capture the parameter 95% of the time. Now I see that you are replacing that with the weaker claim that given a particular parameter value, t = t*, the CI will accurately capture the parameter 95% of the time. Alas, this also is not necessarily true.
What is true is that a confidence interval for the difference t̂ - t, calculated for the correct value of t, would accurately be met 95% of the time.
But that's not the confidence interval we're quoting. What we're actually quoting is the confidence interval that would pertain if the value of t were t̂. But t almost certainly does not have that value; so we can no longer infer that the difference will necessarily be in the CI 95% of the time, as it would if t did equal t̂.
If you don't believe me, work out the CIs as a function of t for the diffusion model above; and then run a simulation to see how well it's calibrated for t=1. If the CIs are calculated as above, you will find those CIs exclude the true value of t a lot more than 5% of the time. Jheald (talk) 14:23, 29 February 2008 (UTC)
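(A simulation sketch of the calibration check suggested above, under one reading of the construction described earlier: a(t) and b(t) are worked out from the fact that t̂/t follows a chi-squared(1) distribution, and are then plugged in at t = t̂. The 95% level comes from the example; everything else here is an assumption for illustration.)

```python
import numpy as np
from scipy import stats

# Under the diffusion model, t_hat / t ~ chi-squared with 1 degree of freedom,
# so P(t_hat < t*q_lo | t) = 0.025 and P(t_hat > t*q_hi | t) = 0.025.
q_lo, q_hi = stats.chi2.ppf([0.025, 0.975], df=1)

rng = np.random.default_rng(1)
t_true, trials = 1.0, 100_000
x = rng.normal(0.0, np.sqrt(t_true), size=trials)
t_hat = x**2

# a(t) = t*(1 - q_lo), b(t) = t*(q_hi - 1), evaluated at the plug-in value t = t_hat,
# giving the reported interval [t_hat - b, t_hat + a].
lower = t_hat - t_hat * (q_hi - 1)
upper = t_hat + t_hat * (1 - q_lo)
print(np.mean((lower <= t_true) & (t_true <= upper)))  # well below 0.95 (roughly 0.48)
```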
I've written on the confidence interval article's talk page something this entire discussion has been seriously lacking... citations! See the rest there. Hubbardaie (talk) 14:37, 29 February 2008 (UTC)
By the way, I also made a similar argument to ERosa's that, over a large number of trials, 95% of 95% CIs will contain the true mean of a population. I haven't backed off of it and I don't see where ERosa has. In fact, the student-t distribution (which I wrote about in my book) was initially empirically derived with this method. So, alas, it IS true that the 95% CI must contain the true value 95% of the time. If you don't believe me, run a simulation on a spreadsheet where you randomly sample from a population and compute a CI over and over again. So much for the coup de grâce. But this is getting us nowhere. Refer to the citations I provided in the confidence interval article. You also have to provide verifiable citations for anything you say or you run the risk of violating the NOR rule. Hubbardaie (talk) 14:45, 29 February 2008 (UTC)
But I'm not calculating the mean of a population. I'm trying to get a confidence limit for an unknown time, given a position measurement.
The CIs I get don't reflect the probability distribution P(t|x) for that unknown time, given the measurement.
That is sufficient to dispose of the assertion that confidence intervals necessarily reflect the probability distribution for their parameter given the data.
You might also like to reflect that WP:NOR specifically does not apply to talk pages, and per WP:SCG the creation of crunchy examples and counter-examples is not considered OR. Jheald (talk) 15:47, 29 February 2008 (UTC)
First, I didn't say NOR applied to talk pages. Of course, knock yourself out and apply all the original research you want in here (that is, one might presume it's original since you never provide a citation). I'm just cautioning you for when and if you decide to modify the actual article. In there you will need citations, so why not show them here, too? And you seem to have backed off of your original position quite a lot. As I review your conversations with me and others over the last couple of weeks, you originally said that a frequentist would resist using P(X) at all. This morphed into a conversation about whether a confidence interval a to b has a probability of P(a<x<b). The fact that William Sealy Gosset derived the first t-stats by empirical methods settles that issue. The citations I showed on the confidence interval page contradict your position. You haven't proven anything. You, again, made an unrelated point followed by an unfounded leap to the original debated assertion. And you have, again, confused situations where a possible but unlikely set of observations can produce a range that doesn't contain the true value, when the true value is known, with situations where you don't know the true value to begin with and are trying to assess the probability distribution of possible population parameter values. Hubbardaie (talk) 16:03, 29 February 2008 (UTC)
In the particular case of a Student t-test, the 95% confidence interval does match a Bayesian 95% credible interval. (For a derivation, see e.g. Jeffreys, Theory of Probability.) Student in fact used an inverse probability approach to derive his distribution; similar to Edgeworth, who'd used a full Bayesian approach back in 1883. The reason the two match is (i) we assume that we can adopt a uniform prior for the parameter; (ii) that the function P(θ'|θ) is a symmetric function that depends only on (θ'-θ), with no other dependence on θ itself; and also that (iii) θ' is a sufficient statistic.
Under those conditions, a 95% confidence interval will match a Bayesian 95% credible interval. But in the general case, ie in other situations, as in the example I gave higher up, the two do not match. Jheald (talk) 18:12, 29 February 2008 (UTC)
So: does a confidence interval a to b in general have a probability of P(a<x<b|data) = 0.95 ? In general, no. And even when it does (like the case of the t-test), in moving from one to the other, one is (either consciously or unconsciously) making a transition of worldview, from the frequentist to the Bayesian.
I don't back down from what I said above. The notion of a conditional point probability, or an interval probability, for P(θ|data) is not a Frequentist notion. A proper frequentist would not talk about P(θ) at all. Talk about P(θ|data), where θ is a non-random parameter, is only meaningful in the context of a Bayesian outlook. If somebody does believe in P(θ|data), then they either don't care about Frequentism, or don't understand it. Jheald (talk) 18:28, 29 February 2008 (UTC)
Look, I respect where you are coming from. You are clearly not a total layman on the topic. But I won't repeat how your claim doesn't address the issue of how, when you have no a priori knowledge of a population's mean or its variance, the CI is meant to mean what the sources I cite say it means. I understand you continue to insist that when someone says a CI is a range that contains the true value with a given probability, they must be wrong or misleading, contrary to what the authoritative sources I cite clearly state. Let's just lay out the citations and show both sides in the article. Hubbardaie (talk) 22:43, 29 February 2008 (UTC)


Undo justification

I've removed Logicus' addition, as it contains sentences such as "However the dominant 20th century and contemporary philosophy of science apparently believed by most scientists, namely realist instrumentalist fallibilism that maintains all scientific laws are false but are more or less useful instruments of prediction, poses an intractable problem for the thesis that scientific reasoning is subjective epistemic probabilist, based on degrees of strength of belief that scientific laws are true, since if all scientific laws are believed to be false, they must be assigned prior probability zero whereby all posteriors must also be zero, thus offering no epistemic differentiation between theories." This is utterly incomprehensible and poor style. Tomixdf (talk) 17:45, 4 March 2008 (UTC)

Logicus on Tomixdf's deletion:

The whole Logicus text that Tomixdf has removed is as given below in curly brackets, preceded by the text and claim it commented upon.

Tomixdf’s justification for its removal was

“(Removed unreferenced/confusing POV section)”,

but as can be seen, in fact it contained many references, including links, contrary to Tomixdf’s claim. Its basic purpose was to correct the article’s current uncritical pro-subjective-probabilist POV on the ‘Bayesian’ logical positivist philosophy of scientific method. Whether it is confusing or clarificatory, or indeed any more confusing than this currently highly confused article, I leave it to the reader to judge.

Whether Tomixdf’s other insulting justification, that its first sentence is “utterly incomprehensible and poor style”, is also false, the reader may again judge for themselves, or even suggest improvements.

It should be noted that since 18 February Tomixdf has made some 25 or so edits to this article without offering a single prior discussion of any of the proposed edits on this Talk page, before this attempted justification for removing somebody else's addition.

Overall it should be borne in mind that Logicus's additions deal with the most important critical issue of all for 'Bayesian' probabilist philosophy of scientific method, the 'Logic of Science' being THE main concern of the probabilist philosophy of such as Jaynes and others. Not to mention the main problems of and alternatives to probabilist theories of scientific method is extremist POV.

The text in question in the 'Applications' section:

“Some regard the scientific method as an application of Bayesian probabilist inference because they claim Bayes's Theorem is explicitly or implicitly used to update the strength of prior scientific beliefs in the truth of hypotheses in the light of new information from observation or experiment. This is said to be done by the use of Bayes's Theorem to calculate a posterior probability using that evidence and is justified by the Principle of Conditionalisation that P'(h) = P(h/e), where P'(h) is the posterior probability of the hypothesis 'h' in the light of the evidence 'e', but which principle is denied by some [8] Adjusting original beliefs could mean (coming closer to) accepting or rejecting the original hypotheses. { Logicus's addition: However the dominant 20th century and contemporary philosophy of science apparently believed by most scientists, namely realist instrumentalist fallibilism that maintains all scientific laws are false but are more or less useful instruments of prediction, poses an intractable problem for the thesis that scientific reasoning is subjective epistemic probabilist, based on degrees of strength of belief that scientific laws are true, since if all scientific laws are believed to be false, they must be assigned prior probability zero whereby all posteriors must also be zero, thus offering no epistemic differentiation between theories. < Fn. As Duhem expressed the falsity and endless refutation of all scientific laws posited by fallibilism, "Thus, the struggle between reality and the laws of physics will go on indefinitely: to every law that physics may formulate, reality will sooner or later oppose a rude refutation in the form of a fact, but, indefatigable, physics will improve, modify, and complicate the refuted law in order to replace it with a more comprehensive law in which the exception raised by the experiment will have found its rule in turn." Duhem's 1905 Aim and Structure of Physical Theory, p177 of the Athaneum 1962 edition.> Alternative non-probabilist fallibilist philosophies of science such as those of Duhem, Popper, Lakatos, Laudan, Worrall and others based upon objective normative criteria of supercession of theoretical systems that specify when one system is better than another, such as when predicting more novel facts or having less anomalies than a competitor and thus constituting scientific progress, do not suffer this difficulty. So they raise the question of what problems of the theory of scientific theory change are solved, if any, by subjectivist probabilism that are not solved by non-probabilist objectivist theories. Another defect of the purely subjective expert-opinion based approach of subjectivist epistemic probabilism is the political criticism that it is elitist and anti-democratic because it makes the rationale of science and evaluation of its theories depend solely upon the purely subjective strengths of belief of an elite in their truth, as opposed to their objective publicly ascertainable ability to predict novel facts and solve problems, for example. Subjectivist probabilism runs counter to the democratic ethos that public research funds are not to be doled out on the basis of the subjective strength of some academic’s belief that their theory is true. 
Thus for example, the early 18th century changeover to heliocentric astronomy from geoheliocentric astronomy dominant in the 17th century is apparently easily explained by heliocentrism’s successful prediction of the novel fact of stellar aberration experimentally confirmed in 1729, not predicted by any form of geocentrism. And the earlier 17th century changeover from pure geocentrism to geoheliocentrism is easily explained by the 1610 telescopic confirmation of the novel fact of the phases of Venus first predicted by Capellan geoheliocentric astronomy. What more is needed to explain such theory changes that only subjectivist epistemic probabilism can explain, or beyond explaining problems of its own making ? }

--Logicus (talk) 21:16, 4 March 2008 (UTC)

You are spamming the discussion page; please keep it short and to the point. About your contribution: (a) it contained overly long sentences that are unreadable (very bad style), (b) it does contain many unreferenced statements (starting at the first sentence!) and (c) it contains lots of POV/OR (example: 'Another defect of the purely subjective expert-opinion based approach of subjectivist epistemic probabilism is the political criticism that it is elitist and anti-democratic because it makes the rationale of science and evaluation of its theories depend solely upon the purely subjective strengths of belief of an elite in their truth, as opposed to their objective publicly ascertainable ability to predict novel facts and solve problems, for example.'). In conclusion, I stand by my decision to delete your section. Again: if you respond to this, keep it short and to the point, thanks. -Tomixdf 130.225.125.174 (talk) 10:57, 5 March 2008 (UTC)

Too technical?

I removed the too technical tag from this page. It is the opposite of too technical. It almost needs a more mathematically technical treatment in at least one section. —The preceding unsigned comment was added by Jeremiahrounds (talkcontribs) 09:36, 22 June 2007.

This article reads like it was written by a collection of warring graduate students. I would really have liked an introductory non-technical paragraph (at least) on bayesian probability. Right now, the article is almost incomprehensible to me (two years of college math & statistics). —Preceding unsigned comment added by 71.6.94.66 (talk) 23:28, 5 August 2008 (UTC)

What is this article about???

Can anyone clarify what this article is about? Logicus writes <quote>this article is NOT about 'Bayesian STATISTICS', but rather about 'Bayesian PROBABILITY'</quote>, and removes a reference to Cox's axioms. In what sense, may I ask, is Bayesian probability unrelated to Bayesian statistics? This article is a complete mess!!!Tomixdf (talk) 21:40, 4 March 2008 (UTC)

I agree. Cox's axioms (and theorems) are obviously related to Bayesian PROBABILITY, since they claim to provide an axiomatic basis for it. Bill Jefferys (talk) 00:44, 5 March 2008 (UTC)
At one point Logicus mentions "the article's definition of Bayesian probability as subjective epistemic probabilism". It seems the article should be renamed to reflect what it is actually about then (ie. subjective epistemic probabilism, or so it seems). That would be reasonable since there is a fairly decent article on "Bayesian inference", and this article seems to be redundant. In that case, I'll direct my efforts to adding a history section there, instead of trying to improve this mess. Tomixdf (talk) 06:26, 5 March 2008 (UTC)


Logicus comments: I agree with Tomixdf that this article is currently a terrible conceptually confused mess. I tried to sort this confusion out some time ago in creating the Talk Section 21 ‘What is probability ?’ last October. But rather than its resulting in historically informed logical clarifications of the different conceptions of the term ‘probability’ in order to sort out this confusion, rather it promoted what I regarded as the authoritarian mystical ravings of a radical Jaynesian who apparently did not even understand Jaynes was an objectivist probabilist whose views were therefore irrelevant to an article on subjective probability as the article then (mistakenly) defined ‘Bayesian probability’, other than to mention a contrasting alternative philosophy of probability in passing. So I gave up.
To correct yours and Jefferys’ misinterpretation of why I removed the statement about Bayesian statistics in the introduction, it was not because I think Bayesian probability is unrelated to Bayesian statistics as you surmise, but rather because if the article is to be about subjective probability as it was defined to be, then Bayesian statistics that are the same in both objective and subjective probability and indeed even in non-epistemic material probability a la Kolmogorov etc are irrelevant to the issue of explicating the specific subjective epistemic notion of probability as ‘strength of belief in the truth of a proposition’, as distinct from such as objectivist epistemic ‘degree of certainty of a proposition’, for example.
By the way, Jefferys is on record on these pages as rejecting epistemic probability in his probabilist philosophy of science in favour of a subjectivist realist fallibilist instrumentalist concept of probability which maintains all scientific laws are false but may be more or less useful instruments of prediction, and that ‘probability’ is just the degree to which they are believed to be useful for making predictions, rather than the degree of belief that they are true. Certainly realist instrumentalist fallibilism seems to be the dominant philosophy of science amongst scientists, and Jefferys claims to be an astronomer. Realist fallibilist instrumentalism is to be distinguished from idealist instrumentalism which holds that scientific theories are neither true nor false (idealism) but are useful logical classifying devices and instruments of prediction, a position that such as the physicist Duhem equivocated on. Scientific realism holds that scientific theories do have truth values, being either true or false contrary to idealism's claim that they are neither, and realist fallibilism holds that in particular this value is always the value 'false'.
I propose the article should be retitled ‘Subjective probability’, in which ‘probability’ is interpreted as ‘strength of belief in the truth of a proposition’, and accordingly purged of all the confusing stuff about other conceptions. I note that a Wikipedia search on the term subjective probability already redirects to this article, whereas searching on ‘Objective probability’ gives no article on that topic.
I have no idea what the article’s current definition of ‘Bayesian probability’ as “a state of knowledge about a proposition” means. Does anybody ?
I note that the Wikipedia ‘Bayesian inference’ article currently starts by mistakenly defining Bayesian inference as concerned with “the probability that a hypothesis may be true”, thus mistakenly restricting Bayesian inference to objective probability, whereas Bayesian inference is also used in subjective probability in which ‘probability’ is ‘strength of subjective belief that a hypothesis is true’ rather than ‘the probability that it is true’.
--Logicus (talk) 18:11, 5 March 2008 (UTC)
I did not read your dense prose completely but (a) I am in favor of renaming as you suggest and (b) probability as a state of knowledge is straight out of Jaynes. Tomixdf (talk) 18:47, 5 March 2008 (UTC)
"Epistemic probability" is what the article is about. I'm not sure that "Subjective epistemic probability" is right, because there are some - and historically have been even more - who take the view that epistemic probabilities can be assigned objectively.
However, in the real world, "Epistemic probability" is overwhelmingly called "Bayesian probability", and associated with the Bayesian counter-revolution against Frequentism and Frequentist statistics.
A key WP principle is to name articles in accordance with the most familiar usage. In this case, that is "Bayesian probability", being the view of probability adopted by modern-day Bayesians. Jheald (talk) 23:13, 6 March 2008 (UTC)


OK, but then surely something as crucial as Cox's axioms or Jaynes' view of probability as measuring a state of knowledge should be discussed in the article. The current mainstream view on Bayesian probability is distinctly objective/logical, a la Jaynes (see for example Bishop's recent hugely popular text book). In practice, that is reflected in the use of Maxent priors, priors based on invariance etc. There's nothing subjective about it, at least not in principle. If this article is to be about Bayesian probability in general, then I do not understand why it should focus solely on this so-called 'subjective probability', which has now become a minority view (but of course it can be discussed and mentioned!). Tomixdf (talk) 07:21, 7 March 2008 (UTC)

I don't have time to enter this debate again, but I feel the need to warn everyone that Logicus has, for a while now, been trying to add densely written paragraphs which turn out to make little sense when you take the time to understand them thoroughly. Go back and look at the modifications he made last year if you don't believe me. Also have a look at the discussions above.--BenE (talk) 15:54, 7 March 2008 (UTC)

Yes, I've also noticed something seems to be wrong. So what do we do? We can either (a) give up and rename the page to something like "Subjective epistemic probability" (which will make the problem disappear into obscurity) or (b) try to get to a decent history and overview of the different Bayesian schools (Keynes, Jaynes & Jeffreys, de Finetti,...) into the article. The current state of the article is unacceptable for such an important topic. Note that disruptive users can be reported to the administrators (had some very good experiences with that way of getting rid of disruptive edits recently). Tomixdf (talk) 19:32, 7 March 2008 (UTC)

The section "The Controversy between Bayesian and Frequentist Probability" is IMO very poor. I'm planning to delete it. Opinions? Tomixdf (talk) 19:58, 7 March 2008 (UTC)

Logicus to Tomixdf: Jaynes is an objectivist, not a subjectivist. Having agreed this article should be about subjective probabilism, you have now launched into defining it as objectivist using a Jaynes 'definition'. This will cause even deeper confusion. More later ... --Logicus (talk) 15:46, 16 March 2008 (UTC)
We've agreed no such thing. This article is about epistemic or personal probability, and should cover both those who believe that such personal probabilities can be arrived at objectively, and those who don't. Jheald (talk) 16:02, 16 March 2008 (UTC)
Logicus to Jheald: I did not say that you and I did agree any such thing, but rather in my message addressed to Tomixdf I said that Tomixdf did agree that, for as he said on 5 March above:
“At one point Logicus mentions "the article's definition of Bayesian probability as subjective epistemic probabilism". It seems the article should be renamed to reflect what it is actually about then (ie. subjective epistemic probabilism, or so it seems).”
and also
“…but (a) I am in favor of renaming as you [i.e. Logicus] suggest.”
May I respectfully suggest you should perhaps consider reviewing whether you are sufficiently functionally literate in English to be attempting editing Wikipedia articles, let alone arrogantly asserting Humpty Dumpty style what articles on the philosophy of probability are about ? I merely pointed out that given the article's previous definition of Bayesian probability as subjective probability, it should be renamed as such. --Logicus (talk) 19:43, 22 March 2008 (UTC)


Indeed, I strongly agree with Jheald. Moreover, the article should also make clear that the predominant view of Bayesian probability today is Jeffreys/Jaynes's objectivist view. Tomixdf (talk) 16:10, 16 March 2008 (UTC)
Erm. I think you'll find that by the end of Jaynes's life, he was backing away from the objectivist view. Yes, you may often be able to derive a prior probability distribution from a transformation group; but the choice of that transformation group may nevertheless itself be personal, and therefore subjective. I believe the article's summary is correct, that most present-day Bayesians are content to accept that there is a subjective element; though some are more objectivist.
Jaynes' backing away from the objectivist/logical view? His magnum opus "The logic of science" was the last thing he worked on (he died while working on it), and his views in that book are strongly objectivist/logical, as far as I know (I am reading it at the moment). Could you point me to specific references for this? (I'm asking out of interest!) Also, see page 655 for his quite critical views on de Finetti. Tomixdf (talk) 17:18, 16 March 2008 (UTC)
In any case, a good Bayesian should stress-test their conclusions against the effects of different priors. Jheald (talk) 16:35, 16 March 2008 (UTC)


Logicus to Tomixdf: The current introduction to this article on the PHILOSOPHY of probability after your recent re-editing of it is now as follows:
"Bayesian probability interprets the concept of probability as 'a measure of a state of knowledge' [1]. Broadly speaking, there are two views on Bayesian probability that interpret the 'state of knowledge' concept in different ways. For the objectivist school, the rules of Bayesian statistics can be justified by desiderata of rationality and consistency and interpreted as an extension of Aristotelian logic[2][3]. For the subjectivist school, the state of knowledge corresponds to a 'personal belief' [4]. Most modern machine learning methods are based on objectivist Bayesian principles [5]."
But this has introduced even greater conceptual confusion as a result of your edits, and it increasingly seems that you may lack the requisite learning and philosophical and logical competence in the subject matter here, namely the PHILOSOPHY of probability in the interpretation of the term 'probability', to contribute to its clarification at this stage in your education in that subject. It would seem maybe you are some kind of statistician who is only just now learning about the philosophy of probability, as suggested by the fact that you are only just now reading Jaynes' work on objective probability as the logic of science, and thus can hardly have had time to make any serious critical philosophical assessment of it and why Jaynes' objectivism is widely and rightly regarded by philosophers of science as puerile nonsense. (For example, you should consult the 1993 Howson & Urbach book listed for the elementary subjectivist demolition of Jaynes's objectivism as pseudo-objective.)
Here I just briefly point out the conceptual confusions now arising from your decision not to make this article about subjective probability contrary to what you had apparently agreed on 5 March, but rather about what is an utterly spurious non-subject, 'Bayesian probability', such as ERosa cogently pointed out. Overall the mystery you have now even more confusingly reproduced is that of what on earth is distinctly 'Bayesian probability', which the article now claims takes two forms, namely objective and subjective ? So what then is non-Bayesian probability, pray ? The real answer is of course that there is no such thing as specifically Bayesian probability, inasmuch as ALL conditional probability calculus uses Bayes Theorem.
Your first sentence now defines 'Bayesian probability' in terms of an objectivist Jaynes characterisation as 'a measure of a state of knowledge', typical incoherent Jaynes-twaddle presumably meaning rather 'a degree of knowledge', and which is objectivist inasmuch as 'knowledge' is taken to be justified true statements whose truth-value is independent of the knowing subject and determined objectively by the real world. So a statement whose truth is only half justified would have probability 0.5 on this view.
Your second sentence then tells us there are two different interpretations of the 'state of knowledge' concept within this illusory 'Bayesian probability'. But the remaining 3 sentences collapse into nonsense or irrelevance in explicating these alleged two different interpretations. The second of these remaining sentences tells us that in subjectivist probability " 'state of knowledge' corresponds to a 'personal belief' ", which is presumably an illiterate attempt to say " 'state of knowledge' is interpreted to mean 'strength of personal belief that a statement is true' ". But of course the latter is not in itself a state of knowledge at all, as illustrated by the strong belief of Russell's turkey that the farmer would feed it at Xmas rather than wring its neck at Xmas.
But the preceding sentence totally fails to tell us how objectivism interprets 'state of knowledge' in comparison, and instead irrelevantly reports the philosophically foolish nonsense that objectivists believe in, namely that 'the rules of Bayesian statistics can be justified by desiderata of rationality and consistency', a view that any capable philosophy of science undergraduate should easily be able to demolish.
And what machine learning methods happen to be is surely quite irrelevant to explicating different concepts of 'probability'.
So what is the objectivist interpretation of 'state of knowledge'? One problem here is that you are likely to end up resorting to the classical definition of probability, widely regarded, rightly or wrongly, as non-Bayesian or 'pre-Bayesian', or some such mumbo-jumbo as ‘rational belief in the degree of certainty in an event’. Or else end up misdefining probability as verisimilitude, that is, as a degree of truth and thus a degree of knowledge.
As it is, the introduction is now in an even worse state of illogical obscurantist confusion.
I propose it should at least be restored to what it was before you started meddling with it, and the article retitled 'subjective probability' and re-edited in line with this. In keeping with that change, there should also be another article created on 'Objective probability', and the pedagogically confusing conceptual nonsense of 'Bayesian probability', an empty category, committed to the flames of meaningless metaphysics where it belongs.
May I recommend you read the listed literature for this article, and especially the works by such as Howson & Urbach and by Gillies, if you are trying to educate yourself about the philosophy of probability and also about probabilist philosophy of science. For what it may be worth, my simple opinion is that whilst subjectivism may be the philosophically most tenable interpretation of the concept ‘probability’, the logic of science, contra Jaynes and also contra the subjectivists, is definitely not probabilist: that is, the tenable position is a subjectivist philosophy of probability but an anti-probabilist philosophy of science. But the purpose of the article is to report viewpoints in the literature and their problems.--Logicus (talk) 20:17, 22 March 2008 (UTC)

The first sentence currently reads, "Bayesian probability interprets the concept of probability as 'a measure of a state of knowledge' [1]." This does not define what Bayesian probability is, but what it does. Consider the readable definition under the Thomas Bayes article, "Bayesian probability is the name given to several related interpretations of probability, which have in common the notion of probability as something like a partial belief, rather than a frequency."

A concise, straightforward first sentence will help make this article more accessible. Kevin.j.hutchison (talk) 00:42, 1 May 2008 (UTC)

The Controversy between Bayesian and Frequentist Probability

This section is such a mess, I'm cutting it to here, so we can discuss it.

The theory of statistics and probability using frequency probability was developed by R.A. Fisher, Egon Pearson and Jerzy Neyman during the first half of the 20th century. A. N. Kolmogorov also used frequency probability to lay the mathematical foundation of probability in measure theory via the Lebesgue integral in Foundations of the Theory of Probability (1933). Savage, Koopman, Abraham Wald and others have developed Bayesian probability since 1950.
Note: Abraham Wald was not a Bayesian. He was a frequentist. But he showed that in decision-theory terms, any frequentist inference rule would be "inadmissable" - ie demonstrably sub-optimal - unless it was equivalent to a Bayes rule. Jheald (talk) 18:42, 19 March 2008 (UTC)
The epistemological difference between Bayesian and Frequentist interpretations of probability has no important consequences in statistical practice.[dubiousdiscuss] But in regards to the use of the term "Bayesian" in a mathematical sense, the practical difference is whether prior information is included in a calculation of a posteriori probability. Some Bayesians claim that non-Bayesian sampling statistics assume that one knew nothing of the thing being sampled prior to the sampling. But this claim is not true. Of course, the assumption is almost never true in the real world.[3] A frequentistic approach would be to consider the distributions of random variables representing all past sampling efforts as well as the present effort - summary statistics can be used instead of complete samples and the contributions could be represented by likelihood functions in parallel to the Bayesian approach. Bayesian analysis simply allows prior probabilities to be taken into account when interpreting observations. This may appear simpler than the frequentist approach but is essentially equivalent when "prior information" reflects past samples. The Bayesian approach has the hidden danger that the apparent simplicity of the formulae used means that an important underlying assumption is forgotten, specifically that the random quantities in the sample being analysed should be statistically independent of the information summarised by the prior distribution.
In fact, Bayesian analysis can even use prior information that was based on other statistical sampling methods and need not be associated with subjective methods at all. Furthermore, it can be shown that over a large number of trials, even a subjective "calibrated probability assessment" will agree with observed distributions (i.e. an "80% certain prediction" will be right 80% of the time).[4] This result means that Bayesian analysis, even when it is based on subjective probabilities, will agree with the frequentist's approach.[dubiousdiscuss] Finally, it must be noted that Bayes Theorem itself is mathematically proven and is not at all subjective. It is derived from axiomatic elements of probability theory and makes no reference to whether prior knowledge is subjective or based on a large number of observations. These points together largely blunt any practical difference between the two philosophical positions when applied to real statistical problems[dubiousdiscuss] and, in reality, it is not uncommon for statisticians to use both. However, there continues to be confusion about the use of the term Bayesian in regards to the epistemological position, and its practical use in statistics.[dubiousdiscuss]
The cluelessness here is simply extraordinary.
Firstly, probably the most important advantage of Bayesian methods in practice has nothing to do with priors. It is that Bayesian methods allow marginalization over the values of nuisance parameters. This opens up huge possibilities for realistic hierarchical models simply not available in Frequentist methods.
Secondly, these paragraphs discuss Bayesian methods as if the only thing they are relevant for is pseudocounts. Again, this is way off the mark. Bayesian methods give a systematic and probabilistically coherent method for setting up inferences of any quantities in a generative model. Compare that to Frequentist methods where often there would be no clue as to how to set up an appropriate model.
Intricate hierarchical models, with marginalization over nuisance parameters, are bread-and-butter for Bayesian MCMC engines. How would you even start to estimate such models in a Frequentist way?
Thirdly, "no important consequences in statistical practice" -- this is very blinkered nonsense. There are ample examples where Frequentist methods give either physically impossible answers, or (as per the example discussed in Confidence Limits section above) where Frequentist answers are hugely misleading. Compare that to a Bayesian answer, which really does give the best estimate of the combined posterior probability distribution, given the data and the model it's been fed.
Fourthly, "the apparent simplicity of the formulae used means that an important underlying assumption is forgotten". If you start from the generative model and Bayes theorem, nothing is forgotten. If you happen to have mangled some of your data into summary statistics (probably not the best idea, if it can be helped), and there are non-trivial dependencies, then the full Bayesian approach needs must include those dependencies, and the inference will have to take them into account. Starting from first principles: a lot more transparent than the black-art cookbook that is Frequentist statistics. Jheald (talk) 18:42, 19 March 2008 (UTC)
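(For readers unfamiliar with the marginalization point raised in the first point above, here is a minimal sketch on simulated data: build a joint posterior for a normal model's mean and standard deviation on a grid, then sum out the standard deviation as a nuisance parameter to get the marginal posterior for the mean. The data, grid ranges and the 1/σ prior are all assumptions made purely for illustration, not anything proposed for the article.)

```python
import numpy as np

rng = np.random.default_rng(7)
data = rng.normal(10.0, 2.0, size=20)      # simulated observations
n = data.size

mu = np.linspace(8.0, 12.0, 401)           # grid for the parameter of interest
sigma = np.linspace(0.5, 5.0, 400)         # grid for the nuisance parameter
SS = ((data[:, None] - mu[None, :]) ** 2).sum(axis=0)   # sum of squares as a function of mu

# log-likelihood on the (mu, sigma) grid, plus a 1/sigma prior (an assumption)
log_post = (-n * np.log(sigma)[None, :]
            - SS[:, None] / (2.0 * sigma[None, :] ** 2)
            - np.log(sigma)[None, :])
post = np.exp(log_post - log_post.max())
post /= post.sum()

post_mu = post.sum(axis=1)                 # marginalize out sigma
print((mu * post_mu).sum())                # posterior mean of mu, close to data.mean()
```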
Well yes, but some versions of "Bayesian" stuff suffer from the same cookbook effect. For example, the presently remaining part of the article says "Bayes's Theorem is explicitly or implicitly used to update the strength of prior scientific beliefs" without implying this may be difficult or require particular thought. Melcombe (talk) 10:56, 25 March 2008 (UTC)
More criticisms could without doubt be piled up about the section above (e.g. the extraordinary claim about "calibrated probability assessments" -- rather depends on who you ask, and what they're estimating, I suspect). But the more fundamental point is this: If we're going to have a section on the "controversy" (and we should), it should focus on the charges the leading participants on each side actually levelled, and it should be referenced (WP:RS). Random unsourced claims and attestations are not good enough.
To clear the ground to allow such a section to be built, it seems appropriate to cut all the existing material to here, and simply start again. Jheald (talk) 18:48, 19 March 2008 (UTC)
My thoughts are:
(i) perhaps it would be better to divert attention to some other existing article, possibly somewhere under "inference", with only a very brief discussion in the present article;
(ii) perhaps it would be good to start from where Bayesian and Frequentist ideas are in most agreement, specifically with the likelihood function.
Melcombe (talk) 10:56, 25 March 2008 (UTC)

I came to this article hoping to find enlightenment on what this Bayesian vs. Frequentist debate was about. The term 'frequentist' appears a couple times in the article, but without explanation apart from a link to an outside webpage. And this discussion page is full of debate, but since I don't know yet what the (alleged?) difference is, I can't make heads or tails of the debate.

Can someone please add an explanation to the article on what this debate is about? Or else start a new article on the differences? Mcswell (talk) 19:43, 25 April 2008 (UTC)

The word "probability"

"Probability" is a precise term in mathematics, isn't it? Though I'm not competent to do it, it seems to me that several occurrences of the word in this article ought to be replaced by "credibility," "likelihood," or "cogency," and that the semantic complications need further elucidation. The article suggests that there are disputes over what the word "probability" ought to mean, rather than advances in understanding which sorts of problems may be amenable to analysis and how to solve them. Unfree (talk) 17:55, 26 April 2008 (UTC)

Editing long lines

I've decided to alter somebody else's contribution to this discussion because it contains some very long lines which make it hard to read the page and navigate through it. Unfree (talk) 20:08, 26 April 2008 (UTC)

No, I haven't done it, but the problem appears to occur because of the presence of

style="white-space: nowrap;"

in the html source. I don't know how to fix it, and hope somebody else will. Unfree (talk) 20:17, 26 April 2008 (UTC)

Laplace and medical statistics

Hi! I don't believe that Laplace was active in the field of medical statistics. Who wrote that? Can he/she give a reference? --Gaborgulya (talk) 21:25, 25 June 2008 (UTC)

Here mentions his interest in medicine, but not how active he was aside from a few quotes. --Gwern (contribs) 23:26 25 June 2008 (GMT)

Incomplete sentence

In the "Varieties" section there is the sentence:

Other Bayesians state that such subjectivity can be avoided, and claim that each prior state of knowledge uniquely defines a prior for well posed problems.

I don't understand this sentence. There seems to be a word missing: defines a prior what? Maybe this is closer to what was intended:

Other Bayesians state that such subjectivity can be avoided, and claim that the state of knowledge is uniquely defined a priori for well posed problems.

But I'm not sure that's right. Hairy Dude (talk) 20:39, 3 December 2008 (UTC)

Ah... judging from use of the word 'priors' elsewhere in the article it seems it's being used as a noun, presumably with some technical meaning since to the layman "prior" is exclusively an adjective, but this meaning isn't given in the article so it's still incomprehensible. Hairy Dude (talk) 20:44, 3 December 2008 (UTC)

I fixed it - 'prior' stands for 'prior probability distribution'. Tomixdf (talk) 20:57, 3 December 2008 (UTC)

Bias towards Cox and Jaynes

I would submit that this article needs significant revision.

It seems to have been written by enthusiasts of Jaynes and Cox, but pays little attention to alternative views, and is obsessed with a binary division between "objective" and "subjective" Bayesian schemes.

C.f. Irving Jack Good on tens of thousands of Bayesian positions!

“46656 Varieties of Bayesians”, letter in The American Statistician 25 (Dec. 1971), 62-63. Kiefer.Wolfowitz (talk) 16:24, 3 June 2009 (UTC)

Statistical Science had a nice review of Bayesian statistics a couple years ago, where the leaders discussed the obvious problems with the objective methods for priors, e.g. for the multivariate normal distribution! Where are the references to mainstream Bayesian statistics? Kiefer.Wolfowitz (talk) 15:08, 30 May 2009 (UTC)Kiefer.Wolfowitz (talk) 18:38, 5 June 2009 (UTC)

You are making a large number of edits in a short time span without discussion. And in many cases the article is not getting any better by it. For example, replacing the title "Bayesian probability calculus" by "Bayesian updating using conditional probability" is not improving things! The subjective (de Finetti) versus the objective view (Jaynes, Jeffreys) is mentioned in many references. Why do you claim the article is "obsessed" by it? Surely this controversy deserves to be mentioned! If you feel that other views are missing, then why not simply add them? Finally, your edits have made the intro unintelligible to non-statisticians. Tomixdf (talk) 15:27, 3 June 2009 (UTC)

I've put the simple intro back. If you feel that other views need to be mentioned, or that the division between subjective/objective is too emphasized, let us discuss. Also, keep in mind that this is NOT an article about the philosophy of probability; let's keep things down to earth, and with a focus on methods and approaches that are used in practice. Tomixdf (talk) 15:42, 3 June 2009 (UTC)


REPLY:

  • Hypotheses can be combined into "composite hypotheses". Alternatives to 0/1 hypothesis testing include "selecting the best", etc. and such topics are introduced in intermediate books. Thus the restoration of the claim that "frequentists can only reject or accept hypotheses" restored nonsense.Kiefer.Wolfowitz (talk) 16:36, 3 June 2009 (UTC)

Kiefer.Wolfowitz (talk) 16:21, 3 June 2009 (UTC)

English Please

Could someone please insert a paragraph at the very top that explains to the non-mathematician/non-statistician what is usually meant when we encounter "Bayesian probability theory" in our general reading, without all the jargon? Something along these lines (which I paraphrased from http://math.ucr.edu/home/baez/bayes.html) would be nice.

The Bayesian view is that in calculating probability we start by assuming some probabilities, and then use these to calculate the likelihood of various events. We then do experiments to see if our predicted event occurs, and use the new data from our experiments to update our assumptions. The Bayesian interpretation provides some formulae for doing this. But it is all based on some assumed probabilities to begin with. This is known as the prior probability distribution - often abbreviated to prior. Subjective Bayesians see the choice of prior as inevitably subjective. Objective Bayesians look for rules to guide our choice of prior.

I understand this... but does it accurately capture the meaning?
Anthony (talk) 20:01, 2 June 2009 (UTC)

The suggested introduction is a vast improvement on the previous version, with its dichotomy of frequentist and Bayesian, etc. That said, "likelihood" should be reserved for the likelihood function in this context, not for the posterior.
Kiefer.Wolfowitz (talk) 17:14, 3 June 2009 (UTC)
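
For readers who want to see the formula the proposed paragraph alludes to, the updating rule in question is just Bayes' theorem; in a minimal form (the symbols H for a hypothesis and D for the observed data are only illustrative labels):

 P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)} \propto P(D \mid H)\, P(H)

That is, posterior is proportional to likelihood times prior: P(H) is the assumed starting probability and P(H | D) is the updated one after seeing the data.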

Distinguishing objective from subjective

"For the objectivist school, the rules of Bayesian statistics can be justified by desiderata of rationality and consistency. Such desiderata of rationality and constency are also important for the subjectivist school," - this potentially leaves the reader confused about how the two are different. For objectivists, probabilities are uniquely determined for certain well-defined problems given specific, well-defined background knowledge (as explained in the Cox and Jaynes refs). For subjectivists, rationality and consistency constrain the probabilities a subject may have, but allow for a lot of variation within those constraints (this is the position defended by Colin Howson). This is because objectivists interpret consistency and rationality as encompassing more than just probability but also additional principles such as Maximum Entropy. MartinPoulter (talk) 15:51, 3 June 2009 (UTC)

I agree. The article has recently deteriorated badly due to a large number of poor and confusing edits. Let's get things back on track. Will you fix this issue? Tomixdf (talk) 15:57, 3 June 2009 (UTC)

I'd like to but it would be dishonest for me to promise I'll get around to it, given other areas of WP I'm working on. I approve of your edits and I'd be happy for any editor to adapt what I've written above and fit it in.MartinPoulter (talk) 16:51, 3 June 2009 (UTC)
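
For readers who have not met the Maximum Entropy principle mentioned at the start of this section, a one-line sketch (notation chosen here purely for illustration): among all distributions consistent with whatever is actually known, the objectivist picks the one maximizing the entropy

 H(p) = -\sum_i p_i \log p_i, \qquad \text{subject to } \sum_i p_i = 1 \text{ and any known constraints.}

With no constraints beyond normalization, the maximizer on a finite set is the uniform distribution; this is how that school arrives at a unique prior where a subjectivist would allow many.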

Introduction

My comments on Talk:Bayesian_probability/Archive_2 still pertain, unchanged, under "Painting a picture of too much conflict?". I think the current article, as before, paints a picture of there being more conflict between the Bayesian and frequency interpretations of probability than actually exists. It also paints them as beliefs (almost like a religion) that you "accept" or "don't accept"...which is totally different from how an overwhelming majority of statisticians view this. And even mentioning two "schools" of Bayesian thought in the introduction seems inappropriate; these are fairly esoteric and philosophical and I think belong in a subsection (even though I personally find them very interesting). As for us statisticians, I have figures to back this up. While the term "Subjectivist school" has been used in the literature: [2] it is hardly mainstream (48 hits, while "Bayesian School" gets 422: [3], compared to 8270 hits for "Bayesian Perspective": [4] and 4,100 hits for "Bayesian interpretation": [5]). I think these numbers would convince almost everyone that, to an overwhelming majority of statisticians, Bayesian probability is a perspective or interpretation more than a school of thought. I think the notion of "schools of thought" exists in the context of a specialized philosophical/intellectual debate, and belongs in a section about this debate, not the introductory section of this article. Cazort (talk) 14:36, 4 June 2009 (UTC)

See reference 13 for a very different view of this issue by Bernardo. Also, many new Bayesian techniques are now developed by the machine learning community (Minka, Bishop, Ghahramani, Hinton, ...). In practice, there is a large gap between the frequentist and the Bayesian community. Many of the Bayesians are, however, found outside the classical statistics departments (typically computer science, bioinformatics and physics), so maybe that gap is not always so clear within statistics. But we can include a referenced sentence where it is stated that some do not perceive this gap, of course. Tomixdf (talk) 17:42, 4 June 2009 (UTC)
You pointed out that Bayesian statistical techniques are generally developed by different people from those developing frequentist techniques. I completely agree--this is just the specialized nature of academic research. But this does not establish that they are competing "schools of thought". As I pointed out in my archived comments with the examples of the two classic textbooks, even people who strongly favor one perspective over the other do acknowledge the purpose and importance of the other perspective. They are two different ways of viewing probability; they have different uses, applications, strengths and weaknesses, and nowadays any statistician is trained in and exposed to both perspectives. Some choose to focus on one or the other, and they might even believe very strongly that it's the correct way of viewing/approaching certain types of problems--but that doesn't mean they view it as "the correct way" of looking at things overall. People keep editing the page to make it seem like they are almost religious sects or something. Such editing is not representative of the consensus in the field of statistics, and is not constructive to creating a quality page on this topic. Cazort (talk) 19:51, 4 June 2009 (UTC)
Interesting! In the machine learning field the situation is not at all like that, IMO. Frequentist methods are frowned upon, and there are many books, slides and articles around that point out the benefits of the Bayesian view. I'm starting to realize that this might look completely different from the POV of statistics, where Bayesian methods are just part of the tool box. This is an interesting fact, and I think we should add it to the page in some way (if we find this in print somewhere). OTOH, it's clear from the sometimes heated comments on this talk page that this is a sensitive topic. Let's keep our heads cool and make this an error-free yet accessible page! Tomixdf (talk)
Yes! I have seen the same patterns when it comes to machine learning. I've also seen arguments that Bayesian methods more closely resemble the sort of reasoning used by the human mind. On the other end of the spectrum, although there is such a thing as Bayesian linear regression (that page has serious accessibility issues, if anyone wants to tackle it, BTW), the overwhelming majority of regression used in most applications is based on the frequentist framework. Is this historical? Or are there reasons that Bayesian regression is less practical? Or is it that it is harder to wrap your mind around the theory of it? Or does it have to do with there being less developed support for it in most software packages? The answer is hardly clear-cut. I think painting a picture of two competing "schools of thought" glosses over all these subtleties...there are so many different arguments for why people use (or should use) a particular interpretation in a particular context. I think that we need to limit the article to such specific arguments and avoid sweeping generalizations. Cazort (talk) 00:23, 5 June 2009 (UTC)
Again, if there is (a) a feeling in the statistics community that Bayesian and frequentist methods are on equal footing and (b) a feeling in the ML/physics community that Bayesian methods rule supreme, then we should mention both situations, and not just pick one. See Bishop's "Pattern recognition and machine learning" for an example of the ML attitude, or Jaynes' "The logic of science" for a common view in physics. There's very little 'ecumenical' attitude there. And again, much of the research in Bayesian methods comes from outside of the statistics community. Tomixdf (talk) 08:23, 5 June 2009 (UTC)
I don't think it's so much that they're on an equal footing, as it is that they're not necessarily "competing" ideas...just different ways of looking at things. The more "ecumenical" approaches I have encountered mainly come out of mainstream grad-level stat textbooks I've read, incl. Lehmann & Casella's Theory of Point Estimation and J. O. Berger's decision theory book. Lehmann & Casella is very strongly focused on frequentist estimators, but it still acknowledges the Bayesian paradigm...and Berger is kinda at the opposite end of things, advocating very strongly for the merits of the Bayesian approach, but still acknowledging and sometimes using the frequentist one. Cazort (talk) 15:37, 5 June 2009 (UTC)

Comments

  1. APPLAUSE (to the ecumenical comment by Cazort)! Indeed, many of my comments had been made before by participants in the archive (especially about the false dichotomy between Bayesian and "orthodox"). Kiefer.Wolfowitz (talk) 19:11, 4 June 2009 (UTC)
  • In this article, Ronald A. Fisher is called frequentist.
    • Fisher preferred H. Jeffreys over frequentism (which he often criticized, as documented in Savage's classic "On Rereading R. A. Fisher");
    • Fisher also used Bayesian methods (Box & Tiao, page 18; make that page 12. Kiefer.Wolfowitz (talk) 18:50, 4 June 2009 (UTC)).
  • The "sampling-distribution" statistician Jerzy Neyman and his followers used Bayesian methods when warranted.
    • Kiefer proved admissibility of many multivariate procedures using priors and Wald's result, as documented in Giri's multivariate book.
    • Le Cam uses them in his book on asymptotics.


Perhaps a page Bayesian methods in frequentist statistics would be useful, to review such material for those unfamiliar with this usage? Jheald (talk) 15:30, 4 June 2009 (UTC)
About Fisher being called a frequentist: see reference 8, where frequentist methods are clearly attributed to Fisher (and others such as Neyman and Pearson). Tomixdf (talk) 17:47, 4 June 2009 (UTC)

REPETITION: Again, please examine the source I indicated before (page 461 has the precise citations to Fisher).

(Sir) David R. Cox spends so much time asserting that he thinks it's important to calibrate confidence intervals in terms of repeated-sampling frequencies because it's heretical: anti-Fisherian (post Neyman). Look at his recent book on statistical principles. (Of course, Fisher used sampling probabilities at some time. All statisticians do. If you have a nontrivial random vector with a mean, then there's the law of large numbers. So every probabilist and statistician likes frequencies!) Kiefer.Wolfowitz (talk) 18:35, 4 June 2009 (UTC) Kiefer.Wolfowitz (talk) 18:44, 5 June 2009 (UTC)

Asymptotic Justification of Bayesian inference

As the data increase, the posterior is dominated by the likelihood of the data (under regularity conditions), according to asymptotic theory going back to Laplace and to Bernstein and Richard von Mises. Thus, objective truth and consensus are attained by Bayesian methods (more precisely than intimated by Peirce or Habermas).

In the next few weeks, I'll try to write a paragraph on this justification, which reassures even "subjective" Bayesians that (objective) truth will win out (regardless of initial ignorance). Kiefer.Wolfowitz (talk) 19:06, 5 June 2009 (UTC)
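
A compressed statement of the result being invoked above, for orientation (the notation is only illustrative, and the usual regularity conditions are assumed): for i.i.d. data the posterior is approximately normal around the maximum-likelihood estimate,

 \pi(\theta \mid x_1, \ldots, x_n) \approx \mathcal{N}\!\left(\hat{\theta}_n, \; \tfrac{1}{n} I(\hat{\theta}_n)^{-1}\right),

where \hat{\theta}_n is the maximum-likelihood estimate and I the per-observation Fisher information, so the influence of any sufficiently open-minded prior washes out as n grows.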

Frequentist statistics and hypotheses

Many hypotheses are either true or false, so their objective probabilities are either one or zero. Thus, according to frequentists, it is confusing to talk about nontrivial probabilities for such hypotheses, and this point deserves mention early in the article (for the sake of neutrality and fairness). Kiefer.Wolfowitz (talk) 19:11, 5 June 2009 (UTC)

I don't think this is a helpful way to phrase it for people who don't already understand the issues. There's a big question-begging jump about the meaning of probability in your "so" above.MartinPoulter (talk) 22:52, 5 June 2009 (UTC)
I don't get this. What are "objective probabilities" and "nontrivial probabilities"? How does the fact that a hypothesis is true or false lead to a probability that is either 0 or 1? That does not make sense at all. Tomixdf (talk) 12:17, 6 June 2009 (UTC)
On a related note: "One of the crucial features of the Bayesian view is that a probability is typically assigned to a hypothesis, whereas under the frequentist view, a hypothesis is typically rejected or not rejected without directly assigning a probability." In which cases does a frequentist use another method, actually? And what method? Tomixdf (talk) 12:17, 6 June 2009 (UTC)

Dutch book again

"The importance of such Dutch book arguments is limited, since non-Bayesian agents have no more need to gamble with Dutch bookies than they have to indulge in other problem gambling." This sounds like a POV, and in any case lacks a reference. Also, IMO it's not clear at all what point is being made here. That one can endorse any arbitrary probability calculus just by picking the right gambling problem? Tomixdf (talk) 11:45, 6 June 2009 (UTC)

REPLY

Declarative: Why would a rational agent bet when he/she would surely lose? A discussion is in van Fraassen's Laws and Symmetry, which I don't have handy today (although many academic libraries have access to the electronic version at Oxford UP). Again, the editor presents us with a false dichotomy, since there are non-gambling justifications of Bayesian and other updating rules. Rhetorical: The Wikipedia guidelines discourage colorless writing, and the fact that the relevant link provides some readers with humor is insufficient reason for removing it, imho, since the link strengthens the impact of "problem gambling". —Preceding unsigned comment added by Kiefer.Wolfowitz (talkcontribs) 13:38, 7 June 2009 (UTC)

The link to problem gambling leads me to conclude that this is some kind of joke. I'm taking it out. Tomixdf (talk) 11:46, 6 June 2009 (UTC)

The literature on "probability kinematics" and the specific reference to van Fraassen sufficed for references. (The Howson & Urbach book also discusses probability kinematics.)
The editor is once again leaving this article one-sided, keeping in only one account as a justification for Bayesian methods, when there are many more methods available with the no-Dutch-book property. Kiefer.Wolfowitz (talk) 14:03, 7 June 2009 (UTC)
The recent restorations are satisfactory, and I thank the editor for consideration. I shall strive to find a page reference in van Fraassen in the next week, or supply an even better specific page reference. Kiefer.Wolfowitz (talk) 17:36, 7 June 2009 (UTC)
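
For readers who have not met the argument being debated above, here is a minimal numerical sketch (in Python, with invented numbers) of what a Dutch book is: an agent whose betting quotients on an event and its complement sum to more than one can be sold a pair of bets, each fair by the agent's own lights, that together guarantee a loss.

 # Minimal Dutch book sketch; the quotients and stake are purely illustrative.
 # The agent's betting quotients on A and on not-A are q and r, with q + r > 1,
 # so the agent treats price q (resp. r) as fair for a bet paying 1 if A (resp. not-A) occurs.
 q, r = 0.6, 0.6      # incoherent: q + r = 1.2 > 1
 cost = q + r         # the agent buys both unit bets at its own "fair" prices
 for a_occurs in (True, False):
     payout = 1.0     # exactly one of the two bets pays off, whatever happens
     print(f"A occurs: {a_occurs}, agent's net: {payout - cost:+.2f}")
 # Prints -0.20 in both cases: a sure loss, which is what makes the book "Dutch".

The converse direction (coherent quotients admit no such book) is the substance of the Dutch book theorem; the dispute above is about how much philosophical weight that fact can bear.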

Wald again

"Wald's result also established the Bayesian formalism as a fundamental technique in frequentist statistics." None of the provided references actually state this explicitly. This again is POV, it seems to me. My question is: did Wald's results establish the Bayesian formalism in ALL of frequentist statistics? If so, then why is hypothesis testing still the ruling paradigm in frequentist statistics (for example)? It seems to me that Wald's results are accepted by the frequentist community in the narrow area of frequentist decision theory. Tomixdf (talk) 12:01, 6 June 2009 (UTC)

I agree (and I am a confirmed Bayesian). What Wald showed was that there is a close correspondence between Bayesian and frequentist decision theory. But certainly this result does not affect frequentist hypothesis testing (for example). While I personally regard frequentist hypothesis testing as incoherent (from my Bayesian point of view, of course), it is nonetheless widely used by frequentists, and thus is an important paradigm in frequentist theory that can't be brought into a Bayesian context.
This doesn't mean that I don't think that frequentists ought to think about Wald's result very deeply. I think that it challenges the underpinnings of frequentist hypothesis testing. Bill Jefferys (talk) 23:08, 6 June 2009 (UTC)
I've updated to "some areas of frequentist statistics". We can discuss of course. Tomixdf (talk) 07:40, 7 June 2009 (UTC)
I would suggest that you name an area of statistical inference that cannot fit inside Wald's theory, before narrowing the scope of the statements. Again, intermediate textbooks (patiently) explain how "hypothesis testing" is only a small part of the inferential practice covered by statistical decision theory. Please consider SELECTION problems, for example, which I have mentioned as an explicit counter-example to the pigeon-holing of this article and on this TALK page many, many times.
Again, I repeat the reference to Giri's book on multivariate analysis, which refers to Kiefer's proofs of admissibility of multivariate procedures by exhibiting a prior (sometimes unnatural). Once admissibility is proven, there's no need to exhibit priors to civilians. —Preceding unsigned comment added by Kiefer.Wolfowitz (talkcontribs) 13:49, 7 June 2009 (UTC)
The suggested edit is reasonable (if not optimal, imho), and so I consider this matter closed. Good work, editor! (I would recommend deleting this TALK section, after a week of inactivity.) Kiefer.Wolfowitz (talk) 17:33, 7 June 2009 (UTC)
I believe that it is contrary to the rules and spirit of WikiPedia to delete sections on the talk page of an article. Eventually it will get archived, but a record of how the article came to be as it is is very important. I oppose deletion. Bill Jefferys (talk) 20:13, 10 June 2009 (UTC)
I thank the experienced editors for alerting me about Wikipedia archiving practice. (I wanted merely to help new readers find the current discussion points.) I resist the impulse to delete my earlier suggestion! Thanks, Kiefer.Wolfowitz (talk) 14:43, 11 June 2009 (UTC)

In the footnote, I gave citations to articles on point-estimation, hypothesis-testing, and confidence-interval estimation for the multivariate normal and for exponential families. The footnote only exhibits Bayes-admissibility (as claimed) for the most important parametric families. (The article makes no claim that admissibility exhausts frequentist statistics, e.g. Tukeyian exploratory data analysis.) Kiefer.Wolfowitz (talk) 16:11, 9 June 2009 (UTC)

What you have written now makes it sound like frequentists are routinely using Bayesian methods in hypothesis testing, which is not the case. I do not dispute the significance of Wald's results at all, but I do question your evaluation of their effect and impact on the routine practice of frequentist statistics. This just seems to be your POV. The references you provided do not address this concern AT ALL. Tomixdf (talk) 09:50, 10 June 2009 (UTC)

To prove that a frequentist procedure is admissible, it is standard to exhibit a prior for which the procedure is Bayes, which is what the textbook references state and what the journal articles prove, for many bread-and-butter procedures of statistics, e.g. for the multivariate normal and for exponential families. My paragraph does not state that the many practitioners who use such methods know about their Bayesian/admissible justifications.
Similarly, physicists, chemists and electrical engineers use procedures and ideas from quantum mechanics without being able to participate in theoretical discussions of Hilbert spaces. I believe that physicists no longer argue about matrix-mechanics or wave-equation versions of quantum mechanics, since von Neumann resolved both theories in 1927 (or so).
Similarly, Wald resolved the decision-theoretic and Bayesian disputes in 1943 (and later publications), a resolution recognized by the serious textbooks on statistics (e.g. by David Cox, who is regarded as a Bayesian in accounting but rarely in statistics . . . .).Kiefer.Wolfowitz (talk) 16:18, 11 June 2009 (UTC)
I do not believe that Neyman-Pearson hypothesis testing of a point null hypothesis can be brought into a Waldian decision-theoretic framework. The results of Berger and Delampady, and of Berger and Sellke, would seem to indicate this. Bill Jefferys (talk) 20:13, 10 June 2009 (UTC)
This would be relevant if I wrote that ALL Neyman-Pearson procedures could be recast in Wald's form, or in the form of Le Cam et al. Kiefer.Wolfowitz (talk) 16:18, 11 June 2009 (UTC)
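
For readers unfamiliar with the decision-theoretic terminology in this section, a compressed (and necessarily informal) gloss on what "exhibiting a prior" accomplishes: given a risk function R(\theta, \delta), the Bayes risk of a rule \delta under a prior \pi is

 r(\pi, \delta) = \int_\Theta R(\theta, \delta)\, \pi(d\theta),

and \delta is a Bayes rule for \pi if it minimizes this quantity. A rule that is the essentially unique Bayes rule for some prior (with finite Bayes risk) is admissible, and Wald-type complete class theorems say, roughly and under regularity conditions, that admissible rules are Bayes rules or limits of Bayes rules; this is why exhibiting a suitable prior is a standard route to the admissibility proofs cited above.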

Introduction

The introductory paragraph should accurately define the subject in terms the general reader can understand. It has the following vague or opaque (to the general reader) phrases:

probability as "a measure of a state of knowledge"
requirements of rationality and consistency
a hypothesis is typically rejected or not rejected

Each requires further reading to grasp its meaning in this context.

Tommaso Toffoli, reviewing The Logic of Science (E. T. Jaynes. Edited by G. Larry Bretthorst. Cambridge University Press, 2003) for an informed readership in American Scientist, adds just a few words to the first, rendering it, to my mind, slightly more understandable: "A probability is thus a measure of a state of knowledge and may change as this state is updated..." I followed the link to "requirements of rationality and consistency" and it took me to Cox's theorem. You must be joking. The link on "rejected or not rejected" took me to Statistical hypothesis testing, an opaque 3000-word essay.

For now, I'm pasting back the paragraph Tomixdf removed, and shall continue to do so until one of you comes up with something better in terms of accessibility and accuracy. If my paragraph is inaccurate, correct it. If it can be said more clearly, please do so. But it had better be in layman's language.
Anthony (talk) 00:37, 7 June 2009 (UTC)

I tried to come up with a very simple introduction; let's discuss if you think it needs more work. OTOH, I don't think the three lines you quote are that inaccessible to the general public.
Tomixdf (talk) 07:17, 7 June 2009 (UTC)

Beautiful, Tomixdf. Thank you.
Anthony (talk) 19:13, 8 June 2009 (UTC)

The introduction's definitely improving. Couple of suggestions: Kiefer has made a good case against describing objectivists and subjectivists as "schools". How about "For the subjectivist school, the state of knowledge corresponds to a 'personal belief'." -> "For subjectivists, probability corresponds to a degree of personal belief" or "In the subjectivist approach...". Maybe even "measures" rather than "corresponds to". MartinPoulter (talk) 11:06, 9 June 2009 (UTC)

"One of the crucial features of the Bayesian view is that a probability is typically assigned to a hypothesis, whereas under the frequentist view, a hypothesis is typically rejected or not rejected without directly assigning a probability." I'm still wondering when frequentists adopt another method. Tomixdf (talk) 14:47, 9 June 2009 (UTC)

I've answered this question many times:

Hypotheses can be combined into "composite hypotheses". [Casella-Berger discuss union-intersection tests, which are important in multivariate analysis. Kiefer.Wolfowitz (talk) 15:53, 9 June 2009 (UTC)]

Alternatives to 0/1 hypothesis testing include "selecting the best", etc. and such topics are introduced in intermediate books. Kiefer.Wolfowitz (talk) 16:36, 3 June 2009 (UTC)

Kiefer.Wolfowitz (talk) 15:53, 9 June 2009 (UTC)


I desire a graphical introduction using a binomial distribution. Can anybody recruit a Java programmer to make an applet or JavaScript page that would let somebody experiment with various beta priors and with binomial parameters, for a few different sample sizes? I would suggest Beta(0.5, 0.5), Beta(1.0, 1.0), and Beta(2.0, 2.0).
(This summer, I could write R-code and prepare graphic files, but we really could use an interactive explanation).
The article should flag the improper priors coming from Wald-style limiting Bayesian priors and from Jaynes-style maximum-entropy priors (when they exist). Calling such measures "probabilities" is a bit improper!
Kiefer.Wolfowitz (talk) 17:51, 7 June 2009 (UTC)
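
Pending the requested applet, here is a minimal non-interactive sketch of the comparison being proposed (in Python rather than R or Java, with an invented data set of 7 successes in 20 trials), using the conjugate update Beta(a, b) -> Beta(a + successes, b + failures):

 # Conjugate beta-binomial updating for the three beta priors suggested above.
 # The data (7 successes in 20 trials) are invented purely for illustration.
 successes, trials = 7, 20
 failures = trials - successes
 priors = [(0.5, 0.5), (1.0, 1.0), (2.0, 2.0)]   # Jeffreys, uniform, mildly informative
 for a, b in priors:
     a_post, b_post = a + successes, b + failures
     mean = a_post / (a_post + b_post)
     print(f"Beta({a}, {b}) prior -> Beta({a_post}, {b_post}) posterior, mean {mean:.3f}")
 # Even with only 20 observations the three posterior means (0.357, 0.364, 0.375)
 # differ by less than 0.02, showing the data starting to dominate the prior.

An interactive version would of course plot the full posterior beta densities rather than just their means, and let the reader vary the sample size.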

Going the wrong way again

I'm extremely unhappy with the latest slew of edits again. The editor does not seem to be interested in reporting existing views with proper references but clearly pushes an agenda, which seems to consist of (a) downplaying the ideas of Cox and Jaynes and (b) representing frequentist and Bayesian statistics as one big happy family where everybody agrees. On top of that, the length of the footnotes is getting ridiculous and the tone is not AT ALL suitable for Wikipedia. Tomixdf (talk) 07:45, 13 June 2009 (UTC)

Tomixdf removed the reference where Fisher used Bayesian methods, and where Neyman used Bayesian methods, etc. In general, it is true that statistics is now a happy family (despite the Pearson-Fisher-Neyman dysfunction), where people use different methods for different problems and where statisticians are judged according to their results (rather than their conformity to catechism). Thus, everybody pays attention to David Cox or Efron or Lauritzen or Rubin, regardless of their Bayesian or non-Bayesian leanings.
If you can find a reference that discusses this view (i.e. the happy-family thing), we should add it (otherwise it's just your POV). I can then add some references that promote a very different view (Bernardo or Jaynes, for example). Again, multiple views are fine! Tomixdf (talk) 13:18, 14 June 2009 (UTC)

Even Fisher is being represented as a Bayesian! The arguments between Jeffreys and Fisher are legendary. At the end of his life Fisher pushed fiducial probability (and claimed that Bayes followed similar ideas), but he was always an acrid critic of Laplace and all forms of 'inverse probability'. See the article R. A. Fisher on Bayes and Bayes' Theorem for a clear discussion of these issues. I'll incorporate this in the article in the coming days. Tomixdf (talk) 07:45, 13 June 2009 (UTC)

Fisher did use Bayesian methods and Fisher did support Bayesian statistics, e.g. with George Box, etc. - two propositions Tomixdf denied here or flagged on the article as needing references. When I provide the references, Tomixdf deletes everything.
You did not AT ALL provide a reference that states "Fisher did support Bayesian statistics". You just provided some technical articles that you base your POV on. But in Wikipedia, your POV is irrelevant. Take a look at my reference (R. A. Fisher on Bayes and Bayes' Theorem) to understand what I mean. This is a direct discussion of the relevant matter, not some excuse reference that I am using to back up a POV. So again: please provide a reference that DIRECTLY states "Fisher did use Bayesian methods and Fisher did support Bayesian statistics", and ADD that as another properly referenced view of the topic. It's quite OK to mention conflicting views, BTW, something you don't seem to understand. Tomixdf (talk) 13:11, 14 June 2009 (UTC)
Nobody denies that Fisher disliked Bayes with automatic priors, but this lumpen-article previously lumped Fisher together with Neyman-Pearson as frequentists. Fisher repeatedly rejected "frequentism" and (in the reference given) stated that he preferred Jeffreys over Neyman-Pearson. This material corrected the raw false dichotomy cooked up by Tomixdf, who then deleted the counter-examples to the false claims he presented.
Describing frequentist statistics as an amalgam of Fisher/Pearson/Neyman methods is an almost VERBATIM quote from the referenced article (which is peer reviewed and published in a reputable journal). You say Fisher preferred Jeffreys over Pearson (I actually agree with this). Fine, that can be added. So just find a ref and add it!!! Tomixdf (talk) 13:11, 14 June 2009 (UTC)
Fisher's preference of Jeffreys over Neyman-Pearson appears in the page reference in the Box-Fisher biography, which I gave before and which you removed.
Regarding Aldrich's article: Fisher's abuse of Laplace is accurately quoted, but the habit of parroting Fisher's abuse of his betters is beyond me. [FYI: Aldrich's article makes claims that are inconsistent with (and do not discuss the evidence in) previous articles published by Hald, Pratt, and Stigler, particularly on Edgeworth's contributions, some of which were published in Statistical Science or less discursive journals (e.g. Annals). Thus, an educated reader should treat claims from Aldrich somewhat carefully.] Kiefer.Wolfowitz (talk) 14:08, 15 June 2009 (UTC)
One published article in a peer-reviewed journal does not license any of us, even Tomixdf, to paraphrase controversial POV without any qualification or justification. See the Sokal Affair, for example. Kiefer.Wolfowitz (talk) 11:11, 18 June 2009 (UTC)
The article is recent, peer reviewed and published in a reputable journal, and contains many relevant references to back up its statements. The article is not controversial at all - that is just your POV because you have a strong agenda and only want to see your views expressed in the article. If you have a reference that calls the views expressed in (Aldrich, 2008) "controversial" and "POV without any qualification or justification" (despite peer review and ample references), then please go ahead and add it as a contrasting opinion! Tomixdf (talk) 15:48, 18 June 2009 (UTC)