
Talk:Dunning–Kruger effect

Good article: Dunning–Kruger effect has been listed as one of the Social sciences and society good articles under the good article criteria. If you can improve it further, please do so. If it no longer meets these criteria, you can reassess it.

Article milestones
Date | Process | Result
February 23, 2023 | Good article nominee | Not listed
June 27, 2023 | Peer review | Reviewed
August 23, 2023 | Good article nominee | Listed

Did You Know: A fact from this article appeared on Wikipedia's Main Page in the "Did you know?" column on September 4, 2023.
The text of the entry was: Did you know ... that "the first rule of the Dunning–Kruger club is you don't know you're a member of the Dunning–Kruger club"?
Current status: Good article

DK Effect is Simply Autocorrelation

The article otherwise largely fails to communicate the degree to which the DK effect is pseudoscience. The statement "the statistical explanation interprets these findings as statistical artifacts" needs to be expanded and made much more prominent to explain why the effect is simply autocorrelation and should not be the basis for any cognitive or metacognitive claims despite its appeal. The autocorrelation claim is easy to understand and should be a convincing argument for changing the first paragraph to make clear that while the concept is appealing, it is not based on a valid statistical methodology and should not be taken too seriously.

The Dunning–Kruger effect is a claimed cognitive bias[2] whereby people with low ability, expertise, or experience regarding a certain type of task or area of knowledge tend to overestimate their ability or knowledge. Some researchers also include in their definition the opposite effect for high performers: their tendency to underestimate their skills. Despite its intuitive appeal, the effect can be statistically explained as autocorrelation[1] and should not be used to formally explain cognitive biases or metacognitive phenomena regarding self-evaluation of ability, knowledge, or experience.

Simply put, for any sample of test scores on a 0–10 scale, the likelihood that someone who scores 0 will overestimate their performance is necessarily higher than for someone who scores 10. The reverse is also true: anyone who scores 10 will necessarily underestimate their performance more than someone who scores 0.
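
A minimal sketch of that floor-and-ceiling arithmetic (a toy simulation of my own, not taken from any of the cited papers; the variable names and numbers are purely illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    # Scores and self-estimates are drawn independently on a 0-10 scale,
    # so self-assessment carries no information about ability at all.
    score = rng.integers(0, 11, n)
    estimate = rng.integers(0, 11, n)

    # Mean (estimate - score) at each score level: someone at 0 can only
    # overestimate, someone at 10 can only underestimate.
    for s in range(11):
        err = (estimate[score == s] - score[score == s]).mean()
        print(f"score {s:2d}: mean error {err:+.2f}")

Even with completely uninformative self-estimates, the mean error runs from about +5 at a score of 0 down to about -5 at a score of 10, purely as a consequence of the bounded scale.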

The ironies are replete, as pointed out in the article: "there is a delightful irony to the circumstances of their [Dunning and Kruger's] blunder. Here are two Ivy League professors arguing that unskilled people have a ‘dual burden’: not only are unskilled people ‘incompetent’ ... they are unaware of their own incompetence. [...] In their seminal paper, Dunning and Kruger are the ones broadcasting their (statistical) incompetence by conflating autocorrelation for a psychological effect."

The popularity of the DK effect may be an interesting study in how bad science can take hold in the popular mind given how many people seem to take it seriously without considering the fatal flaws in the methodology used to identify the alleged phenomenon. DK also serves as an example of how bad science can get through the scientific peer review process, especially if it comes from a highly reputable institution. --Chassin (talk) 16:10, 7 January 2023 (UTC)[reply]

This has already been discussed several times. See, for example, Talk:Dunning–Kruger_effect/Archive_5#Neutral_Point_of_View and Talk:Dunning–Kruger_effect/Archive_5#Is_this_true?. For details on the criticisms of the Dunning–Kruger effect, see the section Dunning–Kruger_effect#Criticism_and_alternatives in our article. And your source, https://economicsfromthetopdown.com/, is not reliable. Phlsph7 (talk) 16:35, 7 January 2023 (UTC)[reply]
Other sources may indeed be preferred, such as Nuhfer et al. (2016). The outcome of previous discussions notwithstanding, the fact remains that the opening paragraphs of the introduction still fail to offer caveats regarding flaws in the empirical methods used to support the claims, and instead tend to give credence to it based on its broad application and intuitive appeal. If anything, the only effect Dunning claims to have identified is the tendency for everyone to overestimate their ability, not just people with low ability, and certainly not for those with high ability to underestimate theirs. The first paragraph is misleading in multiple respects and should be revised to address these shortcomings.
The first mention of criticism is in paragraph 4, which characterizes it as "debate" and dismisses it as "not denying the empirical findings", when that is precisely what the statistical criticism does, unequivocally. If we compare how criticism of this topic is handled with the pseudoscience of physiognomy, we can clearly see in the opening of the latter's second paragraph that it is "[poorly] regarded among academic circles because of its unsupported claims; popular belief in the practice of physiognomy is nonetheless still widespread". The same can be said of the DK effect insofar as the shapes of people's heads differ and people self-evaluate inaccurately, but the differences in head shapes and the self-evaluation errors do not provide the necessary support for their respective theories regarding ability, expertise, or experience. On the contrary, the empirical data, when analyzed correctly, falsifies them both equally, which in the final analysis is the only method we have at our disposal. --Chassin (talk) 19:58, 7 January 2023 (UTC)[reply]
"the empirical data when analyzed correctly falsifies them both equally". Which reliable source says so? Paradoctor (talk) 20:25, 7 January 2023 (UTC)[reply]
See https://fortune.com/2023/05/08/what-is-dunning-kruger-effect-smart-intelligence-competence-john-cleese/ 77.204.39.138 (talk) 17:49, 8 June 2023 (UTC)[reply]
@Chassin Perhaps the article itself, as well as the studies, are proof enough, not because of, but perhaps in spite of, what they thought they studied. Dagelf (talk) 07:43, 9 March 2024 (UTC)[reply]

I looked into this; the archived discussion doesn't seem particularly convincing on why dissent shouldn't be mentioned here. The scientific articles quoted by Fix seem rather convincing (if not damning) on the maths. But I get that they haven't been cited as often as the Dunning–Kruger article they're pointing at. I'm pretty sure I can't get away with AfDing the article or something crazy like that. But... I do think that NPOV allows me to put the countervailing point of view that Dunning and Kruger's paper is bad because (the given sources claim) they messed up their maths. --Kim Bruning (talk) 00:34, 27 November 2023 (UTC) (Even if they didn't mess up their maths, they definitely did maths in a way that has been confusing to skilled scientists. They may have ended up confusing themselves; this seems plausible based on the cited sources. Either way, not Wikipedia's battle: but for sure we can write that not everyone thinks the effect is real!) [reply]

So I might be a little rusty. What are the exact policy reasons for removing each of the sources? The published papers demonstrate that you can get the Dunning–Kruger graph from random noise (oops). The web source verifiably summarizes the papers, and thus can usefully be seen as a secondary source. Usually when people actually dig in and read sources, they also take a minute of extra time to post their findings on the talk page (or link to where it was previously discussed). But I'll go read them again just to be sure: did I miss anything? --Kim Bruning (talk) 11:23, 27 November 2023 (UTC)[reply]

Annnd... just came back from reading the papers. Nuhfer 2017 in particular concludes with: "Because of insufficient attention to Numeracy, current prevalent explanations of the nature of human self-assessment seem to rest on a tenuous foundation."

Due to the replication crisis in (among others) psychology, we're likely to see many papers like these going forward. Maybe I'm late to this party: is there a standing Wikipedia policy when it comes to failed replications or methodological flaws? Otherwise I'd just apply NPOV and at least report that issues have been raised with a particular study. (Whether the report is correct or not is a different story, but it got published, so we can say that it has been and by whom.) --Kim Bruning (talk) 11:40, 27 November 2023 (UTC)[reply]

Hello Kim Bruning and thanks for your attempts to improve this article. I agree that it can be challenging to arrive at a balanced overview of the academic literature on the Dunning-Kruger effect.
The lead section is supposed to summarize the body of the article. If you want to add new content to the article, it's usually best to add it to the body first. If it is accepted, a later step might be to consider whether the new content is important enough to get a short mention in the lead section; see WP:LEAD.
If you want to add some content to the body of the article, you should make sure that it is based on reliable sources, see WP:RS. For example "economicsfromthetopdown.com" is not a reliable source for the Dunning-Kruger effect. The body of the article already discusses the statistical explanation. So it might be best to read through it first and familiarize yourself with the sources cited there before adding new text to it. Phlsph7 (talk) 12:56, 27 November 2023 (UTC)[reply]
Ah, our posts crossed, yup I was doing that.
On a third rereading of the article alongside the sources, we actually do have a section called Statistical. The actual papers say it's a statistical artifact, all right. In plain English, they're saying "there's actually no effect".
I sort of get why Gignac 2020 gets cited throughout: it contains a very thorough review. The thing is, Gignac et al. do this thorough review so that they can then set up a very thorough attack. They follow it with their own empirical study with a very large sample size. In this empirical study, with corrected methods, they fail to replicate Dunning-Kruger. The conclusion (very carefully and politely worded, as is proper) states that they expect pretty much every Dunning-Kruger study to fall if subjected to more detailed scrutiny with the improved statistical methods we have available today.
A lot of people read citations as being supportive of a claim or view. Here, though, a citation of Gignac is often merely supportive of the fact that a particular view has been published. If that were all, I think at least something in the intro saying that more recent sources have found replication issues would be fine.
But we actually do have a section that says "it's a statistical artifact". Somehow I feel this section is not quite clear in pointing out that those papers are saying "the Dunning-Kruger effect is not a thing". Possibly because the underlying papers use somewhat hedged wording?
I fully understand that patrollers can't always go and read sources in depth. I'd appreciate any tips (or links to tips) on how to make their life easier here. --Kim Bruning (talk) 13:06, 27 November 2023 (UTC) It'd be funny if Wikipedia had an 'externally disputed' banner for effects that have, at times, failed to replicate. While a fun idea, it's sadly probably above Wikipedia's pay grade.[reply]
I removed the added passage for now. If you have suggestions on how to improve the section on the statistical effect, they are welcome. But it might be helpful to give the reliable sources cited there a close reading beforehand. For example, Gignac & Zajenkowski 2020 do not deny the existence of the Dunning-Kruger effect in general. They primarily state that its magnitude is smaller than initially thought because part of it can be explained through statistics. You also have to be careful about whether the studies in question assess the Dunning-Kruger effect measured in relative terms or in absolute terms, as explained in the section "Measurement, analysis, and investigated tasks". Many only target the Dunning-Kruger effect measured in relative terms. Phlsph7 (talk) 13:22, 27 November 2023 (UTC)[reply]
To avoid further edit warring, I suggest that you create a draft first and propose it on the talk page rather than adding and re-adding your content directly to the article, see WP:EDITWAR. Phlsph7 (talk) 13:00, 27 November 2023 (UTC)[reply]
Right, so procedurally speaking, so far I've adhered to WP:1RR, which falls rather short of WP:EDITWAR, I should think. I do agree that there are only very few exceptions to the rule that one should not revert a revert, but in this case "other party did not engage on talk page" is actually one of them. The reason is that since they don't reply on talk, there's simply no (D)iscussion on talk, and thus we can technically go back to (B)OLD and try again. It's not pretty, but (sometimes) it works; and isn't that just WP:BRD in a nutshell to begin with? :-/ --Kim Bruning (talk) 13:18, 27 November 2023 (UTC)[reply]
No problem so far, but for one reason or another, this article attracts edit wars. Phlsph7 (talk) 13:24, 27 November 2023 (UTC)[reply]
A more ignorant me would've said Ego... but the common use of the word is in stark contrast with William James's and Freud's definitions. But it speaks to people's sense of self-importance, which appears to be one of the few self-evident things in life. Perhaps we should create the latter page and redirect it here...? Dagelf (talk) 08:39, 9 March 2024 (UTC)[reply]

Three refereed papers on this topic have convenient Wikidata entries:

  • <ref name="Nuhfer2016">{{cite Q|Q56566524|url-status=live}}</ref>[1] or repeat ref <ref name="Nuhfer2016" />[1]
  • <ref name="Nuhfer2017">{{cite Q|Q56566525|url-status=live}}</ref>[2]
  • <ref name="Gignac2020">{{cite Q|Q108932700|url-status=live}}</ref>[3]

Boud (talk) 21:05, 27 November 2023 (UTC) (PS: I see that the sfn structure for citation is used... Boud (talk) 21:29, 27 November 2023 (UTC))[reply]

I started a small section in the content, currently called "Validity dispute", but there's probably a better section title to reach consensus on. I think it's clear that the existence of the effect is disputed. Gignac & Zajenkowski 2020 don't just say that the effect "is" small; what they say in their abstract is: "On the basis of a sample ... contrary to the Dunning-Kruger hypothesis. Additionally, the association ... contrary to the Dunning-Kruger hypothesis. It is concluded that, although the phenomenon described by the Dunning-Kruger hypothesis may be to some degree plausible for some skills, the magnitude of the effect may be much smaller than reported previously." In other words, they found evidence contrary to the Dunning-Kruger hypothesis while accepting that a small effect for some skills might exist (since they didn't do tests for all well-known skills).
Something should go up to the lead, but in proportion to the length of this section in the body, so currently it would have to be a very brief sentence. Waiting to see how this section develops would make sense: there is no deadline. Boud (talk) 22:08, 27 November 2023 (UTC)[reply]
Hello Boud and thanks for taking the time to write this section. I slightly modified the text and merged it into the subsection "Statistical", which already discusses this interpretation. It would be great if you could add the precise page numbers since the claim seems to be quite strong. Phlsph7 (talk) 23:38, 27 November 2023 (UTC)[reply]
I am not sure where this is going. How do you go from "there is a statistical explanation for DK effect" to "there is no DK effect"? I get the debate over whether DKE is a cognitive bias or just a statistical artifact, but gee, an effect is an effect no matter how it is explained. Constant314 (talk) 00:08, 28 November 2023 (UTC)[reply]
Scientists often word things very carefully. When scientists describe something as a "statistical artifact," they are often implying a significant doubt about the validity of the observed phenomenon as a true effect.
In the case of the papers we looked at today, one of them [1] demonstrates that you can feed outright random data into a Dunning-Kruger-style statistical analysis and still get a very convincing-looking outcome if you're not very, very careful.
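A rough sketch of the kind of simulation described in [1] (my own minimal reconstruction, not the authors' actual code; names and numbers are illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 10_000

    # Pure noise: actual and perceived percentiles are independent
    # uniform draws, so there is no genuine effect to detect.
    actual = rng.uniform(0, 100, n)
    perceived = rng.uniform(0, 100, n)

    # Kruger-Dunning-style analysis: bin by quartile of the *actual*
    # score, then compare mean actual vs mean perceived per bin.
    order = np.argsort(actual)
    for q, idx in enumerate(np.array_split(order, 4), start=1):
        print(f"Q{q}: actual {actual[idx].mean():5.1f}, "
              f"perceived {perceived[idx].mean():5.1f}")

Binned this way, the bottom quartile appears to "overestimate" by roughly 37 points and the top quartile to "underestimate" by about as much, even though perceived and actual scores are completely independent.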
Their subsequent paper [2] argues that people were indeed not careful enough.
Note that [2] does find other self-assessment effects, just not the Dunning-Kruger effect.
--Kim Bruning (talk) 04:27, 28 November 2023 (UTC)[reply]
Please quote (with pg #s) the relevant passages from the papers you cite. EEng 08:26, 28 November 2023 (UTC)[reply]
I agree with Constant314: I don't think anyone disputes the effect shown in the diagram. If you ask people how skilled they think they are and compare that with how skilled they actually are, you get the diagrams shown in the article. In this sense, there is an undisputed effect there. The disagreements are only about how to explain these diagrams.
Trying to find claims in scientific articles that the researchers do not explicitly make can very easily lead to original research. If a paper does not directly support a claim, then we should be very careful about making that claim in our articles. I found a page number for the bottom and ceiling effect. But this seems to be included in the regression toward the mean that is already explained earlier, so I'm not sure that we need it. And the more sweeping claim still needs page numbers.
Even if we could find direct support for these radical claims in sources, we would still have to be careful about undue weight. For example, from Mazor & Fleming 2021: "In one of the most highly replicable findings in social psychology, Kruger and Dunning showed that participants who performed worse in tests of humour, reasoning, and grammar were also more likely to overestimate their performance." This is a high-quality source (Nature Human Behaviour) that is more recent. Phlsph7 (talk) 08:36, 28 November 2023 (UTC)[reply]
@Phlsph7: If you're happy with my edits to the paragraph, then please remove the page needed tag. I agree that "most of" was unjustified: without a full review of the post-KD1999 literature, that claim cannot be made. Nuhfer et al. do make a claim along the lines of "most", but without trying to prove it, and their aim is not to do a literature survey.
I'm not sure if people here have already mentioned the March 2022 reply by Dunning, but as a comment in a professional journal by a known expert (the D of DK), it should be usable, especially for some of its sources. Boud (talk) 18:57, 28 November 2023 (UTC)[reply]
Thanks for adding the page numbers, this helps. However, the quotes don't mention any specific statistically flawed studies, so maybe we should not either. I reformulated the passage to be closer to the conclusion of the study and I removed the page needed tag. I also removed the mention of Gignac and Zajenkowski since their position in relation to statistical artifacts and alternative explanations is already discussed in detail earlier. Given the page number, I don't think that a lengthy quote is required, but feel free to restore it if you disagree. I also added a page number for Nuhfer et al. 2016, but I'm not sure how relevant it is for that particular conclusion. My main remaining doubt is that it's not clear that this specific individual study merits a mention. Phlsph7 (talk) 08:25, 29 November 2023 (UTC)[reply]
Perhaps these graphs show what they mean to anyone who doesn't understand the explanations elsewhere... I don't think stating the obvious is a "radical claim"? Dagelf (talk) 00:24, 10 March 2024 (UTC)[reply]
"When scientists describe something as a 'statistical artifact,' they are often implying a significant doubt about the validity of the observed phenomenon as a true effect."
The article says: "The Dunning–Kruger effect is a cognitive bias in which people with limited competence in a particular domain overestimate their abilities." This is true even if that overestimation is caused by regression to the mean: very incompetent people cannot underestimate their competence because their competence is already at the bottom, so they can only judge it correctly or overestimate it, which, on average, means that they overestimate it. --Hob Gadling (talk) 11:01, 29 November 2023 (UTC)[reply]
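A small toy simulation of that regression-to-the-mean mechanism (my own numbers and variable names, not from the article or the cited papers): give everyone an unbiased but noisy self-estimate, clip it to the scale, and the bottom group still overestimates on average while the top group underestimates.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 50_000

    # Self-assessment tracks real skill plus unbiased noise; clipping to
    # the 0-100 scale means errors at the bottom can only point upward.
    skill = rng.uniform(0, 100, n)
    self_view = np.clip(skill + rng.normal(0, 25, n), 0, 100)

    bottom = skill < 25
    top = skill > 75
    print(f"bottom quartile mean error: {(self_view[bottom] - skill[bottom]).mean():+.1f}")
    print(f"top quartile mean error:    {(self_view[top] - skill[top]).mean():+.1f}")

The noise is unbiased before clipping, yet the bottom quartile comes out overestimating and the top quartile underestimating, which is the shape of the classic graph.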

https://en.wikipedia.org/w/index.php?title=Dunning%E2%80%93Kruger_effect&diff=1212672204&oldid=1212638810 So I picked up this edit by User:Dagelf; it looks like some additional sources from The Usual Suspects. I think the consensus so far seems to be to put this under Statistical for now, at least until/unless more scientists start to agree. I'm not married to the wording there, except that the word 'autocorrelation' should probably be in the article *somewhere* at least. This group of statisticians did write a number of peer-reviewed articles on the topic, after all.

btw.. In general, I think it's appropriate to post your reasoning on the talk page when editing/reverting good-faith edits where the reasoning might not be immediately obvious. Reverting with "This should be discussed[...]" is slightly ironic. ;-) --Kim Bruning (talk) 01:39, 9 March 2024 (UTC)[reply]

Well, that was fun. I actually double-checked the citations, but none of the provided citations actually mentioned autocorrelation themselves. I've trimmed the remaining wording down to the minimum that the citations do support. And that's why one double-checks, I guess. :-/ --Kim Bruning (talk) 02:19, 9 March 2024 (UTC)[reply]
@Kim Bruning Oh boy. There was one citation I misplaced, but found again. Will link it. Dagelf (talk) 07:51, 9 March 2024 (UTC)[reply]
I shall await your refs wrt autocorrelation.
By the way, does NPOV really want to be anthropomorphized that badly? Take it easy, the wiki will still be here tomorrow! --Kim Bruning (talk) 10:35, 9 March 2024 (UTC)[reply]
I took Nuhfer's 2017 discussion of correlations to be the basis for the use of the term "autocorrelation" in the title of the article [2] that started all this. It seemed correct to me and was the reason I thought it right to raise the issue in the first place. Chassin (talk) 19:32, 26 June 2024 (UTC)[reply]


References

  1. ^ a b c Edward Nuhfer; Christopher Cogan; Steven Fleischer; Eric Gaze; Karl Wirth (January 2016). "Random Number Simulations Reveal How Random Noise Affects the Measurements and Graphical Portrayals of Self-Assessed Competency". Numeracy: Advancing Education in Quantitative Literacy. 9 (1). doi:10.5038/1936-4660.9.1.4. ISSN 1936-4660. Wikidata Q56566524. Archived from the original on 26 November 2023.
  2. ^ a b c Edward Nuhfer; Steven Fleischer; Christopher Cogan; Karl Wirth; Eric Gaze (January 2017). "How Random Noise and a Graphical Convention Subverted Behavioral Scientists' Explanations of Self-Assessment Data: Numeracy Underlies Better Alternatives". Numeracy: Advancing Education in Quantitative Literacy. 10 (1). doi:10.5038/1936-4660.10.1.4. ISSN 1936-4660. Wikidata Q56566525. Archived from the original on 26 November 2023.
  3. ^ Gilles E. Gignac; Marcin Zajenkowski (May 2020). "The Dunning-Kruger effect is (mostly) a statistical artefact: Valid approaches to testing the hypothesis with individual differences data" (PDF). Intelligence. 80: 101449. doi:10.1016/J.INTELL.2020.101449. ISSN 0160-2896. Wikidata Q108932700. Archived (PDF) from the original on 8 March 2023.

Did you know nomination

The following is an archived discussion of the DYK nomination of the article below. Please do not modify this page. Subsequent comments should be made on the appropriate discussion page (such as this nomination's talk page, the article's talk page or Wikipedia talk:Did you know), unless there is consensus to re-open the discussion at this page. No further edits should be made to this page.

The result was: promoted by Bruxton (talk) 15:10, 27 August 2023 (UTC)[reply]


Improved to Good Article status by Phlsph7 (talk). Self-nominated at 14:19, 24 August 2023 (UTC). Post-promotion hook changes for this nom will be logged at Template talk:Did you know nominations/Dunning–Kruger effect; consider watching this nomination, if it is successful, until the hook appears on the Main Page.[reply]

General: Article is new enough and long enough

Policy compliance:

Hook: Hook has been verified by provided inline citation
QPQ: Done.

Overall: Epicgenius (talk) 14:38, 25 August 2023 (UTC)[reply]

@Epicgenius: There is currently a discussion on the talk page about whether the line from ALT0 should be included in the article. If it is removed, then we would not be able to use ALT0 and would need to default to ALT1. I suggest that we put the nomination on hold until this issue is resolved. Phlsph7 (talk) 08:46, 26 August 2023 (UTC)[reply]
It was a clever hook (ALT0), but it is not apparent in the article, so I will promote ALT1. ALT1 checks out and is interesting and verifiable. I see Earwig alerts to a direct copy of our article. Lightburst (talk) 15:28, 26 August 2023 (UTC)[reply]

"There is also disagreement about whether the effect is real at all"

There is disagreement about whether incompetent people really overestimate their competence? From people who know what regression to the mean is? But regression to the mean predicts the effect. --Hob Gadling (talk) 11:27, 9 March 2024 (UTC)[reply]

Does it predict it? Below-average people think they're average, but the average is higher; and above-average people think they're average, so they low-ball it... but it doesn't seem to hold for everyone and in every topic, as the disagreement between studies shows. That's the first problem. The second problem is that the original study was flawed. Dagelf (talk)
"but it doesn't seem to hold for everyone" Duh. It's a statistical effect, of course it does not. What sort of reasoning is that? And "the original study was flawed" has no connection to "There is also disagreement about whether the effect is real at all". --Hob Gadling (talk) 13:48, 11 March 2024 (UTC)[reply]
Not a full answer (you'll need to ask Dagelf for details on why they picked *those* sources in particular), but meanwhile see also: #DK_Effect_is_Simply_Autocorrelation. --Kim Bruning (talk) 15:04, 9 March 2024 (UTC)[reply]
Thanks for the efforts to improve the article. One difficulty with this topic is that there is a lot of misleading information about it on the internet. This is why it's dangerous to rely on low-quality sources like blogs. For example, from the high-quality source Mazor & Fleming 2021 (Nature Human Behaviour): "In one of the most highly replicable findings in social psychology, Kruger and Dunning showed that participants who performed worse in tests of humour, reasoning, and grammar were also more likely to overestimate their performance." There are different ways to explain this, but there are very few reliable sources that claim that there is nothing there. Even statistical explanations usually acknowledge this. For example, Gignac & Zajenkowski 2020 hold that statistics only explain some part of the effect, and Nuhfer et al. 2017 only deny that the effect is "pronounced". Phlsph7 (talk) 08:41, 10 March 2024 (UTC)[reply]
Mazor & Fleming 2021 is interesting! --Kim Bruning (talk) 14:54, 11 March 2024 (UTC)[reply]


I really appreciate the time Phlsph7 put in to improve the page!

Meanwhile, somewhere along the way we lost the recent comments by Gaze (one of the 'et al.' in Nuhfer et al.). I'll leave it here as a note in case we need it again later.

--Kim Bruning (talk) 14:04, 10 March 2024 (UTC)[reply]

Another write-up worth considering:
Cooljeanius (talk) (contribs) 21:56, 11 April 2024 (UTC)[reply]
Not a WP:RS. Constant314 (talk) 22:09, 11 April 2024 (UTC)[reply]

Darwin quote

@Phlsph7, I see you removed a Darwin quote with comment 'not the right place for quotes'. I feel the quote is relevant, since it shares key concepts with the article. Is there a better place within the article for its inclusion? Chumpih t 11:51, 4 August 2024 (UTC)[reply]

Hello Chumpih and thanks for your suggestion. I agree that the quote has some relevance. Maybe it could fit into the section "Practical significance" because it applies the topic to science. Phlsph7 (talk) 12:49, 4 August 2024 (UTC)[reply]
Done. Thanks. Chumpih t 12:56, 4 August 2024 (UTC)[reply]