Wikipedia:Reference desk/Archives/Language/2018 August 9

From Wikipedia, the free encyclopedia
Language desk
Welcome to the Wikipedia Language Reference Desk Archives
The page you are currently viewing is a transcluded archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


August 9

Probability that a word is known

Are there statistics about how probable it is that an educated/native/competent speaker knows word x or structure y (which is outside his professional field)? I see how this would be useful for producing, for example, instruction manuals. Otherwise, it would be more of a guessing game for the copywriter trying to write understandable material. --Doroletho (talk) 14:04, 9 August 2018 (UTC)

The best analogue I can think of would be analysis of word usage. In statistical linguistics this is commonly done by counting n-grams (sequences of n consecutive words) and comparing their frequencies. Google even has a nice utility, the Ngram Viewer, that lets you analyze the frequency of any word or phrase in the entire corpus of Google's digitally scanned English books. --Jayron32 14:22, 9 August 2018 (UTC)
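To illustrate the n-gram idea mentioned above, here is a minimal sketch in Python of counting n-grams and computing a phrase's relative frequency in a text. The function names `ngram_counts` and `relative_frequency` and the simple tokenizer are assumptions made for this example. It is not how Google's tool works internally, and frequency of use is only a rough proxy for how likely a reader is to know a word.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (a crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(text, n=1):
    """Count every run of n consecutive tokens in the text."""
    tokens = tokenize(text)
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def relative_frequency(text, phrase):
    """Share of all same-length n-grams in the text that match the phrase."""
    target = tuple(tokenize(phrase))
    if not target:
        return 0.0
    counts = ngram_counts(text, len(target))
    total = sum(counts.values())
    return counts[target] / total if total else 0.0


sample = "the cat sat on the mat the cat"
print(relative_frequency(sample, "the"))      # unigram share
print(relative_frequency(sample, "the cat"))  # bigram share
```

On a real corpus, a word with a very low relative frequency would be a candidate for replacement in an instruction manual, though that threshold would have to be chosen by hand.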
It occurs to me that instruction manuals are unlikely to need any language beyond what is known to an educated/native/competent speaker, and if the use of such language is being contemplated it may be best to err on the side of using alternative language that is more likely to be understood. An instruction manual should be easily understood by a relatively broad range of people. Maybe I am being unfair, but I can't even imagine an instance in which it is necessary to use any language that might not be understood. I think in most instances the language needed for an instruction manual can be limited to that which is in common use. Bus stop (talk) 14:49, 9 August 2018 (UTC)
Meanwhile, I found the Flesch–Kincaid readability tests, which seem a good approximation to an empirical analysis of reading difficulty. They are even included in MS Word, under statistics. Doroletho (talk) 18:20, 9 August 2018 (UTC)
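For anyone who wants to try the Flesch–Kincaid tests outside of Word, here is a rough sketch in Python of the two standard formulas (Flesch reading ease and Flesch–Kincaid grade level). The constants are the published ones. The syllable counter is an assumption made for this example: it simply counts groups of vowels, so the scores will differ somewhat from those of tools that use a pronunciation dictionary. All function names are made up for illustration.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def text_counts(text):
    """Return (sentences, words, syllables) for the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text):
    """Higher scores mean easier text."""
    s, w, syl = text_counts(text)
    return 206.835 - 1.015 * (w / s) - 84.6 * (syl / w)


def flesch_kincaid_grade(text):
    """Approximate US school grade needed to read the text."""
    s, w, syl = text_counts(text)
    return 0.39 * (w / s) + 11.8 * (syl / w) - 15.59


easy = "The cat sat on the mat."
hard = "Utilization of multifaceted terminology complicates comprehension."
for label, text in (("easy", easy), ("hard", hard)):
    print(label, round(flesch_reading_ease(text), 1), round(flesch_kincaid_grade(text), 1))
```

Note that both formulas only look at sentence length and syllables per word. They say nothing about whether a particular word is known to the reader, so they answer the original question only indirectly.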
"And who are most likely to fall prey to hypercognition? Experts. Experts who are confined by their own expertise. Experts who overuse the constricted set of concepts salient in their own profession while neglecting a broader array of equally valid concepts. Given a patient, a heart specialist is more likely to diagnose heart disease than an infectious disease expert, who is more likely to see the work of a virus. The bias towards what is known may lead to wrong or delayed diagnoses that bring harmful consequences."[1]
"But let’s give credit where credit is due. The human mind is an amazing organic hard drive of information. The typical English speaker will know the equivalent of 48,000 dictionary entries by age 60."
"Nevertheless, even with that capacity, hypocognition is unavoidable. The vocabularies we gain in a lifetime pale against the 600,000 entries contained in the Oxford English Dictionary, and that is even before we turn to the myriad of concepts residing in other languages."
The above is off-topic but I thought it might be of interest in relation to the question raised. Bus stop (talk) 17:42, 12 August 2018 (UTC)
I don't have a clue why Bus stop is posting this here. Doroletho (talk) 17:06, 14 August 2018 (UTC)
Well, given the information he posted, the answer to your original question would seem to be a probability of 1 in 10. ←Baseball Bugs What's up, Doc? carrots 20:26, 14 August 2018 (UTC)
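As a quick check of that figure, the two numbers quoted above (48,000 entries known by age 60, 600,000 entries in the OED) can be divided directly. This sketch assumes, unrealistically, that every dictionary entry is equally likely to be known. In practice common words are known by nearly everyone and rare ones by almost no one, so the result is only a crude average.

```python
# Figures taken from the quotation above.
known_entries = 48_000   # entries a typical English speaker knows by age 60
oed_entries = 600_000    # entries in the Oxford English Dictionary

# Assumption: every entry is equally likely to be known (a gross simplification).
p_known = known_entries / oed_entries
one_in = oed_entries / known_entries

print(f"P(known) = {p_known:.2f}, i.e. about 1 in {one_in:.1f}")
```

So the naive average comes out at 0.08, or about 1 in 12.5, which is in the same range as the 1 in 10 suggested above.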