Wikipedia:Reference desk/Archives/Language/2024 June 4

Language desk
Welcome to the Wikipedia Language Reference Desk Archives
The page you are currently viewing is a transcluded archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


June 4

How many solutions of perfect pangram exist?

In Wikipedia:Reference desk/Archives/Language/2024 May 17#Pangram, I asked “Use the symbol of the 118 chemical elements, and the abbreviation of the 88 constellations, and the abbreviation of the 50 states of America, what is the least words we need to make a pangram?”, and someone gave a perfect pangram as a solution:

Aql, B, Cu, Fm, Gd, KY, NJ, Oph, Sex, VT, WI, Zr

I also found some other solutions derived from this one:

  1. Replace B with Yb, and KY with K
  2. Replace B with Bi, and WI with W
  3. Replace Cu with C and U
  4. Replace KY with K and Y
  5. Replace B with Bh, and Oph with Po
  6. Replace Sex with S and Xe
  7. Replace Cu with Tc and U, and VT with V

So, how many perfect pangram solutions exist? Also, how many things (chemical elements, constellations, states of America) are contained in every perfect pangram? And how many are not contained in any perfect pangram? (I only know, from the answer of the person who gave the solution, that every perfect pangram contains NJ, and thus that no perfect pangram contains any other thing with the letter N, such as N, In, Nor.) 2001:B042:4005:546F:E921:9E13:BB77:C79F (talk) 00:12, 4 June 2024 (UTC)

These solutions have at least 12 elements. I find in total 1,687,855 perfect pangrams, 476 of which have 11 elements. The lexicographically first 11-element solution is:
Ag, Bk, Ds, Equ, Hf, LMi, NJ, Oct, Pyx, WV, Zr
 --Lambiam 10:41, 4 June 2024 (UTC)
The only things (chemical elements, constellations, states of America) contained in both your solution and the solution given in Wikipedia:Reference desk/Archives/Language/2024 May 17#Pangram are NJ and Zr. As for Zr, we can replace Ag and Zr with AZ and Rg, so NJ is the only thing (chemical elements, constellations, states of America) contained in every perfect pangram. (There are only 3 things (chemical elements, constellations, states of America) containing the letter Z: AZ, Zn, Zr, but Zn cannot be used since NJ must be used.)
Also, how many things (chemical elements, constellations, states of America) are not contained in any perfect pangram? 220.132.230.56 (talk) 13:13, 4 June 2024 (UTC)
Counting abbreviations that only differ by lower/upper case, I find 60 unusable ones: And, Ant, Aqr, Ar/AR, Ari, Au, Aur, Cae, Car, Cen, Cn, CrA, Cru, CVn, Del, Dra, Er, Eri, Gru, Her, In/IN, Ind, Leo, Lep, Lu, Lup, Lyn, Men, Mn/MN, Mon, N, Na, Nb, NC, Nd/ND, Ne/NE, Nh/NH, Ni, NM, No, Nor, Np, NV, NY, Per, Ra, Re, Ret, Rn, Ru, Ser, Sn, Tau, Tel, TN, TrA, UMa, Vel, Vul, Zn.  --Lambiam 14:19, 4 June 2024 (UTC)[reply]
All of these 60 things can be ruled out by considering the letters J, Q, Z (which in turn restrict the use of the letters A, E, L, N, R, U):
  1. NJ is the only thing which contains the letter J, thus NJ must be used.
  2. No other thing (besides NJ) containing the letter N can be used; this includes Zn.
  3. AZ, Zn, Zr are the only three things which contain the letter Z, but since Zn cannot be used, one of AZ and Zr must be used.
  4. Since one of AZ and Zr must be used, nothing containing both of the letters A and R can be used; this includes Aqr.
  5. Aql, Aqr, Equ are the only three things which contain the letter Q, but since Aqr cannot be used, one of Aql and Equ must be used.
  6. Since one of Aql and Equ must be used, nothing containing both of the letters A and E, or both A and U, or both L and E, or both L and U, can be used.
  7. Also, AZ and Aql cannot both be used since both of them contain the letter A, thus at least one of Zr and Equ must be used, and hence nothing containing both of the letters R and E, or both R and U, can be used.
220.132.230.56 (talk) 15:48, 4 June 2024 (UTC)
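The elimination argument in steps 1 to 6 can be sketched in Python. This is only an illustration under stated assumptions, not anyone's actual program from this thread: the function name prune_unusable is made up, and the rule it applies is "a thing is unusable if, for some letter it does not contain, it shares a letter with every remaining thing that could supply that letter". That rule covers forced things like NJ and two-way choices like AZ/Zr, but it does not capture the pairwise reasoning of step 7. The sample input is a toy set in which A and R are hypothetical single-letter fillers, not real abbreviations.

```python
def prune_unusable(tokens):
    """Return the tokens that survive a simple single-letter elimination rule.

    A token is dropped if it repeats a letter, or if some letter it lacks
    can only be supplied by tokens that all overlap it.
    """
    # Case-insensitive letter sets; tokens with a repeated letter go first.
    usable = {
        t: frozenset(t.lower())
        for t in tokens
        if len(set(t.lower())) == len(t)
    }
    letters = set().union(*usable.values())

    changed = True
    while changed:
        changed = False
        for letter in sorted(letters):
            candidates = [t for t, s in usable.items() if letter in s]
            if not candidates:
                # Letter can no longer be covered at all; nothing to deduce here.
                continue
            for t, s in list(usable.items()):
                if letter in s:
                    continue
                # t clashes with every way of covering this letter.
                if all(s & usable[c] for c in candidates):
                    del usable[t]
                    changed = True
    return list(usable)


# Toy input: "A" and "R" are hypothetical fillers, used only for illustration.
sample = ["NJ", "N", "Zn", "AZ", "Zr", "Ar", "A", "R"]
print(prune_unusable(sample))
```

On this toy input the sketch drops N and Zn (they clash with the forced NJ) and then Ar (it clashes with both AZ and Zr), mirroring steps 1 to 4.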
Also, there are 1687855 perfect pangrams, and all of them contain NJ. Besides NJ, which things are used in the largest number, the second-largest number, the third-largest number, etc. of the perfect pangrams? And besides the 60 unusable ones, which things are used in the smallest number, the second-smallest number, the third-smallest number, etc. of the perfect pangrams? 220.132.230.56 (talk) 15:56, 4 June 2024 (UTC)
The questions are endless, but less and less interesting.  --Lambiam 18:40, 4 June 2024 (UTC)[reply]
OK. Besides, can you give all 476 11-element solutions? Thanks. (I think that there should be more things which are not contained in any 11-element solution, e.g. the lexicographically first 11-element solution starts with Ag, which means Ac is not contained in any 11-element solution; also, since Ca/CA has the same letter set as Ac, Ca/CA is not contained in any 11-element solution either.) 220.132.230.56 (talk) 19:21, 4 June 2024 (UTC)
They can be admired at User:Lambiam/Pangram.  --Lambiam 07:50, 5 June 2024 (UTC)[reply]
Also Ara, Boo, Cnc, Pup, since they have repeated letters themselves. 220.132.230.56 (talk) 19:25, 4 June 2024 (UTC)[reply]
Thus there are 64 unusable ones. 220.132.230.56 (talk) 19:25, 4 June 2024 (UTC)[reply]
Right. My program throws them out right at the start, before commencing its search.  --Lambiam 07:50, 5 June 2024 (UTC)[reply]
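For readers who want to reproduce this kind of count, here is a minimal Python sketch of an exact-cover count over letter bitmasks. It is not Lambiam's program (which is not shown in this thread); the function names letter_mask and count_perfect_pangrams are made up for illustration. Like the program described above, it discards abbreviations with a repeated letter before searching. It compares letters case-insensitively and counts every entry of the input list as a distinct thing, so how duplicates such as Ar/AR are handled depends on the list you pass in; no claim is made that it reproduces the figure 1,687,855.

```python
from functools import lru_cache


def letter_mask(token, alphabet):
    """Bitmask of the token's letters, or None if a letter repeats
    or does not occur in the alphabet."""
    mask = 0
    for ch in token.lower():
        idx = alphabet.find(ch)
        if idx < 0:
            return None
        bit = 1 << idx
        if mask & bit:
            return None  # repeated letter, e.g. Ara, Boo, Cnc, Pup
        mask |= bit
    return mask


def count_perfect_pangrams(tokens, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Count sets of tokens that use every letter of the alphabet exactly once."""
    full = (1 << len(alphabet)) - 1
    # by_letter[i] holds the masks of all usable tokens containing letter i.
    by_letter = [[] for _ in alphabet]
    for tok in tokens:
        mask = letter_mask(tok, alphabet)
        if mask is None:
            continue
        for i in range(len(alphabet)):
            if (mask >> i) & 1:
                by_letter[i].append(mask)

    @lru_cache(maxsize=None)
    def count(covered):
        if covered == full:
            return 1
        # Always cover the lowest uncovered letter next, so that each
        # set of tokens is counted once regardless of order.
        free = ~covered & full
        i = (free & -free).bit_length() - 1
        return sum(
            count(covered | m) for m in by_letter[i] if not (m & covered)
        )

    return count(0)


# Toy check on a four-letter alphabet rather than the real abbreviation lists.
toy = ["ab", "cd", "a", "b", "c", "d", "abc", "ad", "bc", "aa"]
print(count_perfect_pangrams(toy, "abcd"))  # 8
```

Memoizing on the set of covered letters keeps the search small, since many different partial selections cover the same letters.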

Why are English, Indonesian & Malay the closest natural languages to ISO Basic Latin alphabet?

(at least according to List of Latin-script alphabets and ISO basic Latin alphabet). Is it because English has an unusually high tolerance for inconsistent and non-phonetic spelling? Why are the Malay and Indonesian alphabets so Englishy? Sagittarian Milky Way (talk) 17:57, 4 June 2024 (UTC)

The notion that one can assign a measure of distance between a natural language and an alphabet is absurd.  --Lambiam 07:02, 5 June 2024 (UTC)[reply]
Mentally replacing "natural language" with "orthography", this becomes one of those questions that is presented as asking about some subtle truth about language, which is odd to me because the answer is pretty much just that the Dutch were the ones to romanize Malay. Also, I will hiss every time someone says that English spelling is qualitatively more irregular than any other: surely you've read French at least once in your life? Remsense 07:08, 5 June 2024 (UTC)
If one hears a French word pronounced, say /si.fle/, its spelling can often only be guessed: is it siffler, sifflez, sifflé, sifflée, sifflés or sifflées? The champion may be /vɛʁ/: is it vair, vairs, ver, vers, vert, verts, verre or verres? The other direction, however, from spelling to pronunciation, tends to be rather predictable. For English you often also have to guess in that direction, as is made clear in the poem "The Chaos". I wonder if there are languages where it is the other way around: the standard orthographic rendering of a spoken word is usually predictable, but the pronunciation of a written word is often hard to guess.  --Lambiam 09:51, 5 June 2024 (UTC)
@Remsense: The romanisation the Dutch introduced for Malay doesn't actually stick to the ISO basic Latin alphabet. It uses diaereses and acutes. The diacritics were abolished after Indonesian independence in the 1947 spelling. Double sharp (talk) 15:48, 16 June 2024 (UTC)[reply]
Oh! Shows what I know. Remsense 15:49, 16 June 2024 (UTC)[reply]
The technical answer is that the spellings of the three languages do not require the use of diacritic marks, or of letters beyond those found in ASCII. (Sometimes diacritics are optionally used in writing English in the case of words borrowed from other languages into English, or the "New Yorker dieresis", but it's never wrong to omit diacritics in English.) However, this has nothing to do with how good the spelling systems are in writing the languages. The Malay spelling system is quasi-phonemic, except in not having a distinct symbol for the schwa vowel, while English spelling is of course quite complex... AnonMoos (talk) 10:42, 5 June 2024 (UTC)[reply]
English doesn't generally use diacritics because its orthography (spelling) was modeled after that of (Middle) French. That may seem surprising, since modern French makes heavy use of diacritics. But English orthography was first established in the late 14th century, and at that time, diacritics were not yet standard in French. Unlike other European languages, English never adopted diacritics, which in Europe are mainly used for vowels. The Great Vowel Shift radically transformed the pronunciation of English vowels. As a result, spellings that approximated vowel pronunciations around 1400 no longer did so, and the divergence was too complex for diacritics to remedy easily. Without a thorough spelling reform, they would just have added further complexity. Fortunately, there was no push to adopt them in English. Malay and Indonesian are essentially different varieties of a single language. They share the same basic orthography, which was developed under British and Dutch influence during the colonial period. (Previously, Malay had been written with the Arabic abjad.) Dutch, like English, has an orthography developed mostly in the Middle Ages, before diacritics were widely or systematically used. So you could add Dutch to your list of (mostly) diacritic-free languages. Malay and Indonesian follow a model based mainly on English, with some Dutch influence. Initially, Indonesian followed a Dutch model, and Malay followed an English model. In 1972, Malaysia and Indonesia agreed on a spelling reform that leaned more toward an English model. Marco polo (talk) 20:27, 5 June 2024 (UTC)[reply]
Also English mostly got rid of non-Latin graphemes over time like thorn, Æ, Œ and long s even though European languages usually added them if anything. Such as turning nn into ñ, ss to ß, vowel splits to å and ae oe ue to ä ö ü. Why did they die out? Sagittarian Milky Way (talk) 01:41, 6 June 2024 (UTC)[reply]
Æ and Œ are non-Latin graphemes?????
The stunning revelation aside, the answer is "the Norman conquest", as has already been gestured at above. English pretty much died out as an important written language from the 11th to the 14th centuries. I think it's true to some extent that the advent of the printing press finished them off for good, but it's not completely true, since there were printed English publications where they printed Þ and Ƿ just fine. As you can discern from reading documents throughout the modern period, the long S did not fall out of use in English until the 18th century or so. Remsense 03:33, 6 June 2024 (UTC)
The ligatures ⟨Æ⟩ and ⟨Œ⟩ are not included in the Latin alphabet. They make their first appearances in the Middle Ages, first in cursive handwriting.  --Lambiam 09:21, 6 June 2024 (UTC)[reply]
Sure, but they were used to write Latin, which I feel is the pertinent sense in this context. Remsense 09:55, 6 June 2024 (UTC)[reply]
Non-Classical Latin or JUW which might be more common than any other novelty for some reason? It is useful as IJUVW became different phonemes though th is 2 different phonemes and we got rid of those letters (which the first neo-English authors unused to Germanic glyphs were uninclined to resurrect? When did the average England resident stop pronouncing ð and θ "poorly" with Norman accent? Did some people do this longer in some places and castes not influenced by non-Norman languages without th-sounds?) Sagittarian Milky Way (talk) 15:32, 6 June 2024 (UTC)[reply]
It might just be that ð and θ were more difficult to print than write by hand, I guess. I am more surprised that the sounds themselves have remained in English, when they disappeared in all continental Germanic languages. 惑乱 Wakuran (talk) 09:23, 7 June 2024 (UTC)[reply]
They were not "more difficult to print". Again, it is largely French orthographic influence. Plus, Þ and Ƿ looked a lot like P, which didn't help their viability. Remsense 09:50, 7 June 2024 (UTC)[reply]
I can't really understand the questions you're asking, could you clarify? Remsense 09:52, 7 June 2024 (UTC)[reply]
Did Norman have th-sounds (I've heard they're one of if not the rarest 2020s General American English phonemes in the world and in the Indo-Europeans) before 1066 and if it didn't then when did pronouncing IPA ð and θ with a Norman accent die out and who were the holdouts? Also what did ð and θ sound like with a Norman accent? I've heard it's sink for think in Modern Standard German? Also is it really true that relatively few European and world languages have the English r-sound or is it an artifact of transcription conventions and how specific the IPA charts in the database are? Maybe that English IPA letter (common w sound?) that's sometimes outside the main chart is more common than that database says? It's sometimes in the zone not always shown with the more "exotic" sound production methods like clicking. Is there any English vowel that'd be noticed in the accent of an English learner who only knew the language with the most vowel phonemes before learning English as an adult? !Xóõ and Ubykh with massive numbers of consonants don't seem to have all our consonants (and we don't have a few fairly common consonants like ñ, what's the most common phoneme in the world and Europe that English doesn't have? And what's the most common that people who only know English have trouble pronouncing, since some non-English phonemes aren't hard). Sagittarian Milky Way (talk) 17:03, 7 June 2024 (UTC)[reply]
Picking and choosing which I answer, with varying levels of surety.
  1. ⟨th⟩ was used as a digraph representing Greek vocabulary in Latin and in most of post-Roman Europe. As English is one of the few European languages other than Greek to have /θ/, the orthography was naturally adopted when writing English. Normans who moved from Normandy to England generally didn't learn English, so they didn't often speak English with a Norman accent. If I had to guess, it would be how modern French speakers realize it, as /t/. Over time, their children began to acquire English naturally, and as such were native speakers.
  2. I think [ɹ] is comparatively rare, yes.
  3. I think all non-English phonemes are equally hard for native speakers to pronounce (i.e., we don't, as is usually the case).
Remsense 17:18, 7 June 2024 (UTC)[reply]
The English word "faith" has a [θ] from early Old French [θ] (eliminated in later Old French). AnonMoos (talk) 19:38, 8 June 2024 (UTC)
Some others to pick and answer:
  • English has one of the largest vowel inventories of all (around 13, depending on dialect), a feature it shares with several of its European neighbours (German, French, Dutch, Norwegian). Such large vowel inventories appear to be a bit of a feature of the Standard Average European sprachbund. Interestingly, most of them have a set of rounded front vowels, which (most varieties of) English lack. English fills in with additional central vowels or an unrounded back vowel. WALS lists German as the language with the largest vowel inventory of their sample (which includes less than 10% of the world's languages). So there you have it: the vowels present in English, but not in German.
  • The most common sound not used in English? [ç~x~χ] is pretty common, but not used in most varieties of English. The same is true of trilled R's. [x] is an easy sound. Trills are harder (you have to find the right combination of air pressure and closing force to avoid both the fricative and the stop; if the articulator is too strongly damped, this may require excessive force), but most people manage the alveolar or the uvular trill, which tend to exist in free variation.
PiusImpavidus (talk) 19:51, 7 June 2024 (UTC)[reply]
Ah yes the Spanish rr, that wasn't hard for me. One thing that was hard was saying Xinjiang. A native speaker of one of the Chinese dialects/languages said it to me, I tried to copy ASAP, repeat like 7 times, the later max effort exact copy attempt tries sounded to me like they should be well within normal variation but were wrong every time. Sagittarian Milky Way (talk) 23:30, 7 June 2024 (UTC)[reply]
When I started learning Chinese, the hardest Putonghua initial for me to make was the [ɻ ~ ʐ] rhotic, of course. I was murmuring 热水 rèshuǐ into my phone for hours at a time trying to get the voice recognition to accept me. Remsense 00:43, 8 June 2024 (UTC)[reply]
A change in orthography might not necessarily indicate a change in pronunciation. When the Elder Futhark was replaced with the Younger Futhark in Scandinavia, the simplification caused a lot of minimal pairs in written form, whereas the phonetic inventory itself is believed to have increased. 惑乱 Wakuran (talk) 12:33, 7 June 2024 (UTC)[reply]

The table at "ISO basic Latin alphabet#Alphabets containing the same set of letters" is using a weird and extraordinarily technical definition: it allows diacritics, ligatures, and multigraphs, but only as long as they do not constitute distinct letters. Thus it is stated that Malay and Indonesian "are the only languages outside Europe that use all the Latin alphabet and require no diacritics and ligatures", and no mention is made of languages like Zulu, which makes use of the 26 unmodified letters from A to Z and nothing else but has multigraphs counting as distinct letters. --Theurgist (talk) 23:35, 7 June 2024 (UTC)[reply]

Wouldn't that cause their alphabet to have 27 or more letters? Spanish dictionaries have/had? an LL section I think with llama somewhere between luna and muón in alphabetical order instead of before both. And cañon would be after cantar in alphabetical order. Sagittarian Milky Way (talk) 23:51, 7 June 2024 (UTC)[reply]
If you're only interested in collation, then yes. But appearance-wise, Zulu texts only have the 26 basic letters and nothing else, while in French (which is in the table) you will also see a lot of diacriticized letters and a ligature or two. --Theurgist (talk) 00:06, 8 June 2024 (UTC)[reply]
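The collation point raised above (an LL section placing llama between luna and muón, and cañon sorting after cantar) can be illustrated with a short Python sort key. This is a sketch under stated assumptions, not a description of any particular dictionary: it assumes the older Spanish convention in which ch, ll and ñ count as separate letters, it ignores accents on vowels, and the names LETTERS, strip_accents and traditional_key are made up for illustration.

```python
import unicodedata

# Assumed letter order for the older convention: ch after c, ll after l, ñ after n.
LETTERS = [
    "a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k", "l", "ll",
    "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
]
RANK = {letter: i for i, letter in enumerate(LETTERS)}


def strip_accents(word):
    """Lowercase the word and drop accents from vowels, but keep ñ intact."""
    out = []
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch == "ñ":
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


def traditional_key(word):
    """Sort key that treats ch and ll as single letters."""
    s = strip_accents(word)
    key = []
    i = 0
    while i < len(s):
        pair = s[i:i + 2]
        if pair in ("ch", "ll"):
            key.append(RANK[pair])
            i += 2
        else:
            # Characters outside the list sort after z.
            key.append(RANK.get(s[i], len(LETTERS)))
            i += 1
    return key


print(sorted(["muón", "llama", "luna"], key=traditional_key))
print(sorted(["cañon", "cantar"], key=traditional_key))
```

With this key, llama sorts after luna and before muón, and cañon sorts after cantar, whereas a plain sort on the 26 basic letters would put llama first.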