Wikipedia talk:Naming conventions (standard letters with diacritics)/Archive 3
This is an archive of past discussions about Wikipedia:Naming conventions (standard letters with diacritics). Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Two issues
I see two distinct issues here. One, which seems to be the topic of lots of heated debate on this page, is the choice between a plain letter and a letter with a diacritic. The other is the use of diacritics in titles which are transliterations from some other writing system.
Plain vs Diacritic
Right now, this is the lesser issue. My personal feeling is that we English-speakers need to get used to the idea that our language is like the Borg: we assimilate just about every single thing we come into contact with, and we add its uniqueness to our own. We eagerly soak up words with spelling that makes little or no sense in our own twisted orthography, such as Czech, cappuccino or bon appétit. Sometimes the new loanword comes with baggage (an accent) such as café. Sometimes the absence of an accent does not hamper intelligibility, as in Guantánamo, and sometimes it creates a problem, as with passé. Maybe the printed page looks more "pretty" without any accents at all, but we're reading English, here, not Latin. On accents, then, my advice is Get used to it.
Printability
This is an embarrassing problem. We have fancy Unicode-enabled browsers, and we have installed at least one Unicode font on our super-duper 21st-century computers. Still, though, some characters refuse to display properly. What characters are they? The ones I keep running into are the diacritic-adorned letters used for transliterating foreign alphabets such as Arabic and the Indic scripts. See Kharosthi, or more specifically, the page it redirects you to. The problem with Arabic is the ArabDIN character-encoding. My browser [IE 6.0] just doesn't support it. I get stupid little boxes all over the page. Having them in the article text section is bad enough, but the article title ...? noooooooooooooooo. WHEN someone updates the browsers, we can think about the nice little choices between accented characters and their unadorned counterparts. Until then, we REALLY need a solid, citable guideline about putting unprintable characters in article titles.
--Cbdorsett 12:37, 8 January 2007 (UTC)
That's a problem that needs to be addressed with computers, not one that should be allowed to restrict Wikipedia. Kharoṣṭhī looks fine to me, no boxes there. Perhaps it's time to update the ol' computer. --LakeHMM 04:25, 17 January 2007 (UTC)
- That's fine for you; it's not true on this IE computer, which shows Kharo[][]hī (I use [] for the annoying little square box). WP should be widely accessible and informative; that's what it's for. Septentrionalis PMAnderson 01:25, 2 February 2007 (UTC)
- Renders fine on my Windows 2000, IE6 and XP IE7 boxes, as well as with Firefox 2. Windows 2000 is essentially dead now and is currently in extended support only. A few months back I confirmed that macrons (yes, a slightly different issue) displayed fine on Windows 98, IE5. That OS is fully unsupported now. I do not have access to that system any more to check this example though. Vista has all of the Uniscribe and foreign-language features enabled by default now. There are many free operating systems as well that often run well on very old systems. In the past, I used to view websites over telnet. Images would not render, but I could hardly expect them to be replaced by ASCII art. Missing glyphs can be solved by installing new fonts. Diacritics are often necessary. Bendono 03:15, 2 February 2007 (UTC)
- We should not have to install fonts for all possible glyphs to use the English Wikipedia. People who want to use one of the other Wikipedias will expect to have to have support for the characters in its language; that should not be a prerequisite to the use of the English Wikipedia.
- Furthermore, the other characters used in the other languages most closely related to English have been supported by all browsers, all computers set up for use in English, since back in the DOS days and even before the Internet existed. It is, of course, quite natural and to be expected that English is more likely to borrow from, and more likely to use without modification, the various characters used in the languages which are most closely related to it, in the same family of languages. Gene Nygaard 03:34, 3 February 2007 (UTC)
- My Windows 2000 and Windows XP systems were both rebuilt within the last month. To the best of my knowledge, I have not installed any special fonts. Everything just displays fine, so I have not had the need to. When I tested Windows 98 for macrons (only) a few months ago, it was most definitely the default install. I do not know how some people are configuring their systems to exclude some of these fonts. However, there are many freely available to fix that. Bendono 03:52, 3 February 2007 (UTC)
- But not necessarily the font face those people like to use. There is a whole lot more to the aesthetics of choosing a font than the ability to display some obscure character that the users would never use themselves. Gene Nygaard 16:57, 3 February 2007 (UTC)
- Not sure about Linux, but on Windows systems that is irrelevant. Even if the currently selected font does not contain the required glyph to display a character, font linking--a feature of MLang--will automatically search other fonts for an appropriate glyph. Square boxes mean that the glyph, including from other fonts via font linking, cannot be found. Bendono 17:27, 3 February 2007 (UTC)
- And all that notwithstanding, over 99.99% of Wikipedia users will still get square boxes in some Wikipedia articles, and my rough guess is that a majority will get square boxes (or is it some of that %hex number gibberish instead?) in some article names (not counting redirects).
- I see, however, that for some of them which show up as boxes on my computer when I look at the Special:All pages page, they show up as glyphs when I follow the link to the pages. So I dispute your claims about this font substitution always taking place. It clearly does not, as I have seen with my own eyes (I'm using Windows). Gene Nygaard 19:06, 3 February 2007 (UTC)
Ælfric and other Old English names
- Section (and subsections) moved to Wikipedia talk:Naming conventions (use English)#Ælfric and other Old English names by Francis Schonken 23:39, 8 January 2007 (UTC)
Change article name and scope?
- Subsection moved to Wikipedia talk:Naming conventions (use English)#Change article name and scope? by Francis Schonken 23:39, 8 January 2007 (UTC)
Æ/æ/Œ/œ - rules proposal
- Subsection moved to Wikipedia talk:Naming conventions (use English)#Æ/æ/Œ/œ - rules proposal by Francis Schonken 23:39, 8 January 2007 (UTC)
My Opinion: Be Specific
I say be as specific as possible. Somebody else criticised the idea that names with more diacritics would be more correct, but, unless they're being added for no reason, they usually are. For example, with pinyin, it specifies which word exactly, as the tone is there. Leaving the tone marks out is like leaving letters out of English words. For example, Norwegian doesn't use the letter c, but they don't call their article on Fidel Castro "Fidel astro", and they don't replace the letter with another one that looks similar, like making it "Fidel Oastro". They put it in there because it's missing a vital part of the word if you leave it out or change it. Same with other diacritics and "non-standard letters" that make letters make different sounds. Examples are Lech Wałęsa and Anders Jonas Ångström. When mentioned in English media, they usually leave the diacritics out, but we leave them in because that's how the names are spelled.
I know this has been discussed many times elsewhere in the page, but I'm just adding my thoughts. Don't feel like you need to repeat things you've said elsewhere on this page or on Wikipedia. I've heard the other arguments, but I think it's best to be specific, accurate, and proper, and redirect the less accurate pages to the ones with diacritics. --LakeHMM 04:23, 17 January 2007 (UTC)
- One difference is that the Norwegian alphabet does still officially include C, Q, W, X, and Z among its 29 letters, even though they are no longer used in native words that are not proper names (some personal names, and a few place names, and other proper names do still include some of them). Gene Nygaard 03:26, 3 February 2007 (UTC)
- And one reason for that, of course, is that they were used in the past in Norwegian words.
- Note also that neither of the two Norwegian Wikipedias has any squiggles on their articles about România: They are at and nn:Romania, just as our English one is at Romania (the earlier link is a redirect), even though the Romanian Wikipedia article is at ro:România. Of course, the Norwegian Wikipedias don't have an article at "Norway" either; but you may not realize that they also do not have it under the same name in the two Wikipedias, but rather each has it under separate spellings. Gene Nygaard 07:20, 3 February 2007 (UTC)
The United States of America's Wikipedia
I'd like to address a topic that comes time and again in this discussion, some native English speakers referring to themselves as "us" versus non-native English speakers as "them". The thing is, this is the "English Wikipedia", not the "United States of America Wikipedia" nor the "(British) Commonwealth of Nations Wikipedia" and certainly not the "Native English Wikipedia". As far as I know, the English Wikipedia doesn't distinguish non-native English speakers from native English speakers when it comes to editing and discussing here, all of us are foreign to each other but we are all editors period. That means that users such as myself aren't foreigners sticking our noses where we aren't invited but that we are editors to the English Wikipedia discussing the issues which pertain to the English Wikipedia.
In this discussion there have been references made to "foreign" editors coming to the English Wikipedia via any of its sister Wikipedias. This isn't accurate, as the English Wikipedia is by far the most popular of the Wikipedias and many of the non-native English speakers, like myself, knew of Wikipedia through its English version and we aren't even active editors at our native language's Wikipedia. This doesn't show a lack of allegiance towards our native culture, as Wikipedia isn't a bastion for any specific culture and no one owes any kind of allegiance towards any of the Wikipedias. Just as the English Wikipedia isn't of the native English users, neither the French, Deutsch, Spanish, etc. Wikipedias are of or for the native speakers of those languages exclusively. For historical reasons the English language is spoken by millions of people other than British, Americans and Aussies, so maybe that's why this discussion concerning foreign names arises more frequently here than in other Wikipedias where the population is more homogeneous.
Finally, those of you who want the native English speakers to be "left alone" to decide on Wikipedia's policies, guidelines etc. fail to measure the consequences of what you're asking for. The reason why I and many non-native English speakers became editors to this Wikipedia is because this one totally rocks when compared to the other ones; it has thousands more articles and the topics vary immensely... why? It's not because the native English speakers are more prolific writers than those of other languages, it's in large part due to the fact that thousands of editors from Asia, Latin America, the Middle East, Africa and the non-English speaking European countries contribute to this Wikipedia. Look at today's featured article for instance, a biography on Sir Syed Ahmed Khan Bahadurسید احمد خان بہا در , do you think the main contributors to that article were John Johnson and Jack Jackson? Rosa 23:20, 3 February 2007 (UTC)
- So, what do you have against John Johnson and Jack Jackson? Is there some particular reason why you think they are especially stupid and incompetent, and incapable of contributing to this article? Besides, who knows. In fact, it may well have been them; a great many of the contributors to that article are completely anonymous IP addresses; most of the others just have the relative anonymity of their Wikipedia user names, plus whatever you might want to believe about whatever information they might choose to present about their identities on their talk pages. Gene Nygaard 01:26, 4 February 2007 (UTC)
- Whoa, whoa, Gene, c'mon. I am totally on your side in this debate, but if you thought that her point was that Americans or English speakers can't write or are ignorant, then you need to cool off a moment. All I think she was saying is that people in foreign lands are a lot more likely to write knowledgeably about their native countries than the majority of people who haven't been there. I write on articles dealing with places I've lived, I think it goes for all of us. Unschool 02:39, 4 February 2007 (UTC)
- Having said that, I was much grieved today to have written a lengthy response to Rosa's comments, only to have my browser freeze, losing my comments. I really must learn to write my stuff with Word before posting. I'll post an abbreviated version later. Unschool 02:41, 4 February 2007 (UTC)
- Rosa wrote a nice little essay here; a little weak on some of its factual basis, but expressing some good sentiments most of us could agree with. Then she threw it all out the window with a bit of improper stereotyping on her own part. Gene Nygaard 04:02, 4 February 2007 (UTC)
- wow...this is as close as you've ever come to giving me a compliment on any of the issues we have discussed these couple of weeks Gene...lol...and Unschool gave that sentence its proper interpretation, sans the acrid twist you season your comments to me with. I have nothing against John or Jack; they could be Dr. Johnson or His Honorable Judge Jackson for all I know. All I meant was that it's more likely that someone like Atif nazir created this article on Sir Syed Ahmed Khan Bahadurسید احمد خان بہا در; just as it's more likely for a Jack Jackson to write an excellent article on George Patton or Benjamin Franklin or for a Juan García to write the bio of Simón Bolívar (or for a Rosa Martínez to write an article on Armando Manzanero for that matter lol).Rosa 20:14, 5 February 2007 (UTC)
Redirects don't just happen! Part I, the easy test
Several editors have expressed sentiments along these lines. But what is this, anyway? Some ivory tower speculation by people who have never actually gone out and even looked at what we have in Wikipedia?
- "Redirects make the issue of difficulty in visiting or linking to the article immaterial" Deco, 7 Feb 2006
- "Articles with Czech diacritics are readable in English, you only need a redirect becouse of problems with typing." --Jan Smolik 7 Feb 2006
- "Czech names: almost all names with diacritics use it also in the title (and all of them have redirect)." Pavel Vozenilek, 8 Feb 2006
- "I hope that we have finally reached the agreement that linking through redirects is not a relevant issue," Zocky, 28 Jun 2006
- "This is particularly the case in Wikipedia as we have redirects which allow us to use correct characters in titles without inconveniencing our visitors." Oldak Quill, 28 Jun 2006
- "A person "won't know how to type the ł's"—a moot point because redirects handle all the variants." Vecrumba, 19 November 2006 (UTC)
Let's just assume you are somebody who had just read something about the following in an English-language newspaper or magazine. So you put one of these into the box on the Wikipedia page and hit "Go", ending up at the page creation notice.
But unlike the typical user, who just assumes that Wikipedia doesn't have anything about it, you are "smarter than the average bear!". You realize that it might be there, but under a different spelling. Maybe you've heard me when I have pointed out that redirects don't just happen!
So you use all the tools in your little bag of tricks. Then you create the missing redirects, so that those who follow after you don't have to go through the same thing. Some of the following are missing redirects which include diacritics (usually in conjunction with missing redirects without diacritics), and there are even a few that don't deal with diacritics at all.
- Josip Kras
- Basil Stoica
- Florent Prevost
- George Mihaita
- Paco Leon
- Nurgul Yesilcay
- Dezso Foldes
- Moment of force
- Ciftelia
- Vasterleden
- Tresovice
- Ante Radonic
- Francisco Ibanez Talavera
- Francisco Ibanez
- Stephane Matteuzzi
- Karen Oppegard
- paski sir
- Sutlu Nuriye
- Duved
- Daniel Tchur
- Konrad Fialkowski
- Entremes
- Paidi O Se
- Paidi Ó Sé
- Paidi O'Shea
- Erwin Proll
- Erwin Proell
- Miroslav Rozic
- Robert Strak
- Hannu Juhani Nurmio
- Hannu Nurmio
- Visa Makinen
- Bermejo Pass
- Julieta Colas
- Julieta Colás
- Julieta Colás Márquez
- Julieta Colas Marquez
- Jindrich Backovsky
- Edvaldo Valerio
- Chatenay, Ain
- Chatillon-en-Michaille
- Chatillon-la-Palud
- Marian Tomasz Golinski
- Glenn Caron
- Sater Municipality
- Torshalla
- Gunnar Lindstrom
- Manuel Menendez
- Ricardo Pio Perez Godoy
- Jean-Pierre Clement
- Ruben Pellanda
- Umut Guzelses
- Osman Kursat Duman
- Janusz Kolodziej
- Jacek Koscielniak
- Miroslaw Kozlakiewicz
- Gordan Kozulj
- Ahmet Koc
- Grzegorz Kolacz
- Lech Kolakowski
- Robert Kolakowski
- Viktor Kovacs
- Thomas Floegel
- Aoua Keita
- Luis Marin Munoz
- Yvon Cote
- Yvon Coté
- Edin Curic
- Gustaf Soderstrom
- Besiktas Cola Turka
It will be interesting to see how long these might sit here without turning blue instead of red. All of the above are ones where I have already called the missing redirects to the attention of editors in my edit summaries, but nonetheless they still haven't been fixed. So you can always find the article to which any of them belongs by looking through my contributions list for the past two months. Gene Nygaard 13:24, 5 February 2007 (UTC)
- The problem is that the "Go" button is much less user-friendly than it should be. Search engines like Google or Yahoo handle special characters automatically, are case insensitive and even make suggestions for misspelled words. The Go button does nothing of that, but many users seem to expect it to do so and therefore forget to create necessary redirects when writing articles. -- memset 18:08, 5 February 2007 (UTC)
- There are a great number of ways in which characters with diacritics give different results from those without in Google and Yahoo, and even more in most other search engines.
- And I, for one, would not want the "Go" button to work any differently than it does now. The "Search" button, of course (and still defaulting to a search if the Go button doesn't find what you put in) is a different story, but I'd bet you about 100:1 that what you describe never crosses the minds of most of the people who fail to create redirects. Gene Nygaard 22:16, 5 February 2007 (UTC)
- Here is a specific example of how diacritics do affect searches, memset. For example: Google <Ngobe-Bugle site:en.wikipedia.org>[1] finds the article at Ngöbe Buglé redirected to from Ngobe Bugle (note, no hyphens in either), but it does not find the separate article about a different subject at Ngöbe-Buglé (which has been on Wikipedia for 11 months, so Google certainly must have indexed it by now). It might have, if the two articles had been properly crosslinked and both redirects existed--but of course, they are not and do not. [NOTE: The spelling used consistently in the text of the Ngöbe-Buglé article ever since its inception is Ngöble-Buglé, but whether that is an error in the text or an error in the article name, or simply a failure to list variant spellings and create proper redirects from them as well, is irrelevant to the point at hand. The article has always been at the current name as well, so it doesn't involve an improper move.]
- Note also that Googling for <Ngöbe-Buglé site:en.wikipedia.org> (differing from the above search only in the inclusion of the diacritics) does, of course, find both articles, with and without the hyphen.
- That is just one of the ways diacritics affect search engine results. Another harder to determine factor is the effect on the weighting of the results; it often makes some difference on how high up in the search results a particular hit will appear.
- Furthermore, a search on Google including a word with diacritics, and excluding the same without diacritics (or vice versa) is almost never empty, as it would be if they were treated as identical for its search purposes. And the two directions give markedly different results.
- Now put the same Ngobe-Bugle into the "Go" box on the Wikipedia page. It doesn't go to an article, but switches to a search. That search does find both Ngöbe-Buglé and Ngöbe Buglé, unlike the Google search which only finds the one without the hyphen.
- Not what you expected, is it, Memset? So before you continue bad-mouthing the Wikipedia search engine, keep in mind that all the different search engines have their own little advantages and disadvantages; there is no magic bullet that makes everything work best for everyone, every time. Gene Nygaard 17:05, 7 February 2007 (UTC)
- But search engines don't just stick to the exact search string, they also search for different diacritic variants: Google searches for Władysław, Dvořák, "Okopy Świętej Trójcy", or Düsseldorf give roughly the same results with and without diacritics. That the results aren't exactly the same and are weighted differently is intended and depends on the user's location and interface language, according to this. It looks like this doesn't work well for Ngöbe-Buglé.
- Wikipedia's full-text search isn't doing much better either, just compare its results for México and Mexico. It has just some advantages over Google because it sees all redirects (the redirect Ngobe Bugle isn't in Google's cache for some reason) and the wikicode of all pages (in Ngöbe-Buglé, "Ngobe-Bugle" appears only as a category sort key that is not part of the generated HTML page that Google sees).
- I'm not badmouthing anything, I'm just saying that the "redirects don't just happen"-problem would not exist with an improved "go" search. -- memset 00:31, 8 February 2007 (UTC)
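As a rough illustration of the kind of "improved go" lookup described above, here is a minimal sketch in Python. It is not how MediaWiki works; the function names (`fold`, `build_index`, `go`) are hypothetical, and the folding rule (Unicode NFD decomposition followed by dropping combining marks) is an assumption about what "handling special characters automatically" might mean. Letters such as ł or ø have no canonical decomposition, so this simple fold would leave them unchanged.

```python
import unicodedata


def fold(title: str) -> str:
    """Return a case-insensitive, mark-free key for a title.

    Decomposes characters (NFD), drops combining marks, then casefolds.
    Letters without a decomposition (e.g. 'ł') pass through unchanged.
    """
    decomposed = unicodedata.normalize("NFD", title)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def build_index(titles):
    """Map each folded key to the list of real titles that share it."""
    index = {}
    for title in sorted(titles):
        index.setdefault(fold(title), []).append(title)
    return index


def go(query, titles, index):
    """Exact match first (current behaviour); otherwise try the folded key."""
    if query in titles:
        return [query]
    return index.get(fold(query), [])
```

With such an index, typing "Ngobe-Bugle" would reach Ngöbe-Buglé even if nobody had created the redirect; if several titles fold to the same key, the caller would still need to show a choice rather than jump straight to one page.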
- Gene, the issue with redirects works both ways. If you create an article without diacritics, then the diacritic versions will redlink without proper redirects. No matter what side of the issue you are on, both sides need redirects.
- As you have said, you have known about the above redlinks for at least two months. In all of that time you continue to complain but why have you not created redirects? In any case, they were all fixed within hours of your posting. Also, a significant number of your links have nothing to do with diacritics. Bendono 23:45, 5 February 2007 (UTC)
- You are to be congratulated for showing that there is indeed one Wikipedian who will fix these problems when they are called to his attention. I don't know if you had any help with some of these, but I think (at least hope) that others would also have done so if you hadn't beaten them to it.
- However, it is not exactly a two-way street. When we don't have the English alphabet spelling in the English Wikipedia, that is indeed something that is missing. But the converse is not true; foreign spellings may be a nice-to-have alternative, but they aren't a required element that is missing.
- There is an even more significant reason why it is not a two-way street, however. In most cases, each article name will map into one specific English-alphabet version, some into two alternatives. Since multiple occurrences of the same character will normally all be mapped the same way, almost all article names will map into no more than four alternatives.
- So if somebody tries to use a version with diacritics and it doesn't work, they know what they need to try next.
- But the same is not true for going in the other direction. Each letter of the English alphabet can be mapped into by many different possible versions with diacritics. For some article names without any diacritics, there may be hundreds or thousands, even millions, of possible ways that it could be written with diacritics.
- So if somebody tries the version without diacritics and it doesn't work, the situation is entirely different. They usually do not know what they need to try next. Or, in other cases, they maybe do know what they want to try next; problem is that they don't know how to make the squiggly character they want to make.
- Note that if you are just reading an article and want to put something into the "Go" box, you don't even have that somewhat helpful little edit-screen crutch of having a half-page of squiggly letters you can wade through looking for the one you want to use, after you have gone to the additional trouble of increasing your font size to get them big enough so that you can tell one little squiggle from another one that looks a lot like it. (In a few cases even that doesn't help, of course, and they remain indistinguishable no matter how large you make them.) Gene Nygaard 04:57, 6 February 2007 (UTC)
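The asymmetry argued above can be made concrete with a small counting sketch in Python. Both lookup tables are hypothetical and deliberately tiny; they are assumptions for illustration, not a complete list of any language's letters. Going from diacritics to plain letters, each distinct character has one or two plain spellings and every occurrence is mapped the same way, so the number of alternatives stays small. Going the other way, every plain letter multiplies the number of candidates.

```python
from itertools import product

# Hypothetical table: plain spellings a reader might try for each letter.
ASCII_OPTIONS = {"ö": ["o", "oe"], "ü": ["u", "ue"], "é": ["e"], "ł": ["l"]}

# Hypothetical table: forms (including the plain one) each letter could stand for.
DIACRITIC_OPTIONS = {
    "a": "aáàâäãåąā",
    "c": "cçćč",
    "e": "eéèêëęě",
    "o": "oóòôöõøő",
    "s": "sśšş",
    "u": "uúùûüůű",
}


def ascii_alternatives(title):
    """List the plain spellings of a title, mapping each distinct
    character the same way everywhere it occurs."""
    distinct = [ch for ch in dict.fromkeys(title) if ch in ASCII_OPTIONS]
    results = []
    for choice in product(*(ASCII_OPTIONS[ch] for ch in distinct)):
        table = str.maketrans(dict(zip(distinct, choice)))
        results.append(title.translate(table))
    return results


def count_diacritic_candidates(plain):
    """Count the spellings a plain title could stand for, letter by letter."""
    total = 1
    for ch in plain.lower():
        total *= len(DIACRITIC_OPTIONS.get(ch, ch))
    return total
```

For a name with one ö, `ascii_alternatives` yields exactly two spellings (the "Proll"/"Proell" pair in the list above), while `count_diacritic_candidates` grows as a product over every letter of the plain name, which is why a reader starting from the plain form usually cannot guess what to try next.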
Part 1b of the test
But now let's get into exactly what you have accomplished, or not accomplished, in jumping right in and fixing those redlinks.
First of all, it was mostly invisible work.
Even on this talk page, where people can see that the links are no longer red, they don't see who fixed them. But that isn't the real problem.
More significantly, this is invisible because absolutely nothing about it shows up on the watchlists of anybody following these articles. The people previously involved with the article will not have any way of knowing that the problem has been fixed, let alone that a problem even existed in the first place. There is nothing on the talk page to tell them about the existence of the problem. In this case there is only one thing that can make them aware of the existence of the problem, and that is something that most articles where this problem has been fixed do not have: my earlier edit summaries pointing out the fact that redirects were indeed missing.
Nobody is going to stop creating these problem articles without redirects, just because you jumped in and fixed those. They are going to remain blissful in their ignorance. The ones who remain active on Wikipedia will keep creating the same problems over and over again, as will a new crop of editors who don't ever stumble across anything telling them of the need to fix this problem.
Unfortunately, another problem is that far too many of the creators of these problems appear to have been on kamikaze missions, and have died or otherwise disappeared since the blitzkrieg attack in which they created a whole bunch of unreferenced stubs, often with many other problems in addition to these missing redirects and the fact that they are not properly sorted in categories.
For example, we can see what happens when we go to look at the other entries in the same categories (ignoring, for now, stub categories and birth/death/dead/living categories) as the ones you fixed above. We still find missing redirects to other articles in those same categories, some likely created by the same individuals who created the articles above without redirects, who remain ignorant of the problem and will likely continue doing so in the future. For example, in categories corresponding to the entries of the numbers above:
- 1. Otokar Kersovani
- 2. Sandor Kanyadi
- 3. Noel-Antoine Pluche
- 4. Stefan Banica, Sr.
- 5. Eduardo Gomez
- 6. Ugur Yucel
- 7. Imre Konig
- 8. not dealing with diacritics
- 9. Scoil na gClairseach
- 10. Noykkio
- 11. Vyseherad
- 12. Matija Divkovic
- 13-14. Miguel Angel Martin is a bluelink. But it is a different article, about a golfer b. 1962, from the one at Miguel Ángel Martín (b. 1960). Neither of these two articles have a disambiguation link to the other.
- 13-14. Jose Escobar Saliente
- 15. the only other one which didn't involve diacritics
- 16. . . . Enough! You should have the idea by now.
My calling attention to these problems in edit summaries wasn't a resounding success, but it did result in a significant number of redirects being added by others before I listed the ones that hadn't been fixed here. So if you have any bright ideas on other ways to call the existence of this problem to the attention of those editors largely responsible for creating it, and to get them to stop doing so in the future, have at it and report back to us on what you tried and on how successful it was. Gene Nygaard 05:26, 6 February 2007 (UTC)
- Unless someone else takes care of them first, I'll fix these tonight. I'm at work now, so it will not be as fast as last time. Gene, why not help out a little? You know what needs to be fixed and you continue to complain about it. You seem to care about Wikipedia, and yet these tests seem so counterproductive. Bendono 05:41, 6 February 2007 (UTC)
By the way, it wouldn't be that difficult to create all those missing redirects automatically using a bot. Just go through pages with non-ASCII titles, check if there is a redirect (or another article that contains a link to the current article, e.g. a disambiguation) at the "non-squiggly" version of the title, and create the redirect if it is missing. If there already is a different article (like that Miguel Ángel Martín example), add the article to some list so it can be fixed manually. -- memset 10:46, 6 February 2007 (UTC)
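The procedure just described could be sketched roughly as follows, in Python. This is only a planning step over an in-memory table, not a working bot: the `pages` dictionary (title mapped to its redirect target, or `None` for a real article) and the function names are hypothetical, and the sketch simplifies the proposal by sending every existing non-redirect page at the plain title to the manual list, without checking whether it is a disambiguation page that already links to the article.

```python
import unicodedata


def strip_diacritics(title):
    """Drop combining marks after NFD decomposition ('Vyšehrad' -> 'Vysehrad')."""
    decomposed = unicodedata.normalize("NFD", title)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def plan_redirects(pages):
    """Decide which plain-title redirects to create.

    pages: dict mapping each title to its redirect target,
           or to None if the title is a real article.
    Returns (to_create, needs_review):
      to_create    - dict of missing plain title -> article title
      needs_review - list of (plain title, article title) conflicts
    """
    to_create = {}
    needs_review = []
    for title, target in pages.items():
        if target is not None or title.isascii():
            continue  # only real articles with non-ASCII titles are considered
        plain = strip_diacritics(title)
        if plain == title:
            continue  # nothing this simple rule can strip
        if plain not in pages:
            to_create[plain] = title
        elif pages[plain] != title:
            # Something else already lives at the plain title.
            needs_review.append((plain, title))
    return to_create, needs_review
```

Run over a table containing Miguel Ángel Martín and the separate article Miguel Angel Martin, the pair would land in `needs_review` instead of being overwritten, which matches the "add the article to some list" step.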
- Done. Please double check Vyseherad. Vyšehrad is the real article. And Vysehrad already exists. If you think that Vyseherad should exist too, please add it yourself. Now enough with the games. I try to make Wikipedia a better place, but it is not a one-person job. Next time you notice some needed redirects, please add them. Bendono 13:26, 6 February 2007 (UTC)
- I suppose I could claim that was part of the test. But it was my typo that caused me to think it was a redlink. So substitute the redlink at Velka Chuchle for another article in the same category.
- I have, of course, created many redirects myself. But as I have pointed out, that doesn't solve the problem. I'll still do it myself on occasion. But I'm now more likely to try to call it to the attention of those who care if the article can be found or not, the ones who should be most interested in creating these redirects. Most of the time it is something that can remain hidden away for all I care.
- Of course, those articles with diacritics which do not have these redirect/disambiguation page/disambiguation line links also have a much higher probability of being missorted in their categories than the ones which do have redirects. So as you are out adding redirects, could you check for proper sort keys as well? Gene Nygaard 15:06, 6 February 2007 (UTC)
Bot proposal
After some of my personal experiences with the de-diacriticized User:Piotrus/List of Poles, I am strongly in favour of a bot that would automatically create a de-diacriticized redirect for any article using diacritics in its name. What do you think about requesting such a bot?-- Piotr Konieczny aka Prokonsul Piotrus | talk 03:10, 11 February 2007 (UTC)
- A good idea. The only real problem I can see is that it might make compromise solutions on diacriticless forms harder; for example it would have created Wladyslaw II Jagiello, which couldn't be moved to from Jogaila.
- Three specifications:
- It should not create double redirects: John Charles Fremont → John Charles Frémont → John C. Frémont; maybe it should create parallel redirects in such cases, as Frémont is. Parallel redirects are apt to make the no-diacritics compromise harder.
- It should have a stop list. I don't care if we have an article on Ånd and And is a redlink; we should not create that redirect. Perhaps: if the Wiktionary article for the source exists, don't make it.
- Never more than one edit to a location. Septentrionalis PMAnderson 03:45, 11 February 2007 (UTC)
- Redirects with no edit history (i.e. just created and never edited) can be deleted during moves by non-admins (I think...). WP:RM has an easy way to deal with non-controversial moves requiring a deletion, so I don't think many problems would be generated by this. -- Piotr Konieczny aka Prokonsul Piotrus | talk 07:45, 11 February 2007 (UTC)
- About the stop list/Wiktionary lookup: I think a better way to prevent red links from being turned into wrong blue links is to have the bot check the what-links-here list before creating the redirect. If there are any red links, they should be checked by the operator before creating the redirect. -- memset 09:18, 11 February 2007 (UTC)
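The checks proposed in this thread (skip existing redirects, flag clashes with a different article, honour a stop list, avoid double redirects, and review incoming red links first) can be put together roughly as follows. This is a sketch under stated assumptions, not a real bot: `WikiSnapshot`, `final_target` and `plan_redirect` are hypothetical names, the site is modelled as plain in-memory sets and dicts, and no actual MediaWiki API is called.

```python
from dataclasses import dataclass, field

@dataclass
class WikiSnapshot:
    """Toy in-memory model of the site; all fields are hypothetical."""
    articles: set = field(default_factory=set)
    redirects: dict = field(default_factory=dict)  # redirect title -> target
    links_to: dict = field(default_factory=dict)   # title -> titles linking to it
    stop_list: set = field(default_factory=set)    # plain titles never to create

def final_target(wiki, title):
    """Follow redirects to the end so the new redirect is never a double one."""
    seen = set()
    while title in wiki.redirects and title not in seen:
        seen.add(title)
        title = wiki.redirects[title]
    return title

def plan_redirect(wiki, source, plain):
    """Decide on at most one action for the diacritic-free title `plain`.

    Returns ("create", target), ("manual", target) or ("skip", None).
    """
    target = final_target(wiki, source)
    if plain == source or plain in wiki.stop_list:
        return ("skip", None)
    if plain in wiki.redirects:
        return ("skip", None)            # a redirect is already there
    if plain in wiki.articles:
        # A disambiguation page that links to the target is good enough.
        if plain in wiki.links_to.get(target, set()):
            return ("skip", None)
        return ("manual", target)        # a different article occupies the title
    if wiki.links_to.get(plain):
        return ("manual", target)        # existing red links: operator checks first
    return ("create", target)
```

With such a plan, a redirect from John Charles Fremont would point straight at John C. Frémont rather than at the intermediate redirect, and And would never be created as long as it is on the stop list.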
Turkish dotted and dotless I
I've removed the following:
Turkish distinguishes between dotted and dotless I:
- dotted: İ/i
- dotless: I/ı
Where different from the I/i normally used in English, the specific Turkish characters can be used subject to what is explained in this guideline (although this is no "diacritic" issue strictly speaking), leading to, for example:
- Istanbul (Turkish: İstanbul) - For this city the version of the name starting with dotted capital İ is quite uncommon in English.
- Diyarbakır (with Diyarbakir as a redirect page) - in this case the version of the name with dotless lower case ı in the last syllable is fairly well spread in English too.
The reason is because, out of all articles about Turkish place names, Istanbul is the only one I can think of that doesn't use the correct spelling. If you see İzmir, Muğla, Eskişehir, Ağrı, Gümüşhane, Karadeniz Ereğli, etc. you will notice that they all use the diacritics. The way it read made it seem like some Turkish articles used the English spelling and some don't, but this is not the case. Khoikhoi 02:53, 6 February 2007 (UTC)
- The only change that really needs to be made is to remove the nonsense quibbling about what a "diacritic" is. For the purposes of this convention, it is no different from any other diacritic. It causes the very same problems of inaccessibility of information as any of the others. I reverted your move pending the outcome of this discussion.
- Furthermore, just because some articles violate the existing conventions is not a reason to grant them exemption from those conventions. Gene Nygaard 04:53, 6 February 2007 (UTC)
- What inaccessibility? And what do you mean by "some articles violate the existing conventions"? Are you suggesting that we should rename the articles to "Izmir", "Mugla", "Eskisehir", "Agri", "Gumushane", and "Karadeniz Eregli"? Are there certain conventions that say we should? Khoikhoi 04:57, 6 February 2007 (UTC)
- I agree to respect the Turkish spelling of the Turkish names. But why only Turkish and not Azeri? Why are İsmet İnönü's dots respected and not İlham Əliyev's (Ilham Aliyev)? Švitrigaila 16:51, 6 February 2007 (UTC)
- Probably the person who created the article just spelled it without the diacritic in the first place. The community is divided right now between those of us who want native spellings to be used in articles and those who want to use the Anglicised spellings. In fact, in this very page the second title which reads "Using diacritics (or national alphabet) in the name of the article" tells of a discussion which took place about this one year ago at the village pump regarding this subject. It's not clear to me what exactly was the outcome of that discussion but if the trends still hold true then the answer probably is that the community couldn't reach consensus. Rosa 17:30, 6 February 2007 (UTC)
- Not exactly. There can be a consensus on what can't be done, and no consensus on what must be. There was a long discussion about Azeri names. The Ilham Aliyev article was normally titled İlham Əliyev until several months ago, and there was a discussion about the place of the ə in Azeri names. A clear majority decided that the ə must be excluded... and decided on nothing else to replace it. And you know that if it is decided to forbid the only correct spelling without deciding (or even discussing) which one of the wrong spellings must be used instead, it all ends in a total mess. Article titles with Azeri names have constantly and chaotically changed names since this "decision". That's why I'm clearly in favor of keeping the Turkish spelling of the Turkish names... and the Azeri spelling of the Azeri names. Švitrigaila 00:00, 7 February 2007 (UTC)
- Just remember that it is not a misspelling to use the English alphabet when writing in English, and you shouldn't have a whole lot of problems. The point I'm making is that if you feel some substitution would be a misspelling in a foreign language, so what. You can put the article under the English spelling, and redirect to it from various other possibilities. Gene Nygaard 00:33, 7 February 2007 (UTC)
- To be more specific, in the particular case under discussion, this has been done by having it at Ilham Aliyev, which is not a misspelling, and with redirects from both the İlham Əliyev spelling you used and the İlham Aliyev you claim to be a misspelling. Gene Nygaard 01:01, 7 February 2007 (UTC)
- Yes, Khoikhoi, there is: WP:COMMON, and the policy which invokes it. Istanbul is usually so spelt in English, just like Rome or Nuremberg. Can you explain the recurrent claim that Turkish names are somehow exempt from general Wikipedia policy? Septentrionalis PMAnderson 17:16, 6 February 2007 (UTC)
- Exactly Khoikhoi...this project tries to specify Wikipedia's naming conventions on this very issue. The proposal is that the articles should be spelled as routinely used according to English sources, and when in doubt of what the English "common usage" (with or without diacritics) is, then diacritics should be avoided. So, one of the consequences of implementing this project would be that İzmir, Muğla, Eskişehir, Ağrı, Gümüşhane, Karadeniz Ereğli will have to be renamed "Izmir", "Mugla", "Eskisehir", "Agri", "Gumushane", and "Karadeniz Eregli" unless you come up with reliable English sources that use the names with diacritics. Rosa 17:30, 6 February 2007 (UTC)
- For several of these names, such sources probably exist, however. (And the Turkish wikipedia should, as it does, follow Turkish usage; even .) Septentrionalis PMAnderson 17:53, 6 February 2007 (UTC)
- Septentrionalis, WP:COMMON is not a policy or guideline. Khoikhoi 18:06, 6 February 2007 (UTC)
- It ought to be; but I meant WP:COMMONNAME, of course. Septentrionalis PMAnderson 18:46, 6 February 2007 (UTC)
- Khoikhoi, Turkish names get the same consideration as anybody else's. No more. Our primary consideration remains best known in English.
- What do I mean by accessibility? Are you totally oblivious to the world around you? What do you suppose has constituted a large portion of the discussions on this page? I'm talking about missing redirects, the associated redlinks and inability to find things with the "Go" button. I'm talking about articles that aren't in categories, because they aren't alphabetized the way they should be. And these screwball dotless i's and dotted I's are not one silly iota different from any of the other diacritics; there's no reason whatsoever for them not to be included within the scope of this project page. Don't be trying to remove them from there. Gene Nygaard 20:41, 6 February 2007 (UTC)
Better slightly late than never
- LOL...this project is surely past its prime. If there isn't consensus from the community for more than a year, shouldn't this proposal be dismissed or something? Rosa 16:57, 6 February 2007 (UTC)
- Well, as long as the discussions are ongoing, proposals are considered active. WP:NCGN was discussed for over a year before being 'promoted'. We need a policy on diacritics, although honestly I don't see why we cannot replace the entire page with 'they are allowed, use them' :) -- Piotr Konieczny aka Prokonsul Piotrus | talk 03:08, 11 February 2007 (UTC)
- As a reader and researcher, I find diacritics to be very helpful in my quest to at least have an idea how to pronounce a word that is from a language that is not my own. I recommend that we use diacritics in titles that are names of foreign origin and have the English variant of the name, if different in more ways than just the diacritics, in parentheses after the original name. In my opinion that simple task can be Wikipedia's contribution to the broader and subtle education of people who are not used to seeing anything other than un-accented English. Lemuela 06:46, 14 February 2007 (UTC)
- This is an argument for including the foreign spelling in the first line of the article; as we do. I don't on the whole think it's a very good argument: Most readers will not have any idea what pronunciation Stanisław Ulam or Paul Erdős represents, and if they did, it would be misleading; experience shows that the letters in question are pronounced in practice as if they had no diacritics. I prefer to present the valuable information that Stanislaw Ulam's name was adapted to English, but Paul Erdős's was not. Septentrionalis PMAnderson 19:30, 14 February 2007 (UTC)
- I am with Lemuela. I personally think that to use "Stanislaw Ulam" instead of "Stanisław Ulam" (with a redirect of "Stanislaw Ulam" to the proper name) is both a disservice to the reader/researcher and is misleading, at best. When a reader is properly redirected to the name that includes proper diacritics, the user is further educated as to the real name without argument of form, type, usage, or any other argument that "popularity contests" present within Wikipedia. Yes, most readers will not know what diacritic to use (if any); that is what a redirect is so useful for. As a geographer, I would rather have reference to the proper name in order to 'get it right the first time' than to wonder what the name really is (there is ongoing debate/clarification with regard to Mongolian names right now). Yes, one could argue that "Stanislaw Ulam" is correct in English - but actually it isn't. The proper translation, if we are proposing such usage, is "Stanley Ulam" - this, of course, takes away from the actual name and is the disservice that is being mentioned. Therefore, I think the best approach is to use such diacritics. Rarelibra 19:40, 14 February 2007 (UTC)
- I chose Ulam as an example, because he himself, his wife, his co-worker Jan Mycielski, and his colleagues all spell the name as Stanislaw in English, as his autobiography Adventures of a Mathematician attests. Septentrionalis PMAnderson 06:39, 15 February 2007 (UTC)
- That is not necessarily a valid argument. I shorten my first name and never use my middle name. I have done this for my entire life without exception. My family, friends, and colleagues call me this. When I was younger, I was not even sure about the correct spelling of my first name since I had never used it. However, I have had many, many issues because it does not match the name on my birth certificate. That was life in the US. Now I live in Japan. And the same problems continue. My passport only recognizes the spelling on my birth certificate, and so that is my legal existence here as well. Now I write my name in kanji (based on the abbreviated first name and no middle name). My friends and colleagues respect this. However, getting the government to recognize it is just as moot as in the US. I certainly have a preference over how my name is spelled; but I can not get it officially recognized. It's something to accept and move on. Ulam has an official, legally recognized name. Anything else is a preference. Bendono 07:13, 15 February 2007 (UTC)
- We are not an official organization, and it is our policy to name articles by what the subject is best known as, in English. Thus Jimmy Carter and Madonna (entertainer), although neither is the legal name of the subject, and never has been; and the second includes a disambiguator. I doubt Jimmy Wales is the name on his passport either. Editors who disagree with policy should argue on the policy talk page, not here; I do not see the strong consensus required to change it. Septentrionalis PMAnderson 18:16, 15 February 2007 (UTC)
We haven't called for a vote. Start up a vote for consensus and let's see what happens. Rarelibra 18:26, 15 February 2007 (UTC)
- Why? This obscure page cannot warrant a policy violation; that's what policy is for. Just another reason m:voting is evil. Septentrionalis PMAnderson 18:37, 15 February 2007 (UTC)
- First you state "... I do not see the strong consensus required to change it", next you state "... Why?" and quote a guideline (not policy). That is exactly why we need a vote, proper. Because you state you don't see consensus, so we will call for a consensus. Do things correctly. Rarelibra 18:41, 15 February 2007 (UTC)
- Voting is evil; and escapades like this are why. But WP:NAME#Use English words, and the following section (on common names) are policy. It cannot be validly changed or overridden here; go object there, and see how far you get. Septentrionalis PMAnderson 18:59, 15 February 2007 (UTC)
- You kill me. Voting is not evil. It is how consensus is established. Even your page of reference states It is important to note that these are conventions, not rules carved in stone. As Wikipedia grows and changes, some conventions that once made sense may become outdated, and there may be cases where a particular convention is "obviously" inappropriate. I like the 'see how far you get' comment. Is there a problem you aren't discussing, or are you always this crass? Rarelibra 19:09, 15 February 2007 (UTC)
- No, it is one way to demonstrate consensus, if it exists. This page alone demonstrates that Rarelibra does not speak for consensus. Voting tends to destroy a forming consensus; this is one reason (to quote policy again) Wikipedia is not a democracy. Changing WP:Name is the road to go here, and we will both see how far Rarelibra gets in doing so. I don't believe he will get anywhere; but let's see. Septentrionalis PMAnderson 20:25, 15 February 2007 (UTC)
I have no idea why it is you think that I am supposed to go gallivanting off into the direction that you suggest. The best part is when you speak in the third person, as if someone isn't in the conversation. It is an attempt to distance that person, psychologically. To see them as an object and not real. Whether or not I choose an action will be on my terms, 'andy, not yours. Rarelibra 20:29, 15 February 2007 (UTC)
- As PMA says, proposals are not decided upon by voting on them (as described on {{proposed}}, WP:POL and WP:HCP). Perhaps this page simply needs more advertising, or a workable compromise. If it is a simple restatement of existing practice, then dislike of that practice is not a very strong argument against a guideline. >Radiant< 13:44, 19 February 2007 (UTC)
- Unfortunately, this page is closer to being the opposite to the existing practice, as a quick look around on Wikipedia shows. (I'd point to specific pages to demonstrate this but WP:BEANS forbids that.) I agree with Radiant that if it were a simple restatement of existing practice then we wouldn't need a vote to move it up to policy. But that is a very big if. Stefán 15:23, 19 February 2007 (UTC)
- I don't think that pointing out examples violates BEANS. If this is a restatement of the opposite of existing practice, then voting on it is not going to accomplish anything either; instead, perhaps it could be reworded to show practice. Or we could deprecate the page and move on. >Radiant< 08:47, 20 February 2007 (UTC)
- Well you'd be amazed how suddenly people can become interested in rather obscure pages, such as Þorlákshöfn when they get mentioned with respect to naming conventions. The birthday cake at the top of this section was meant as an ironic suggestion that this page did not have much useful purpose. Stefán 17:53, 20 February 2007 (UTC)
- The best part is that they properly referenced the "Thorlakshafn" usage to go to the page with the diacritics. That is correct and extremely educational to any user. Rarelibra 18:03, 20 February 2007 (UTC)
It's obvious that (a) using diacritics in article titles is the standard practice and (b) that there's no consensus for dropping them. In addition, we have examples of reference works that include them (e.g. my 1989 Unabridged Webster's from Gramercy Books, New York, includes entries like "Čapek, Karel" and "Andrić, Ivo"). We also have plenty of low-key move wars going over this issue, and this proposed guideline is often cited as the reason for moves (e.g. [2]). I think it's time that this proposal be retired, and either marked as rejected or changed to reflect the actual practice. Zocky | picture popups 17:48, 19 March 2007 (UTC)
- Agree. - Francis Tyers · 17:52, 19 March 2007 (UTC)
- Agree, too. -- memset 18:54, 19 March 2007 (UTC)
- Yeah, Zocky got it right. This proposed guideline would only have worked if it had been de facto enforced by most people, which it wasn't. Khoikhoi 20:34, 19 March 2007 (UTC)
- I doubt Zocky read the page, which makes specific exception of those diacritics which are routinely used in English. Both Čapek and Andrić are; and are fully compatible with this page. Septentrionalis PMAnderson 18:32, 20 April 2007 (UTC)
- There should never have been an attempt to drop diacritics. Rarelibra 21:22, 19 March 2007 (UTC)
- Rarelibra, as usual, is in opposition to the whole policy of using English names at all. He should be ignored. Septentrionalis PMAnderson 18:32, 20 April 2007 (UTC)
- Once again, it is amazing to see an intellect like Pmanderson resort to personal attack. Immature, at best. Rarelibra 19:13, 20 April 2007 (UTC)
- Please, both of you, step back and shake hands. No need to quarrel over a rejected policy, really.-- Piotr Konieczny aka Prokonsul Piotrus | talk 20:02, 20 April 2007 (UTC)
- Well, I'm perfectly willing to shake hands, once; I do deny that it's been rejected. Septentrionalis PMAnderson 22:48, 20 April 2007 (UTC)
Rejected... really?
I wonder if we can salvage some parts that are either consensus or common practice. For example, it seems that we allow diacritics to be used on Wiki, in titles and articles. If this is the case, shouldn't we simply state it? Perhaps not in a separate policy, but make it clear in other policies that the common practice on Wiki is to use them? -- Piotr Konieczny aka Prokonsul Piotrus | talk 20:03, 20 April 2007 (UTC)
- Agreed. There is a very regular and accepted use of diacritics that can be documented. Rarelibra 20:04, 20 April 2007 (UTC)
- I'll also add that WP:POLICY clearly states that codification of current convention and common practice is a common way a policy is created.-- Piotr Konieczny aka Prokonsul Piotrus | talk 21:26, 20 April 2007 (UTC)
- ?? Rarelibra 21:43, 20 April 2007 (UTC)
- To quote in full:
- The codification of current convention and common practice. These are proposals that document the way Wikipedia works. Of course, a single user cannot dictate what common practice is, but writing down the common results of a well-used process is a good way of making policy.
- In large part, this page does document the way WP works; it has a definite emphasis, but what it says is that we use diacritics when English does, and when it doesn't, we don't. We say this elsewhere, so the adoption of this guideline is not urgent; which is why it hasn't been. Septentrionalis PMAnderson 22:52, 20 April 2007 (UTC)
- Let me ask you two questions about the following quote from the proposal in a nutshell: If it is not clear what "common usage" is, then the general Wikipedia guideline is to avoid use of diacritics in article titles.
- Would the wording If it is not clear what "common usage" is apply to cases when several reliable sources can be found which use diacritics and several reliable sources can be found which do not?
- If the answer to the first question is yes, would you say that in cases such as these, it really is general practice on Wikipedia to avoid the use of diacritics?
- Stefán 23:59, 20 April 2007 (UTC)
- Stefan - Pmanderson would have NO diacritics used whatsoever. But I agree with you - there are cases where reliable sources prove the use, which points to a "common usage" thereof. There is no harm in use, period. Rarelibra 04:27, 21 April 2007 (UTC)
- Diacritics of various kinds are in widespread and endemic use on the English Wikipedia. Any attempt to suggest otherwise is not fated to succeed. Conventions should document practice and, in practice, diacritics are used. --Stemonitis 06:20, 21 April 2007 (UTC)
- I fully agree; again, since I have done so just above. So does this page. Septentrionalis PMAnderson 20:46, 21 April 2007 (UTC)
- I don't quite understand that last, ungrammatical, entry. Pmanderson, there is no established practice of dropping accents where they belong, and you are the only person who has claimed in the last few days that this proposal hasn't been rejected. Every other editor considers it to be rejected. The consensus view is that it has been rejected. It has, therefore, been rejected. There is no practice of omitting diacritics, and no desire to restrict the use of diacritics. Bereft of life, it rests in peace; this is a Dead Proposal. --Stemonitis 21:03, 21 April 2007 (UTC)
- Elliptical, yes; ungrammatical, no: I fully agree with Stemonitis' previous post; in fact, I agree again, since I have done so before. This page also agrees with that post. Dropping accents where English does not use them is both practice and policy, and remains so, independently of this page. A handful of wilful editors cannot change that. Septentrionalis PMAnderson 21:24, 21 April 2007 (UTC)
- Can I ask Septentrionalis to reply to my questions above (and other editors not to put words in his mouth)? Stefán 06:20, 22 April 2007 (UTC)
- Stefan - Pmanderson would have NO diacritics used whatsoever. But I agree with you - there are cases where reliable sources prove the use, which points to a "common usage" thereof. There is no harm in use, period. Rarelibra 04:27, 21 April 2007 (UTC)
- Let me ask you two questions about the following quote from the proposal in a nutshell: If it is not clear what "common usage" is, then the general Wikipedia guideline is to avoid use of diacritics in article titles.
- To quote in full:
- ?? Rarelibra 21:43, 20 April 2007 (UTC)
- I'll also add that WP:POLICY clearly states that codification of current convention and common practice is a common way a policy is created.-- Piotr Konieczny aka Prokonsul Piotrus | talk 21:26, 20 April 2007 (UTC)
(left) That does require clarification. It depends on the numbers, and the reliability of attestation on both sides. For example, English texts can certainly be found using Nürnberg, but the usage of Nuremberg is both more common and more recognizable; and the article is, properly, at Nuremberg. I would take If it is not clear what "common usage" is quite narrowly, as applying to the occasional instance where the usage is fairly evenly divided, sufficiently so that it is unclear what "common English usage" is; usually because the subject is not widely mentioned in English; 3-2 may be 60%, but it is statistically meaningless. Here I think there is a tendency to avoid diacritics; two examples (both of which I know about because I did argue for diacritics and lost), are Aetius, not Aëtius, and Pilsen, not Plzeň. Septentrionalis PMAnderson 22:43, 22 April 2007 (UTC)
- Interesting examples. I'm actually tempted to pull a Þorlákshöfn here and move Aetius to Aëtius since that seems to be the general sense of the talk page discussion there. Indeed most of the articles the page disambiguates do have the diacritic. As for the Nürnberg/Nuremberg and Plzeň/Pilsen cases I think they're a bit different in that the (putative) English names differ from the native ones in more than diacritics. Haukur 00:24, 23 April 2007 (UTC)
- I'd like to point out, as supplementary evidence, that articles with names that derive from the Indian languages usually (but not always) drop diacritics from the article title, even when using them in the body of the article. Also, Chinese terms mentioned on Wikipedia rarely include the diacritics which indicate tone, except sometimes upon first use. My general impression is that this happens because the relevant languages are not typically written in the Latin alphabet for native communication; when the Latin alphabet is the main script of a language, there is usually a constituency which demands inclusion of all the diacritics (practice with regard to Vietnamese is inconsistent, though). --Nat Krause(Talk!·What have I done?) 00:34, 23 April 2007 (UTC)
- I'm not sure whether Nuremberg is that far away from this discussion: there are two central themes: English usage against local convention, and the fact that the local convention involves a diacritic. The first can involve languages which are primarily written in a non-Roman alphabet (Ukrainian, as with Kyiv; Serbian; Tibetan), and some of those have diacritics also. Septentrionalis PMAnderson 04:05, 23 April 2007 (UTC)
- Thanks for the reply. For the Nuremberg example, I would agree that it seems pretty clear that this is the common English usage. To justify that we should use Nuremberg as the title, I would have appealed to the "use English" convention, which I certainly agree with. I was under the impression that this proposed guideline only dealt with cases such as Ubeda/Úbeda (flashback horror), where the suggested names are identical except for the inclusion of diacritics.
- However, you say that that may not be such an important distinction, so I would like to discuss the other example you take in a little more detail. You mention a Czech city. Looking at the Category:Cities and towns in the Czech Republic I notice that there seem to be of the order of a hundred towns and cities there, whose articles use diacritics, and it seems that all of them follow the rule that if the Czech name is used (because there is no more common English name) then it has the appropriate diacritics. My guess is, that in each case you could find support for using the diacritics or leaving them out, eg. googling suggests some English sites write Nový Jičín while others use Novy Jicin. Therefore, I would say, all, or at least, most, of these names fall into the category If it is not clear what "common usage" is, and with this understanding of that phrase, it seems very far from the truth that it really is the general Wikipedia guideline [] to avoid the use of diacritics, when faced with these 100 or so examples. Stefán 17:49, 23 April 2007 (UTC)
- Unless in English texts Funny Foreign Squiggles are commonly used, we should not use them in article names. This is in the current naming conventions and despite some writing articles and ignoring the conventions, we should keep it that way.--Philip Baird Shearer 19:11, 23 April 2007 (UTC)
- Am I correct in summarising your position as saying that most of these 100 or so articles on Czech cities and towns I mentioned should not use the squiggles in their title and that the editors responsible are ignoring Wikipedia conventions? Stefán 19:37, 23 April 2007 (UTC)
- I dot so that PBS can give his own answer; but to me this again seems a question of fact: what are the proportions? We should use Prague, not Praha; because it's common English usage. For the same reason, we should use the diacriticless spelling for Czech sports figures or other international figures for whom it is hardly ever used in English. The clearest example I know of here is a Pole, Stanislaw Ulam, whose autobiography consistently spells the name thus, not Stanisław. (I disregard the argument of a disruptive minority that Stanisław is somehow his real name, and we can use no other.) Septentrionalis PMAnderson
- On the other hand, we should, and do, use Edvard Beneš; English does. This scholar.google.com search on Nový Jičín suggests that it is normal practice to spell it with diacritics, and we should do so. (This page, however, suggests that actual Czechs, attempting to advertise in English, don't mind dropping the diacritics anywhere near as much as our Wikipedian nationalists do.) Where results are fairly evenly divided, the reasoning on this page would argue that we should not use diacritics, because they are inconvenient for readers and editors alike. Septentrionalis PMAnderson 20:45, 23 April 2007 (UTC)
- Well, the non-diacritics version seems also to be used. In any case, the point I was trying to make with this exercise is that in my opinion, these 100 or so articles would fall into the category If it is not clear what "common usage" is and with that definition of these categories, it certainly does not seem the current practice on Wikipedia to leave off the diacritics. Therefore, I cannot accept the argument that this proposed policy is somehow just stating the current practice. Stefán 21:04, 23 April 2007 (UTC)
- If there is no common English spelling then they can blaze the way. But if there is then the English spelling should be used. I would point out that if a town is not used in any English language publication it is unlikely that it is notable enough to have an article here on Wikipedia (OR). The Economist has an interesting take on this (see Economist Style Guide: Accents). It assumes that a person educated enough to read their missives should be expected to understand French, German, Spanish and Portuguese squiggles but not any others unless they are in italics (which in the Wikipedia name of an article is not relevant). --Philip Baird Shearer 21:18, 23 April 2007 (UTC)
- I think we can safely assume that any town which has a wikipedia article will have been mentioned in some English text. But sometimes those sources would not be the ones used to write the article. Also, if the practice on Wikipedia was along the lines of the Economist style guide, then so be it. I am just trying to point out that while this isn't the practice, then we cannot just write in a guideline that this should be the practice, unless there is some sort of an effort to demonstrate that this change of practice has consensus. Stefán 21:32, 23 April 2007 (UTC)
- This is not a change of practice; it is what we actually do. The only major difference between this page and WP:NAME is the default condition on ties, and that seems amply justified by the inconvenience of diacritics to most of our audience. Septentrionalis PMAnderson 03:45, 24 April 2007 (UTC)
- In my opinion, what you describe as the only major difference represents a massive shift in Wikipedia practice. You can of course argue that such a shift would be beneficial but my point is that you should be required to justify that such a shift has great support among editors. Stefán 04:48, 24 April 2007 (UTC)
- An article on something Polish (for example) is often of interest to Polish editors and some native English speakers. So when there is a debate on the name, Polish editors naturally feel more comfortable with Polish squiggles, and as they make up the majority of editors of that article the article gets squiggles. However, that is not optimising the name for readers over editors, and it does not mean that the majority of editors who have accounts on Wikipedia agree with the article name having squiggles. I suspect it is just that they do not have the page on their watch list and have better things to do on Wikipedia than argue over such matters. The question that foreign editors should be asking themselves is what name most native English speakers who read this article would prefer, not what name they and their fellow non-native editors prefer. But back to this article: I think that the guidelines already in place cover all this, so this proposed guideline is largely unnecessary. --Philip Baird Shearer 08:45, 24 April 2007 (UTC)
- Wikipedia very strongly encourages readers to become editors. As a result, we can expect that the group of people who edit an article will resemble the group of people who read the article. Thus our article on Łódź should try to fulfill the needs of the people who read the article; we don't need to pay so much attention to people who are not interested in the article at all. Also, there is no requirement for readers of the English Wikipedia to be native English speakers. Like it or not, English is a very important second language and the English Wikipedia is a valuable resource to a great number of readers who are not native English speakers. Then I would simply repeat the first sentence of this reply, but also note that it is far from true that it is only foreigners who prefer the inclusion of diacritics. Stefán 17:04, 24 April 2007 (UTC)
- When I checked, Łódź had been edited 501 times, of which, on average, each editor had made about two edits. 112 of those editors were IP addresses so I do not know the exact numbers, but I would hope (for the sake of those 200+ editors and vandals) that far more than 200+ people have viewed the page! Also, in reply to your comments on readers for whom English is a second language: I would have thought that for readers whose native language is not English (and therefore who may not be very familiar with the Latin alphabet), to impose on them words which use letters over and above those usually needed to read English is an unnecessary burden - unless the word is usually written that way in English texts. --Philip Baird Shearer 18:20, 24 April 2007 (UTC)
- I believe Stefan's point was that the editors will be representative of the readers, not that only the editors will read the article. Also, the problem of "burden" is over-stated. The use of accents is generally far simpler for a non-native speaker of English to understand than the remainder of English orthography, which we have no qualms about inflicting upon them. The point remains that the vast majority of editors (and by extension readers) are entirely comfortable with using accented characters and ligatures where appropriate, which is what this discussion is really about. --Stemonitis 18:39, 24 April 2007 (UTC)
- And this is not germane; this Wikipedia is intended for editors who choose to read English, primarily native speakers. Anyone who has trouble reading English will choose some other Wikipedia to begin with; there are plenty. Septentrionalis PMAnderson 20:09, 24 April 2007 (UTC)
- How do you extrapolate that the readers are a superset of the editors? Surely a higher proportion of the editors who edit Łódź are more likely to be Polish than those who edit Lord's Cricket Ground? However, surely most Polish readers will choose to read not ? --Philip Baird Shearer 19:39, 24 April 2007 (UTC)