Wikipedia:Naming conventions (Tibetan)/proposal 2

Proposed naming convention

1. Determine what the common, widely used name of the subject is, and use that.
2. If step one proves infeasible or requires an inordinate amount of guesswork, then use the system of phonetic romanisation described below, which represents the pronunciation of Lhasa and its near environs.
2a. Possible exception: if it can be established that a name in some other Tibetan dialect would be more appropriate, and we know, verifiably, what the relevant pronunciation is, use a spelling based on the local pronunciation (no examples spring to mind, however).

The gist of the above is that the convention calls for conventional names to be used when identifiable (Tenzin Gyatso, Shigatse, Lhasa, etc.) and that the proposed romanisation system is used as a fallback when no conventional name can be identified (including quotations of ordinary words and phrases in Tibetan).
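
The decision order above can be sketched in a few lines of Python. This is only an illustration of the precedence between the steps: the function name and its parameters are hypothetical, and deciding whether a conventional name is "identifiable" remains an editorial judgement that no code can make.

```python
from typing import Optional


def choose_article_name(
    conventional: Optional[str],
    dialect_spelling: Optional[str],
    systematic: str,
) -> str:
    """Pick a name in the order the proposed convention describes.

    All three arguments are supplied by the editor; this hypothetical
    helper only encodes which one takes precedence.
    """
    # Step 1: an identifiable conventional name always wins.
    if conventional:
        return conventional
    # Possible exception: a verifiable spelling based on another dialect.
    if dialect_spelling:
        return dialect_spelling
    # Step 2: fall back on the systematic, Lhasa-based romanisation.
    return systematic


print(choose_article_name("Shigatse", None, "Shigadze"))
print(choose_article_name(None, None, "Buṣang"))
```

The first call returns the conventional name; the second falls through to the systematic spelling because nothing else was supplied.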

An unanswered question

Considering that many Tibetan names are commonly used by many different people (just like English names such as Mary, John, or Jones), should each name word have an accepted spelling, rather than each person's name? For instance, many personal and place names include the word bkra-shis, which is often romanised as "tashi" (Tashilhunpo, Gonpo Tashi, Tashi Namgyal). However, I doubt any systematic spelling would produce the spelling "tashi" for this word. Suppose there is an obscure, 19th-century person named Bkra-shis; if we stipulate that there is no accepted spelling of his name (i.e. no accepted spelling for this word in reference to him), should it nevertheless be spelled "Tashi", or should it conform to the system?

Goals of a romanisation system

1. It should represent accurate phonology. This requires that some kind of standard target pronunciation is available; in this case, I have assumed the standard described by Tournadre and Sangda Dorje in Manual of Standard Tibetan. A romanisation system is a "practical orthography"; it is practical in the sense that it uses a relatively small number of characters, compared to the International Phonetic Alphabet. However, an ideal romanisation system will still include all or almost all of the information which would appear in an IPA transcription.

2. It should imply accurate pronunciation to readers in the target language. This goal will, in theory, never be fully realised, since a reader encountering an unfamiliar word from a foreign language is unlikely to pronounce it exactly as a native speaker of that language would, no matter how it is spelled. However, if this goal were ignored completely, we could simply spell everything with random letters: for instance, "Lhasa" might be spelled "Nqvq". An ideal romanisation system will allow a reader unfamiliar with the contours of the language at least a fighting chance at guessing a pronunciation that would be somewhat intelligible to the people who speak the language.

3. It should match as much as possible the existing conventional spellings that readers are likely to have encountered. The best case is one where the spelling derived from the system is identical to the conventional spelling. This is clearly not possible in every case (especially since some names have multiple conventional spellings), but it is preferable to keep the differences relatively minor, so that readers can more easily recognise the same word in two different spellings, or recognise that two words have similar pronunciations.

3a. A special subset of the previous which is relevant in this case is that it is preferable to maintain continuity with Wylie spellings. Wylie transliteration serves a different purpose than a system of phonetic spelling, and it is unlikely that Wylie is going anywhere (that seems to be about the only thing that Tibetologists can agree on, although one does occasionally see something else in use; for instance, I recently read a paper in which the author used a system based on Sanskrit IAST romanisation, writing, for example, "Rñiṅ-ma" instead of "Rnying-ma"). The Tournadre simplified system^[1] is the exemplar of this goal: it is basically just Wylie with silent letters removed and with umlauts and aspiration indicated where necessary.

Consonants and vowels

Note: examples below show what the spellings of the words would be if they were all written according to the proposed system, although I am not actually suggesting that they be written that way if there is an established conventional spelling.

proposed	IPA	Tibetan Pinyin	Tournadre	Wylie equivalents	Example
a	a	a	a	a	Lama
e	e	ê	e	e	Dorje
ê	ε	ai	ä	ad, al, as, a’i	Drêbung
i	i	i	i	i	Nyingtri
o	o	o	o	o	Chamdo
ö	ø	oi	ö	od, ol, os, o’i	Pö’
u	u	u	u	u	Nagchu
ü	y	ü	ü	ud, ul, us, u’i	Gendün
k	kʰ	k	kh, gh	kh, khw, mkh, ’kh, g, gw	Kênbo
g	k^[2]	g	k, g	k, rk, lk, sk, kw, dk, bk, brk, bsk, rg, sg, dg, bg, brg, bsg, lg, mg, ’g	Gelug
ky	cʰ	ky	khy, ghy	khy, mkhy, ’khy, gy	Tilgo Kyêndze
gy	c	gy	ky, gy	ky, rky, lky, sky, dky, bky, brky, bsky, dgy, bgy, brgy, bsgy, mgy, ’gy	Gyandze
ng	ŋ	ng	ng	rng, lng, sng, dng, brng, bsng, mng, ng	Ngari
ch	tɕʰ	q	ch, jh	ch, mch, ’ch, j, phy, ’phy, by	Chögyi Nyima
j	tɕ	j	c, j	c, cw, gc, bc, lca, py, dpy, rj, gj, brj, dby, lj, mj, ’j, ’by	Jênrêsig
ny	ɲ	ny	ny	rny, sny, gny, brny, bsny, mny, nyw, ny, my	Nyingma
t	tʰ	t	th, dh	th, mth, ’th, d, dw	Tubdên Gyatso
d	t	d	t, d	t, rt, lt, st, tw, gt, bt, brt, blt, bst, bld, rd, sd, gd, bd, brd, bsd, zl, bzl, ld, md, ’d	Dêndzin Gyatso
tr	ʈʰ	ch	thr, dhr	khr, thr, phr, mkhr, ’khr, ’phr, gr, dr, br, grw	Chögyam Trungba
dr	ʈ	zh	tr, dr	kr, tr, pr, dkr, dpr, bkr, bskr, bsr, dgr, dbr, bsgr, sbr, mgr, ’gr, ’dr, ’br	Drashilhünbo
n	n	n	n	rn, sn, gn, brn, bsn, mn, n	Nênang Bawo
p	pʰ	p	ph, bh	ph, ’ph, b	Tsurpu
b	p	b	p, b	p, sp, dp, lp, rb, sb, db, sbr, lb, ’b	Bênchen Lama
m	m	m	m	rm, sm, dm, m, mr	Milarêba
ts	tsʰ	c	tsh, dzh	tsh, tshw, mtsh, ’tsh, dz	Namtso
dz	ts	z	ts, dz	ts, rts, sts, tsw, gts, bts, brts, bsts, rdz, gdz, brdz, mdz, ’dz	Dzongkaba
w	w	w	w	w, db	Wangchug
y	j	y	y	g.y, y	Yangbajên
r	r	r	r	r, rw	Rangjung Dorje
l	l	l	l	kl, gl, bl, rl, sl, brl, bsl, l, lw	Litang
lh	l̥	lh	lh	lh	Lhasa
sh	ɕ	x	sh, zh	sh, shw, gsh, bsh, zh, zhw, gzh, bzh	Shigadze
s	s	s	s, z	s, sr, sw, gs, bs, bsr, z, zw, gz, bz	Sera
ṣ	ʂ	sh	rh	hr	Buṣang
h	h	h	h	h, hw	Hor
’ (g)^[3]	ʔ	0 (g)	0 (k)	d, s (g, gs, k)	Trinlê’
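
Because the chart is many-to-one (several Wylie onsets collapse into one proposed letter), it can be read as a lookup table. The sketch below copies only four rows from the chart and inverts them; the names ROWS, WYLIE_TO_PROPOSED and proposed_onset are hypothetical, and a complete converter would also need the remaining rows, the vowel and final-consonant rules, and case-by-case decisions that the chart alone does not settle.

```python
# A few rows copied from the chart: proposed letter -> Wylie onsets.
ROWS = {
    "k": ["kh", "khw", "mkh", "’kh", "g", "gw"],
    "ts": ["tsh", "tshw", "mtsh", "’tsh", "dz"],
    "sh": ["sh", "shw", "gsh", "bsh", "zh", "zhw", "gzh", "bzh"],
    "lh": ["lh"],
}

# Invert the rows so that each Wylie onset points at its proposed spelling.
WYLIE_TO_PROPOSED = {
    wylie: proposed for proposed, onsets in ROWS.items() for wylie in onsets
}


def proposed_onset(wylie_onset: str) -> str:
    """Return the proposed spelling of a word-initial Wylie onset.

    Raises KeyError for onsets outside the rows copied above.
    """
    return WYLIE_TO_PROPOSED[wylie_onset]


print(proposed_onset("gzh"))   # onset of Gzhis-ka-rtse
print(proposed_onset("mtsh"))  # onset of mtsho
```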

Tones

Central Tibetan has two basic tones: low and high. In one-syllable words, each tone has two types (depending on which consonant, if any, appears at the end of the word), for a total of four tones. However, this additional distinction is, with rare exceptions, unnecessary to understand the word, so it is not usually necessary to indicate it in writing. We can write the high tone with an acute accent (á) and the low tone with a grave accent (à). Since tone is a relatively minor part of the Tibetan sound system, I anticipate that there will be many cases where we prefer not to include the tone marks at all, and this should be considered an acceptable variation.

The only exceptional cases where four tones need to be distinguished are cases where a word-final "m" or "ng" sound is followed by an "implicit" [ʔ], meaning that the [ʔ] is not actually pronounced, but it nevertheless affects the pronunciation of the tone. For instance, Khams (the Tibetan province) and kham ("piece") are both pronounced kʰam with a high tone, but Khams has a silent, implicit ʔ sound on the end, which results in these two words having tones that sound different. Therefore, to be entirely correct, when one includes tone marks, Khams would be written as Kám’ ... but, without tone marks, it is simply Kam. In this case, the "’" is a tone mark, not a consonant.
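
Since omitting tone marks is meant to be an acceptable variation, the unmarked spelling can be derived mechanically from the marked one. The following is a minimal sketch, assuming tones are written with acute and grave accents as described above; the function name is hypothetical, and the rule for ’ treats it as a tone mark only after word-final m or ng, which is my reading of the Khams example.

```python
import re
import unicodedata

ACUTE = "\u0301"  # combining acute accent: high tone
GRAVE = "\u0300"  # combining grave accent: low tone


def strip_tone_marks(text: str) -> str:
    """Remove tone accents while keeping ê, ö, ü and other diacritics."""
    # Decompose so that the tone accents become separate combining
    # characters; the circumflex and diaeresis are separate ones too.
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in (ACUTE, GRAVE))
    recomposed = unicodedata.normalize("NFC", kept)
    # After word-final m or ng, ’ marks tone rather than a consonant,
    # so it is dropped along with the accents.
    return re.sub(r"(m|ng)’(?!\w)", r"\1", recomposed)


print(strip_tone_marks("Túbdên Gyàtso"))
print(strip_tone_marks("Kám’"))
```

A ’ that stands for a real glottal stop, as in Trinlê’, is left alone because it does not follow m or ng.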

Vowel length

Vowel length is marginally phonemic in Central Tibetan. The sounds [l] and [r], when appearing at the end of a syllable, cause the preceding vowel to be lengthened (often the [l] or [r] is not actually pronounced at all). According to Tournadre and Sangda Dorje, the only other common situation in which long vowels occur is when a word ending in a vowel is suffixed with ’i (འི, a common grammatical suffix). In these cases, we can indicate the long vowel simply by including the i after it; for example, Ràngjung Rìgbêi Dòrje or Bếndên Dếnbêi Nyìma. In any other instance where long vowels occur (there are sometimes long vowels in Sanskrit loan words, although it is unclear to me whether the distinction is ever actually pronounced), we should indicate the length using a macron.

Notes

"floating nasal" sounds should be added when they are commonly pronounced (e.g. the first n in Gendün or the first m in Gumbum [Kumbum]). There is a fairly consistent pattern to the appearance of these sounds, but it must nevertheless be decided case-by-case on the basis of actual pronunciation. Err on the side of not including additional sounds.
With regard to the voiced stop and affricate consonants of ancient Tibetan (viz. b, d, dz, g, gy, and j), the chart only refers to their use as the first sound in a word. When these sounds occur in non-initial syllables, they should always be transcribed b, d, dz, g, gy, and j, respectively; never p, t, ts, etc. This is because, according to Tournadre, p and pʰ, etc. are actually pronounced the same in non-initial syllables, so we should show etymology instead. Thus, Chu-bzhi-sgang-drug is "Chushi Gangdrug", not "Chushi Gangtrug" and Mkhas-grub is "Kêdrub’", not "Kêtrub’".
Syllable splitting: the use of digraphs works for modern Tibetan because there are quite few consonant clusters, and only a small range of consonants can appear last in a syllable. There will be a few cases where the syllable break needs to be shown in order to clarify pronunciation. This should be done by inserting a hyphen. Without a hyphen, assume that:
- "ng" between two vowels means the sounds [n] and [g] separately. Thus "Trangu" = "Tran-gu". If "Tra-ngu" or "Trang-u" is intended, it must be written out with the hyphen. Likewise for "ny".
- "gy" is always the digraph "gy", unless otherwise indicated. Thus, "Sagya" = "Sa-gya". If "Sag-ya" is intended, it must be written out with the hyphen.
Known irregular pronunciations should be taken into account, provided that they are valid in terms of standard pronunciation. For instance, the famous monastery (and jargon term) bla-brang should be pronounced [laʈaŋ] according to regular sound changes from classical Tibetan to modern, but instead it seems to be pronounced [lapraŋ] in the Lhasa dialect. Therefore, the spelling "Labrang" is appropriate.
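
The two default readings given under "Syllable splitting" can be made explicit with a small sketch. It assumes lowercase vowels without tone marks, handles only the "ng"/"ny" and "gy" cases named above, and leaves any word that already contains a hyphen untouched; the function name and the vowel list are hypothetical.

```python
import re

# Vowel letters of the proposed system (tone marks assumed absent).
VOWELS = "aeiouêöü"


def show_default_breaks(word: str) -> str:
    """Insert the syllable break a reader should assume by default."""
    if "-" in word:
        # An explicit hyphen overrides the defaults.
        return word
    # "ng" or "ny" between vowels: n closes the first syllable.
    word = re.sub(rf"(?<=[{VOWELS}])n(?=[gy][{VOWELS}])", "n-", word)
    # "gy" between vowels: always the digraph opening the next syllable.
    word = re.sub(rf"(?<=[{VOWELS}])gy(?=[{VOWELS}])", "-gy", word)
    return word


print(show_default_breaks("Trangu"))
print(show_default_breaks("Sagya"))
print(show_default_breaks("Tra-ngu"))
```

The third call shows that a spelling with an explicit hyphen, such as "Tra-ngu", passes through unchanged.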

Controversial points

Use of the Latin alphabet's "voiced stop/affricates" letters to represent Tibetan's unaspirated stops and affricates; and the corresponding use of the Latin alphabet's "voiceless stop/affricates" letters to represent Tibetan's aspirated stops and affricates. This will probably be the most conspicuous difference between the proposed systematic spelling and most of the conventional spellings, because, in Wylie, some of these letters are put to a different purpose and this tends to carry over into the conventional spellings.

The important point here is that classical Tibetan (which Wylie describes) had a different system of stops and affricates than modern Lhasa dialect does. Classical Tibetan had a three-way system with "voiceless, aspirated", "voiceless, unaspirated", and "voiced" sounds, as shown on this table below:

Voiceless, aspirated	Voiceless, unaspirated	Voiced
kh	k	g
khy	ky	gy
th	t	d
ph	p	b
ch	c	j
tsh	ts	dz

Modern Tibetan, however, simplifies this to a two-way system, distinguishing only between "aspirated" and "unaspirated" consonants (the unaspirated consonants are sometimes voiced), as shown here:

Aspirated	Unaspirated
k	g
ky	gy
t	d
p	b
ch	j
ts	dz

All European languages (except for Romany) have a two-way system like modern Tibetan. This is why the Roman alphabet, which was devised to write Latin, has a different letter for "p" and "b", but not for "ph" (ancient Greek did have a three-way system, which is why there are triads of letters such as β, π, φ, although in modern Greek these letters are no longer pronounced according to this pattern).

This proposal takes our alphabet's "voiceless" consonants k, t, and p (ch being a bit of an exception), which represent the "voiceless, unaspirated" sounds in Wylie, and uses them to represent the "aspirated" sounds of modern Tibetan. There are a few reasons for this. First, Wylie's "th", "ph", and "tsh" are among the most readily mispronounceable of its spellings (granted, "th" is only a problem for English or Albanian speakers, as well as those who are habituated to English orthography, but "ph" is used for [f] in a variety of European languages, and in Vietnamese). "Kh" is not particularly problematic, but I don't see any reason to preserve it when the "kh" - "th" - "ph" pattern is removed.

Wylie	proposed	Tournadre/ Sangda Dorje phonetic notes
kh	k	kh
th	t	th
ph	p	ph
k, g	g	k
t, d	d	t
p, b	b	p

In the phonetic transcriptions in their book, Tournadre and Sangda Dorje use "kh" - "th" - "ph" and "k" - "t" - "p" instead of "k" - "t" - "p" / "g" - "d" - "b", which means they closely follow the Wylie spelling (see the table above). However, in the context of (in particular) English, using "k", "t", "p" to represent an unaspirated [k], [t], [p] is just plain wrong, and will make it harder for our readers to guess the correct pronunciation (this is why English speakers find pre-Pinyin spellings of Chinese to be "wrong": for instance, "t" in tao is pronounced as a "d"; this is the same sound as the "t" in potala or terton). I suspect that using these letters for these sounds will also seem wrong to speakers of German and other Germanic languages, and it will definitely seem wrong to any Chinese people who are familiar with pinyin (i.e. everyone in the mainland under the age of, say, 40). I am told that the Romance languages tend to use much less aspiration than English does, so "k" - "t" - "p" for unaspirated sounds might make more sense to French, Spanish, etc. speakers (Tournadre is French). However, I'm dubious about the idea that this spelling would be particularly preferable for Romance speakers, since, even if they have voiceless unaspirated sounds, they do not contrast them with aspirated sounds (i.e., they have [t] but not [tʰ]). As far as I know, out of the major languages which are written with the Roman alphabet, only Vietnamese makes that distinction.

In addition, since modern Tibetan has a two-way stop system and the Latin alphabet provides us with two letters for each case, it seems uneconomical not to use them. "k" - "t" - "p" / "g" - "d" - "b" gives us one sound per phoneme (which is preferable, although this proposal doesn't give a lot of weight to that goal overall). Using "ky" instead of "khy", "ts" instead of "tsh", and "tr" instead of "thr" avoids unnecessary trigraphs.
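
The correspondence between the Tournadre/Sangda Dorje phonetic notation and the proposed letters for the plain stops can be sketched as a single-pass substitution. This is an illustration only: the names are hypothetical, the input is assumed to be lowercase, and the mapping covers just the six stop spellings from the table above, so it would mangle "ts"/"tsh" and the other affricates if run over whole words that contain them.

```python
import re

# Tournadre/Sangda Dorje phonetic notation -> proposed spelling (stops only).
TOURNADRE_TO_PROPOSED = {
    "kh": "k", "th": "t", "ph": "p",  # aspirated
    "k": "g", "t": "d", "p": "b",     # unaspirated
}

# Longest alternatives first, so "kh" is matched before "k".
_PATTERN = re.compile(
    "|".join(sorted(TOURNADRE_TO_PROPOSED, key=len, reverse=True))
)


def respell_stops(text: str) -> str:
    """Respell stop consonants in one pass over the text."""
    return _PATTERN.sub(lambda m: TOURNADRE_TO_PROPOSED[m.group(0)], text)


print(respell_stops("thuptän"))
```

Doing the replacement in one pass matters: applying the rules one after another would first turn "kh" into "k" and then that "k" into "g".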

Lastly, this usage matches the way these sounds are treated in Tibetan Pinyin, which enhances interoperability between the two.

Use of "ê" for the ε sound. I must admit that I have not seen this exact letter used for this Tibetan sound elsewhere. Since, by etymology, it was originally an a sound, Tournadre and Tibetan Pinyin prefer to use "ä" and "ai", respectively. However, in conventional spellings, it seems to me that this sound almost always becomes simply "e" (except that, at the end of words, it is often "ey"). For instance, the conventional spellings "Drepung", "Tenzin", "Thubten", "Gyaltsen" all indicate this sound; basically, they do not distinguish between this and the [e] sound that appears in words like "Gendün" and "Dorje". In order to stay closer to the conventional spellings while still distinguishing these sounds, we can use e with a diacritic. Note that this is exactly the same sound that "ê" represents in French.

Use of "dr" and "tr" for [ʈ] and [ʈʰ] (the retroflex stops), respectively. This sound (taking "tr" as the example for convenience) seems to vary back and forth in conventional spellings between "t" and "tr". For instance, "Trinley", "Ganden Tripa", "Tashilhunpo", "tulku", "tashi delek", etc. are all words that begin with the retroflex stop. In fact, this difference in spelling reflects a difference in pronunciation. Roughly speaking, a relatively prestigious Lhasa accent will make this sound an affricate instead of a stop: [ʈʂ] (the same difference as between "t" and "ts"), whereas in the surrounding areas and in lower-class Lhasa accents, the sound is not affricated. Tournadre and Sangda Dorje describe it as "lightly affricated". I think that the English pronunciation of "tr" in words like "tree" roughly approximates the affricate sound [ʈʂʰ], but doesn't approximate the stop sound [ʈʰ] very much. The problem is, no other sound in English does either. The closest would be "t" or "d", which is why we do see spellings (Tashilhunpo, etc.) that use those letters. Of course, these letters are in use for the normal "t" and "d" sounds (as in Tubdên or Dorje). The closest alternative would be to use the dot-below mark, which indicates this sound in Sanskrit IAST. For example, "tashi delek" would be "ḍashi’ deleg" (instead of "drashi’ deleg"), "Trinley" would be "Ṭinlê’" (instead of "Trinlê’"), "tulku" would be "ḍülgu" (instead of "drülgu"), "Chushi Gangdruk" would be "Chushi Gangḍug" (instead of "Chushi Gangdrug"), etc. However, I decided to suggest the "tr" spelling instead, because 1) I think it is more common in conventional spellings (even though there are a few particularly prominent words with conventional spellings that use the "t" spelling); 2) it reduces the number of special characters needed to spell according to the system; 3) it shows etymology (all of the equivalent Wylie spellings have "r" in them, and this r was presumably pronounced historically).

A related point, although perhaps not quite a controversy, is how to spell the sound [ʂ] (the retroflex sibilant). This sound (which developed from the sequence "hr" in classical Tibetan) seems to occur quite rarely in modern Tibetan (the only occurrence I've found on Wikipedia so far is in the place name "Pu-hrang" mentioned in Guge; the article simply spells it according to the Wylie system). Tournadre uses "rh" for this sound, but this is perhaps an awkward choice, since it could imply some sort of approximant sound or a voiced sound, both of which would be incorrect. Zangwen Pinyin uses "sh" for this sound ([ʂ] is the same sound heard in Chinese words such as "Shanghai"), which is a fine choice except that this proposal already uses "sh" to represent [ɕ]. Granted, it doesn't matter a lot how this is written, since it comes up so rarely, but I have suggested "ṣ" (s-with-dot-below), which is how this sound is represented in Sanskrit IAST.

Examples

Wylie	proposed	Tibetan Pinyin	Tournadre	other transcriptions
Gzhis-ka-rtse	Shigadze (Shìgadze)	Xigazê	Zhikatsé	Shigatse, Shikatse
Bkra-shis-lhun-po	Drashilhünbo (Dràshilhünbo)	Zhaxilhünbo	Trashilhünpo	Tashilhunpo, Tashilhümpo, etc.
’Bras-spung	Drêbung (Drềbung)	Zhaibung	Dräpung	Drepung
Chos-kyi Rgyal-mtshan	Chögyi Gyêltsên^[4] (Chö́gyi Gyềltsên)	Qoigyi Gyaicain	Chökyi Gyeltshen	Choekyi Gyaltsen
Thub-bstan Rgya-mtsho	Tubdên Gyatso (Túbdên Gyàtso)	Tubdain Gyaco	Thuptän Gyatsho	Thubten Gyatso, Thubtan Gyatso, Thupten Gyatso
Bya-bral Rin-po-che	Chadrêl Rinboche (Chàdrêl Rìnboche)	Qatrail(?) Rinboqê	Jhadräl Rinpoche	Jadrel, Jadral, Jatral, Chadrel, Chadral, Chatrel, Chatral Rinpoche

^[1] Please note that, as I mentioned here, Tournadre uses two different transcriptions in the same book. "Tournadre simplified" should be distinguished from the system used for phonetic notations in the body of Manual of Standard Tibetan.
^[2] With occasional exceptions in very formal speech, when k appears at the end of a syllable, it is pronounced ʔ. Nevertheless, it should be transcribed with g.
^[3] Parenthetical values reflect the final k, which, as noted above, is usually realised as ʔ. Note that Tournadre and Tibetan Pinyin do not indicate ʔ at all in other cases.
^[4] Possibly Chögyi Gyaltsên or Chögyi Gyêntsên, depending on pronunciation variations.
