The rɑɪt sɑʊnz?


Angus Grieve-Smith writes:

I was always taught that the most straightforward way to write American diphthongs is [aj] and [aw], and the "long" mid vowels as [e] and [o]. Recently I've been seeing [ɑɪ ɑʊ ɛɪ] and [ɔʊ] popping up.  This seems to reflect at least three different changes:

(1) A shift from using [j w ɰ] to represent glides, to representing diphthongs as a series of vowel sounds.
(2) A shift to greater detail in these representations.
(3) A shift in the standard from somewhere close to my dialect (Hudson Valley) to … someplace else.

Angus's point (1) is well taken, and closely related to his point (2).  "Rising" diphthongs in English — those that go from a more open mouth position to a more closed one, and therefore end "higher" than they began — aren't generally distinguished from one another in how high they go.  There's a distinction between the ones where the tongue goes to the front of the mouth (e.g. the vowels in standard American pronunciations of boy, buy, bey) and those where the tongue goes towards the back of the mouth (e.g. the vowels in standard American bough, [rain]bow).  But you don't generally see words distinguished (in a given variety of English) between degrees of height in the off-glide.  And across dialects, there's indeed a lot of variety in the detailed trajectories of these vowels. Also,  all of these are clearly on the vowel+glide side of the fence rather than vowel+vowel side. So why not stick with the IPA's palatal approximant /j/ and its labio-velar approximant /w/ (i.e. the consonant-side versions of the vowels /i/ and /u/), and let the phonetic details take care of themselves?

My guess is that there are two reasons:

(a) Angus's point (3) is misleading — he doesn't really pronounce words like "buy" the way he thinks he does.
(b) The IPA's choice of 'j' to represent a palatal approximant (= high front glide) is problematic for Anglophones, for whom 'j' is strongly identified as the voiced palato-alveolar affricate in joy or jerk.

Point (a) first. (And I'll limit the discussion here to just one of the cases that he brings up.)  I don't think I've ever heard an American speaker who really has the glide version of IPA [i] at the end of words like buy.  You can work out what that would be like by saying bee to yourself, and then saying buy so that it ends with the same sound.

Americans stop in lots of (vowel-space) places when they say a word like buy, ranging from Southerners who have a simple monophthong that barely budges from its low back nucleus ("bah"), to others who take it forward to (the low front vowel of) bad but not really up much at all, to those who go forward and then up to (the vowel of) bed or bid.  But nobody goes all the way to (the high front vowel of) bead.  (The vowel nucleus also varies, but never mind that for now.)

Here are the pronunciations of baa, bid, bead from the online Merriam-Webster dictionary site:


[Audio clips: baa, bid, bead]

I've plotted them on a standard -F2 / -F1 formant plot, in which vowel height runs bottom-to-top on the y axis, and the front/back dimension runs left-to-right on the x axis. In order to reduce each of these to a single value (plotted at the center of each text string), I've taken the point (near the center of the syllable) where F1 is maximum. I've added ooh and bee as additional points of reference.
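The reduction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`f1_max_frame`, `plot_coords`) and the formant values are hypothetical and were not measured from the Merriam-Webster clips, and the formant tracks are assumed to come from some external analysis tool.

```python
from typing import List, Tuple

# One analysis frame: (time in seconds, F1 in Hz, F2 in Hz).
Frame = Tuple[float, float, float]


def f1_max_frame(track: List[Frame]) -> Frame:
    """Return the frame of a formant track where F1 is at its maximum."""
    return max(track, key=lambda frame: frame[1])


def plot_coords(f1: float, f2: float) -> Tuple[float, float]:
    """Map (F1, F2) to (x, y) = (-F2, -F1).

    Negating both formants puts front vowels (high F2) on the left
    and high vowels (low F1) at the top of the plot.
    """
    return (-f2, -f1)


# Hypothetical tracks with made-up values, for illustration only.
bead_like = [(0.05, 300.0, 2200.0), (0.10, 320.0, 2300.0), (0.15, 310.0, 2250.0)]
baa_like = [(0.05, 700.0, 1100.0), (0.10, 750.0, 1150.0), (0.15, 720.0, 1120.0)]

for label, track in [("bead-like", bead_like), ("baa-like", baa_like)]:
    _, f1, f2 = f1_max_frame(track)
    print(label, plot_coords(f1, f2))
```

With these invented numbers the bead-like point lands to the left of and above the baa-like point, which is the layout the plot description implies.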

Against this background, I've plotted the whole diphthongal trajectory of the M-W pronunciation of buy:

[Audio clip: buy]

You can see that it peters out quite a bit short of the vowel of bee — and even a bit short of bid. Here's the trajectory of bide, from (I think) the same speaker, which does get close to bid:

[Audio clip: bide]

There's one little wrinkle here. Many American speakers — including this guy on the MW site — have raising and fronting of this vowel-class before voiceless consonants. Thus bite (OK, really bight — MW has bite read by a female speaker, whose vowel space would be confusingly different):

[Audio clip: bight]

You can see that the nucleus is a bit higher, and the glide in this case goes all the way to bee, even though the vowel as a whole is quite a bit shorter. (See "Raising and lowering those tighty whities", 3/20/2005, and "Trajectory of 'long i'", phonoloblog, 8/25/2004, for some further discussion.)

I'm not sure how Angus talks, but my guess is that his buy and bide are pretty much like this speaker's are, and his bite will either be raised and fronted (like the example above) or will be closer to the trajectory of his bide.

So to sum up: IPA [ɪ] as in bid is a reasonable approximation to the offglide in words like buy and bide. But given the variation across dialects and contexts in the extent of the offglide, why not just call it (some implementation of) /j/, as Angus would prefer? Why the [ɪ]-ward movement that he notes:

When it was just on (credited to Random House and Collins) I thought it was just one editor's quirk, but it's in the ESL text I'm using this semester (Grant 2010), and now in the new edition of Yule's Study of Language (2010), and Language Files (2007).  What gives?  Does anyone know what's going on?  Did this happen gradually by people copying each other, was it argued for in a paper, or was it decided by some LSA committee that I never heard about?  Is that level of detail really necessary in dictionaries, ESL texts and introductory linguistics books, and how do we know that it is?

My best guess is that /j/ is just too annoying/confusing for Anglophones, who are already traditionally resistant to IPA in dictionary and other non-specialist publications.  The fact that [j] is wrong as a matter of phonetic detail (if you think of it as a glide equivalent to [i]) offers an excuse to avoid this obnoxious feature of the IPA. That's certainly how I feel about it.  But perhaps we'll hear from some people who know something about the editorial policies of the sources that Angus cites.

(Angus also made some interesting observations about the treatment of both the nucleus and the off-glide in the vowels of words like bough and [rain]bow, but I'll leave that for another time.)


  1. D.O. said,

    October 2, 2010 @ 5:48 pm

    First, thank you for the post. Now criticisms from a complete non-linguist. First, there must be a way to reverse axes in R. Seeing negative frequencies is much more weird than /j/ sound (well, I'm not a native Anglophone, but still). Second, there was some discussion on LL a while back about benefits to learn IPA for, well, almost everyone. If such a small thing as unusual rendering of /j/ sound is a major annoyance then just throw out the whole concept.

    [(myl) I shouldn't have given the impression that IPA 'j' is a major source of serious unhappiness to serious people. However, it's one of the things about IPA that tends to confuse anglophone beginners, and so (I hypothesize) a decision that avoids using it increases the readability of transcriptions in applications like dictionaries.

    But this may not really be a factor in the trend that Angus pointed to — there's no comparable problem with 'w', which also seems to be losing ground to /ʊ/ [and, as various commenters observe, /ɪ/ and /ʊ/ are traditional in British dictionaries, which have been using IPA for decades, whereas American dictionaries still mostly use their own ad hoc systems].

    As for reversing axes in R, I've fixed it. Sorry for your pain. Decisions of the International Phonetic Association are harder to change (not that there's an obvious fix in this case).]

  2. Lazar said,

    October 2, 2010 @ 6:07 pm

    "A shift from using [j w ɰ] to represent glides, to representing diphthongs as a series of vowel sounds."

    I'm a bit confused – to my knowledge, [j] and [w] have never been standard representations for offglides in any language, and diphthongs have always been represented as series of vowel sounds. Whether in English, German, or the Romance languages, I always see falling diphthongs transcribed, for example, as [aɪ] or [ai]. When I've seen Americans using transcriptions like [aj], I've always written it off as some non-standard parochialism along the lines of using [č] for [tʃ].

    [(myl) I'm used to seeing some kind of differentiation between vowels and glides, sometimes by representing the glides as approximants, sometimes by using the non-syllabic diacritic (e.g. i̯), and sometimes with a smaller superscript character used to indicate an off-glide or on-glide. I've always thought that the representation of diphthongs or triphthongs as vowel sequences was a typographically convenient approximation, used in cases where there was no ambiguity. But I'll confess to not being plugged into IPA "best practices"; my own feeling is that phonetic detail is usually best represented with acoustic measures, especially in the case of vowels, where we have a pretty good perceptual proxy in the form of formant frequencies.

    As for the decision about whether to treat affricates as sequences of stops and fricatives, I gather that's an old and bitter argument which is best left in its current fitful slumber.]

  3. Randy Alexander said,

    October 2, 2010 @ 7:10 pm

    Mark, I'm very glad you brought this subject up, as it's bothered me for years, and I've not seen anything written about it anywhere.

    Some points:

    1) There is almost nothing in the Handbook of the International Phonetic Association about diphthong or glide transcription. Actually my "almost" is just in case I missed something; I'm pretty sure there's nothing at all.

    2) I'm 99% sure there is nothing in the Handbook that says that [j] is used as the end of a glide: it is a consonant, not a vowel.

    3) Pullum & Ladusaw's Phonetic Symbol Guide, which outlines current and historical use of the IPA symbols, doesn't mention anything about [j] being used as a glide endpoint (although, I think this is an oversight: it is in fact reasonably common in America, especially in ESL literature).

    Personally, I favor [i] and [u] over [ɪ] and [ʊ] for glide endpoints because they are potential endpoints; whether you reach them or not depends (in "standard" English) on how strong or weak (I believe Daniel Jones was the one who started using those terms, but all of my books have not been moved to my new residence, so I can't easily check) your utterance of a given word is.

    If you were given the task of modeling pronunciation of phonemes to a group of ESL students, or American children in a phonics (or whatever they're calling it these days) class, I believe you would tend toward [i] and [u] in your glides anyway, even though you don't reach those points in normal allegro speech.

    4) The point of the second symbol in a glide is to indicate the general direction of the glide, not where the glide stops (or should stop).

    The danger in using [ɪ] and [ʊ] instead of [i] and [u] (or even [j] and [w]) is that they could imply (and I think they do) that one should stop in those places even in strong utterances, which is quite misleading.

    If someone didn't understand your utterance of the word "buy", and you repeated it for them loudly and slowly as [ba:ɪ:], I imagine that you might have even more trouble getting your point across. I don't believe this would happen if you said [ba:i:].

    [(myl) It's true that IPA /j/ is a consonant rather than a vowel, but one plausible story about "glides" is that they are precisely vowels used in some sense as consonants. Certainly no one would have any qualms about using 'j' in transcribing an onset or (I think) an on-glide.

    As for using [i] and [u] to indicate the direction of rising diphthongs in English for second-language learners, this makes sense to me as long as they're clear that this doesn't mean always (or even often or perhaps ever) trying to actually reach these points.]

  4. Jarek Weckwerth said,

    October 2, 2010 @ 7:58 pm

    In the British EFL industry, IPA-like transcriptions using /aɪ/ and /aʊ/ are essentially the only option, and have been for at least thirty years (I think). OK, the Oxford Pronunciation Dictionary by Upton et al., and some other Oxford dictionaries have /ʌɪ/, but I don't think I've ever seen old-American-style /aj/ and /aw/ in any EFL materials this side of the pond. (As long as they use IPA-like transcription, that is.) And even in "professional" phonetic literature you don't see them too often these days. So I can see how Lazar (and, in all probability, a lot of other readers) will be incredulous of Angus's incredulity.

    But it's not the case, as Lazar says, that /j/ and /w/ are never used for offglides. They are used so e.g. in transcriptions of Polish, for morphophonological reasons (I believe). And I am pretty certain that there isn't too much difference between Polish /aj/ and what is shown as /ai/ in e.g. Spanish.

  5. dw said,

    October 2, 2010 @ 8:22 pm

    Are we talking about phonetic or phonemic transcription?

    For phonemic transcription, the first question is whether a vowel like that in "high" consists of one phoneme or two. Wells argues cogently that it is only one phoneme:

    If [we regard the vowel as a succession of two phonemes, then] the glide is readily identified with the /j/ of "yes", what is the preceding vowel? Depending on the accent we are investigating, there may be phonetic grounds for identifying it with the vowel of "palm", that of "strut", or that of "trap", but in some cases — in RP, for instance — it is phonetically different from all three of these, occupying an intermediate position.

    (Accents of English, i, 49) Google books link

    The question then becomes whether we wish to represent this single phoneme as a sequence of two vowel symbols, or as a vowel-consonant sequence. Perhaps it seems a bit weird to use a vowel-consonant sequence to represent a single phoneme.

    If, on the other hand, we're talking about a narrow phonetic transcription, then the use of a sequence of vowels [ai] etc. allows us more precision in identifying the endpoint of the glide than a vowel-approximant sequence [aj] does.

  6. Craig Perlman said,

    October 2, 2010 @ 8:27 pm

    I'd frankly be more interested in the use of [a] to head off [aɪ] in words like "right".

  7. Angus Grieve-Smith said,

    October 2, 2010 @ 8:33 pm

    Thank you so much for answering my query! If Karen Landahl were alive, she'd be very disappointed in me for making claims about my own speech without measuring it first.

    Here's a video where I speak mostly-unrehearsed for a little while starting at about 1:25. And you're right! My offglides are pretty lax, and they sound pretty close to my monophthongal /ɪ/s. (Sorry, there's probably too much background noise there for acoustic analysis.) In any case, please scratch observation #3.

    Back to observation #1, I had no idea that [j w ɰ] were only supposed to be used for onglides. Seems pretty arbitrary to me. I guess if we wanted to be consistent we would abolish those three and just use the vowel symbols. But then how do you show which one is the nucleus? In any case, maybe you're right that it's just easier than explaining [aj] to people from outside the Balkans.

    Randy Alexander and Dw's comments speak to my observation #2. Why would you want that level of detail for a dictionary model or an ESL or linguistics text, if academic linguists have to think about which one to use?

    Mark, what's the worst that could happen if an ESL student thought that they always had to go that high in their offglides? When you get to that level of detail, it's probably time to put the books away and just listen to the native speakers.

    Looking forward to more of these insightful comments!

  8. mgh said,

    October 2, 2010 @ 10:23 pm

    very basic question: can you recommend a site for someone with no linguistics background (me) to learn to read the symbols used here?

    posts like this always go over my head, sometimes more so than the ones on Chinese, which at least can provide a translation!

    [(myl) In addition to the resources cited in comments below, you might benefit from my lecture notes on the Pronunciation of English for introductory linguistics. ]

  9. Iulus said,

    October 2, 2010 @ 10:26 pm

    Has anyone ever seen it transcribed as [ae]?

  10. Iulus said,

    October 2, 2010 @ 10:32 pm

    @Mgh, here is a site with a flash of all the IPA symbols used for English with sounds
    This site has the whole IPA, but uses technical terms so you may get pretty lost:
    I'd recommend heading to the library and picking up a book on phonetics though. Introductory Phonology by Bruce Hayes does a pretty good job of explaining the terms in an easy to understand way, but there must be dozens of similar books out there.

  11. Dw said,

    October 2, 2010 @ 10:49 pm


    the wikipedia article on the International Phonetic Alphabet would be a good start

  12. D.O. said,

    October 2, 2010 @ 11:17 pm

    Sorry for your pain.

    I sincerely hope that this is a joke. I enjoyed reading this post so much. I even tried pronouncing various syllables myself in a doomed attempt to relate somehow to the discussion by observing where my tongue and lips move and what kind of effort my throat makes. And all this knowing full well that my pronunciation is, to put it mildly, hopelessly wrong.

  13. Angus Grieve-Smith said,

    October 2, 2010 @ 11:30 pm

    Also, Mgh, if you want to include IPA in your comments, the IPA Character Picker is an invaluable resource.

  14. Josh said,

    October 3, 2010 @ 12:10 am

    Is there any information on the pronunciation of these sounds during language acquisition? I've noticed my almost 2 year old daughter does go all the way to the full ee sound when she says the [aɪ] sound.

    hi -> hah-ee
    bye -> bah-ee

    I'm pretty sure neither my wife nor I stop on the ee sound, but we are definitely further towards this end of the spectrum. Is there a tendency for toddlers to over-analyze diphthongs and glides into their component monophthongs?

  15. Don Killian said,

    October 3, 2010 @ 1:01 am

    Regarding the use of IPA for English, I don't think there is always a lot of consistency. /æ/ tensing is a fairly common feature in the Northeast of the US but it's almost never mentioned in introductory textbooks/courses. (Nice to see you do mention it in your course ː)) And even if many other accents don't make the phonemic difference, it does appear quite commonly phonetically, particularly in the West.

    But I don't think the resistance to the IPA is just due to the /j/. Americans like doing things differently :P

    A few small things about your page btw, Mark: while I can agree with you that the lack of an open central vowel in IPA is annoying (actually they're finally introducing that soon), I'd like to know why you say that the phonetic /a/ is hardly ever found? As far as I know it's a very common sound cross linguistically.. Spanish for instance has a more front than central /a/, and languages like Dutch have a phonemic difference between /a/ and /ɑ/. So you don't have to get into particularly exotic languages to find that.

    Also, /ɐ/ is not open, it's slightly raised. I believe Amharic has that if you want to hear the vowel quality.. it's definitely different from a central open vowel, but not raised enough to be /ə/. An American ear will probably hear it as something in between /a/ and /ə/. The central /a/ is just plain missing :P SIL was using the /α/ (that's an alpha.. this comment thing seems to have issues with my IPA) as a substitute.

    Lastly, if you want a good way to input IPA, I suggest Charis or Doulos SIL fonts, and then Ekaya is a useful (free) program which is a Windows port of the Linux KMFL. It uses the exact same method as Keyman does. You can find the program here:

    and the keyboard layouts and documentation here:

    The SIL page also has some suggestions in case you use a mac, since the input is a bit different.

  16. Angus Grieve-Smith said,

    October 3, 2010 @ 1:26 am

    Yes, the [a] is found all over New England, and I have a native speaker in my intro linguistics class.

  17. bfgray said,

    October 3, 2010 @ 1:51 am

    First, a disclaimer: I am Australian.

    Like Jarek, I teach ESL using British materials. The /aɪ/ transcription has always bothered me because the final sound in my pronunciation is much closer to /i/. On a related point, word-final 'y' is always transcribed /ɪ/, too, which seems completely at odds with my pronunciation, and not much like RP either.

    Personally, I feel that – at least for pedagogical purposes – Randy's suggestion of transcribing the 'potential endpoint' of the glide makes a lot of sense. When I teach lower-level students, I never bother with the 'proper', two-vowel transcriptions of the English 'long vowels' (I opt for something idiosyncratic like 'ã ẽ ĩ õ ũ' instead), because I find that – for the rare ones who consciously try to break it up into two sounds – it just confuses them and they come out with incredibly 'posh', hyper-RP diphthongs which are at odds with the rest of their vowel inventory, and which they can never maintain once their attention drifts.

    As for /j/ and /w/ for glides, I have only ever seen these used in phonemic transcription of French, mainly in syllable onset position, although /j/ is also used in the coda, as in this Wikipedia article:

  18. Jarek Weckwerth said,

    October 3, 2010 @ 5:29 am

    @dw: Are we talking about phonetic or phonemic transcription?

    That's the $100 question, of course. These are all meant to be phonemic, I think, but the problem with using IPA-like systems for phonemic transcriptions is that you need to imply some “leeway”, and there will always be people saying, “Well, this isn’t quite what the IPA chart says…”. People like John Wells will say, in turn, “Read the front matter”, where the relation between the symbols used and the actual productions meant is (should be) explained in e.g. a dictionary. These discussions tend to repeat from time to time (especially once sound changes come to the attention of the, erm, punters) but there doesn’t seem to ever be much agreement of what to do, and the old system remains. (Remember I’m talking from the point of view of British EFL stuff.) Or, if there are changes, they are cosmetic. Or they tend to cause muddled discussion…

    On a related note, for EFL learners it doesn’t really matter what the IPA says; they don’t know much about it (e.g. that it’s supposed to be language-independent – they practically always see it as a more or less helpful recoding of the silly English spelling system, and they have no idea it can be used for their own language, too), and they treat it as phonemic, because a normal person in the street knows very little about things that happen below the phonemic level. So I’m afraid I sort-of agree with Wells etc. that there’s not much point in changing the system that has been used for so long in (at least British) EFL materials even though it’s quite outdated now and does not reflect actual pronunciation by IPA standards.

    @dw: a sequence of vowels [ai] etc. allows us more precision in identifying the endpoint of the glide than a vowel-approximant sequence [aj] does.

    See above. It allows us more precision in phonetic transcriptions. But these (at least those Angus started with in his mail) aren’t phonetic, so no need to be precise.

    @Craig Perlman: I'd frankly be more interested in the use of [a] to head off [aɪ] in words like "right".

    Me, too. It struck me how Mark used [ɑ] for the onglides of both MOUTH and PRICE (these are Wells’s keywords for the two diphthongs). I can see how this can be OK in some types of American English, but in many other varieties (South-East England and Australia being the two main offenders) the onglides are markedly different, with that of MOUTH being front, and that of PRICE – back. And I think this is also correct, these days, of “Standard Southern British English” (formerly known as RP), which led Clive Upton et al. to adopt different onglides for the two in their Oxford Dictionary of Pronunciation. They use /a/ for the vowel of TRAP, and for the onset of MOUTH; generally a good idea, I think. And they use /ʌɪ/ for the vowel of PRICE, which I think is reasonable only in showing that the onglide is different. But since /ʌ/ stands for the vowel of STRUT… I’m not sure. I’d prefer /ɑ/. (True, these days, the two are very close to each other in southern England, being differentiated mainly by length, so maybe it doesn’t matter.)

    Two related things: Notice how /ʌ/, as used for English, is also quite a bit off from the point of view of hardcore IPA. By the standards of IPA, it should stand for a back unrounded vowel, and I think the vowel of STRUT isn’t fully back for a majority of English accents.

    Second, if you want a taste of the discussion about MOUTH and PRICE, among other things, you can check out this article by Jack Windsor Lewis (which is only a starting point), or this piece by John Wells.

    @Iulus: Has anyone ever seen it transcribed as [ae]?

    Yes, [ae], or if you prefer /ae/ is used quite often for German. And Felicity Cox has used /ɑe/ for Australian English (papers on her website).

    @bfgray: On a related point, word-final 'y' is always transcribed /ɪ/, too,

    Depends on which specific book you use. Practically all dictionaries (including pronunciation dictionaries) from the three major publishers now use /i/ for the final vowel of HAPPY.

    This must be the longest comment ever. Apologies.

  19. Jarek Weckwerth said,

    October 3, 2010 @ 5:42 am

    And apologies for all the typos and errors — the input box isn't meant for editing book chapters, it seems ;)

    I'm afraid I've got one more thing to say: Whichever two-element transcription you use, it won't matter for EFL learners. The diphthongs are pretty robust — as long as they're diphthongs. So if "reasonable" communicative pronunciation is your aim, there won't be any harm done whichever way you go.

    But if you want "native like", then I think there isn't a good way of showing what's going on in any transcription, even with marks for non-syllabicity etc. For my Polish students, the main problem is the timing, then the onglide (/a/ really doesn't cut it), and the offglide, I would say, is the least important part (which is not at all to say that it doesn't cause small problems).

  20. Kirk Hazen said,

    October 3, 2010 @ 7:44 am

    I have published on the variations of the vowel in "bide" in a few linguistic venues, and I have had different editors request each of the three representations ( [aj], [ai] or [ɑɪ] ). They are all linguists but follow different editorial norms. I would guess that diphthongs are represented with a wider variety of symbols than any other class of sounds. I would also assume that this variation has been going on for a few decades. As a scholar, I end up reading these symbol variations of the vowel in "bide" as the same vowel.

    As a teacher, I unintentionally cause small panics in my students when I sometimes use [aj] or [ai] or [ɑɪ] in writing. Past study guides and handouts have all the variants in them.

    I suspect the desire to have strictly vowel symbols in the representations of diphthongs is the underlying motivation for the movement toward [ɑɪ]. As long as we reach some consensus, I would be happy with any of them.

  21. Adrian Morgan said,

    October 3, 2010 @ 10:31 am

    I'm wondering whether it would change the balance of pros and cons as to which symbols to use if your dialect included a diphthong like /əʉ/ (as in the Australian HOPE vowel).

    In Australia, the Harrington, Cox & Evans (1997) transcription scheme allocates symbols to phonemes in a strictly empirical way based on the nearest IPA-standard vowel to the average result from a formant analysis study (based on speakers from just one part of the country, I note in passing). For example, the diphthong in SOUND is allocated the symbol /æɔ/ – no room there for a conventionalised pattern for representing closing diphthongs in general. In my opinion, it takes its obsession with empirical purity too far, and because of it has features I don't agree with.

  22. Christian DiCanio said,

    October 3, 2010 @ 11:15 am

    I think there has been somewhat of a change in linguistic traditions with the increase in phonetic knowledge. If you think of glides simply as moving targets with no steady state AND if you think that diphthongs have a more primary vowel target (the first vowel target) and a more secondary one (the latter part), then representing diphthongs with two vowels, as in [aɪ], might seem misleading. It doesn't capture the dynamic nature of the diphthong and it suggests that each vowel target has equal "weight" (duration? starting point? ending point?). This is, in fact, the explanation that I was given when I started to learn phonetics. I was taught that diphthongs like [aɪ] should be transcribed as [aj].

    However, glides are not merely dynamic entities. They may have steady states, most specifically where they are geminate (see Maddieson, 2007). So, we must leave behind the assumption that glides are only defined as dynamic units. Second, there are languages with coda glides (like Polish) where sequences like [aj] are pronounced quite distinctly from languages where there are phonological diphthongs. A careful phonetician would like to distinguish the two. These two arguments make the "two-vowel" approach to diphthong transcription seem very sensible.

    Peter Ladefoged and Ian Maddieson have also long been proponents of transcribing diphthongs with two vowels. Perhaps their influence on phonetics and linguistics (in general) has convinced linguists to transcribe diphthongs this way.

  23. Angus Grieve-Smith said,

    October 3, 2010 @ 11:35 am

    I agree with Kirk Hazen that what's most important is reaching some kind of consensus. But consensus by definition avoids the kind of "Why wasn't I notified about this?" or "Auch ihr! Neue gelbe Schuhe!" feeling that I had when seeing it pop up in one place after another.

  24. Lao Tan said,

    October 3, 2010 @ 12:23 pm

    Could be, perhaps, a hint of tonogenesis?

  25. NW said,

    October 3, 2010 @ 2:29 pm

    I've got the feeling that a lot of the convention is for dictionaries for foreign learners, and they're locking into conventional representations. (Learners don't want to know about Yorkshire /a/ or Canadian raising.) Back in the old days it was good enough to write 'sit' as /sit/ and 'seat' as /si:t/. Pronunciation dictionaries of English later found it profitable to introduce greater detail and different symbols to stop Spanish or Polish learners just using a native /i/ vowel.

    Wells's position seems typical of most dictionary-makers: you need unequivocal marking of the systemic differences more than fine phonetic accuracy.

  26. Jongseong Park said,

    October 3, 2010 @ 3:18 pm

    From what I've seen, narrow transcriptions of various dialects of English using empirical methods like the ones for Australian English mentioned by Adrian Morgan above suggest that on average /aɪ/ and /aʊ/ do not even reach [ɪ] and [ʊ] for many speakers of English in ordinary speech. One could even argue that something like /ae/ and /ao/ might be more realistic transcriptions of these diphthongs.

    But suppose (for the sake of simplicity, but not too unrealistically) that an individual's realizations of these diphthongs can vary from [aɛ, aɔ] to [ae, ao] to [aɪ, aʊ]. Would it be best to take the average range of [ae, ao] as representative? Or the maximal range of [aɪ, aʊ]? Taking into account the smoothing effect that fast speech has on diphthongs, it may make more sense to think of /aɪ, aʊ/ or even /ai, au/ as the psychological underlying form of the diphthongs. We may think we are pronouncing /i/ and /u/ at the end, but in reality the diphthongs get cut off somewhere along the way before they quite reach those goals.

    I'm happy with /aɪ/ and /aʊ/, as they seem to correspond pretty well with my own maximal diphthong pronunciations—/ai/ and /au/ sound unnaturally exaggerated for English.

    I've never found transcriptions like /aj/ and /aw/ satisfactory. Inexactness is not the main issue; we could agree to use /j/, /w/, and /ɥ/ broadly as any glides in the general vicinity of their cardinal values. So for example [ɔʏ] in German would be transcribed as /ɔj/, Chinese [ɑo] as /ɑw/, Dutch [ɐʏ] as /ɐɥ/ (or should the German diphthong be /ɔɥ/?). I am not happy with the loss of phonetic detail, but that's not my main objection.

    The real problem is that they make it look like we're dealing with vowel + consonant combinations on the phonemic level. English already has /j/ and /w/ that behave as consonants in words like 'yet' /jet/ and 'wet' /wet/. Reusing these symbols for transcriptions like /aj/ and /aw/ makes it look like these consonants can come in syllable codas in English, so that we should be explaining why we don't get /æj/, /ew/, etc.

    I think we should reserve /w/ and /j/ for where it makes sense to analyse them as independent phonemes, as in Swedish hej /hɛj/ and Polish Wołga /ˈvɔwɡa/. Otherwise, we should stick to vowel symbols for phonemic diphthongs.

    I was always taught that the most straightforward way to write American diphthongs is [aj] and [aw], and the "long" mid vowels as [e] and [o].

    Wouldn't it be more consistent to write the mid vowels as [ej] and [ow]?

  27. Angus Grieve-Smith said,

    October 3, 2010 @ 5:38 pm

    The most consistent is not always the most straightforward.

  28. Taylor Selseth said,

    October 3, 2010 @ 10:20 pm

    When I first started being interested in English Phonetics I was confused for a short while with /eɪ̯/ and /oʊ/ because, as is typical in the Upper-Midwestern US, I have monophthong realization for those vowels, [eː] and [oː].

  29. Alon Lischinsky said,

    October 4, 2010 @ 5:15 am


    I'm a bit confused – to my knowledge, [j] and [w] have never been standard representations for offglides in any language, and diphthongs have always been represented as series of vowel sounds. Whether in English, German, or the Romance languages, I always see falling diphthongs transcribed, for example, as [aɪ] or [ai].

    Contrarily, I have only rarely seen a Spanish falling diphthong represented as anything but [aeiou][jw], and then only from Peninsular authors.

  30. Jongseong Park said,

    October 4, 2010 @ 10:37 am

    In French you have cases like paille /paj/, but then the French /j/ is not a pure offglide, often having a fricative quality almost like a [ʝ].

  31. mollymooly said,

    October 4, 2010 @ 11:59 am

    Semirelevant John Wells post recently on linking semivowels.

    OTOH e.g. "view" is always transcribed /vju/, not /viu/ or /vɪu/, which is just a conspiracy to make people think the vowel is a sequence of two segments instead of a diphthong.

    Talking of strong utterances: How would one transcribe the refrain of that awful Dolly Parton/Whitney Houston song?

  32. Angus Grieve-Smith said,

    October 4, 2010 @ 1:15 pm

    You mean this song, Molly?

  33. KCinDC said,

    October 4, 2010 @ 2:40 pm

    On a related point, word-final 'y' is always transcribed /ɪ/, too, which seems completely at odds with my pronunciation, and not much like RP either.

    It does seem unusual, but then the president of the United States pronounces it that way.

  34. Dw said,

    October 4, 2010 @ 6:06 pm

    A closely-related question is how to transcribe a word like "star" in rhotic North American accents.

    The more traditional solution has been something like [stɑɹ] or [stɑɻ], i.e. as a closed syllable ending in an approximant, while an alternative is to transcribe the vowel as a diphthong ending in an R-colored vowel: e.g. [stɑɚ] or possibly [stɑɑ˞]. At a phonetic level, there isn't much difference in the implications of the two transcriptions. Using [ɹ, ɻ] gives more information about the degree of retroflexion in the tongue, while using R-colored vowels gives more information about the underlying position of the base of the tongue.

    At a phonemic level there may be arguments for the vowel to be either one or two phonemes. I think the arguments for a sequence of two phonemes are generally stronger in words like "star" than in words like "sty".

  35. Dw said,

    October 4, 2010 @ 6:31 pm

    On a related point, word-final 'y' is always transcribed /ɪ/, too, which seems completely at odds with my pronunciation, and not much like RP either.

    This used to be used in traditional RP. It's mostly been replaced in contemporary RP by /i/, a shorter version of the FLEECE vowel. See

  36. Dhananjay said,

    October 4, 2010 @ 7:12 pm

    @Angus, "Back to observation #1, I had no idea that [j w ɰ] were only supposed to be used for onglides. Seems pretty arbitrary to me. I guess if we wanted to be consistent we would abolish those three and just use the vowel symbols. But then how do you show which one is the nucleus?"

    The non-syllabic diacritic, which Mark Liberman mentions and depicts above.

  37. Dw said,

    October 4, 2010 @ 8:12 pm

    OTOH e.g. "view" is always transcribed /vju/, not /viu/ or /vɪu/, which is just a conspiracy to make people think the vowel is a sequence of two segments instead of a diphthong

    What makes you think that it is a diphthong? Would you say the same thing about word-initial /ju/, as in "unique"?

  38. Robert Coren said,

    October 4, 2010 @ 8:24 pm

    @Angus: [oʊ maɪ]! Thanks for the pointer to the IPA character picker. I've been wanting something like that for years.

  39. Lazar said,

    October 4, 2010 @ 10:19 pm

    Dw: I've never been able to figure out whether English /ju:/ should count as a diphthong or as two segments. Words like "few" and "cue" (places where /j/ cannot otherwise be paired with a vowel) seem to suggest the former, whereas words like "you" and "unique" seem to suggest the latter.

  40. mollymooly said,

    October 5, 2010 @ 6:09 am

    @dw: I was channelling Peter Ladefoged, but there was static on the line. Historically /ju/ was unequivocally a vowel, but phonotactically as Lazar says it points both ways. Pronunciation shifts like /ˈdʒæɡjuə/ → /ˈdʒæɡwɑ/ reinforce the trend to resegmentation.

  41. Rodger C said,

    October 5, 2010 @ 8:26 am

    @Mollymooly: Or as Kentucky NPR announcers carefully say it, /ˈdʒæɡwaiɹ/.

  42. Rodger C said,

    October 5, 2010 @ 9:26 am

    Or at least one such announcer I've heard repeatedly.

  43. dw said,

    October 5, 2010 @ 11:06 am


    Sure, current /ju/ is derived from the historical falling diphthong /iu/. However, with respect to the current language (excluding dialects such as Welsh English that may preserve the falling diphthong), it seems pretty hard to make a case that /ju/ doesn't begin with a consonant. After all, we all say "a uniform" rather than "an uniform".

  44. dw said,

    October 5, 2010 @ 11:08 am

    Pronunciation shifts like /ˈdʒæɡjuə/ → /ˈdʒæɡwɑ/

    I've only ever heard this in the US (where there would usually be a final /r/). Are the Brits doing it too?
