What 'IPA' means now…


I have mixed feelings about the International Phonetic Alphabet. It's good to have standard symbols for representing phonological categories across languages and varieties. You need to know the IPA in order to understand books and papers on many speech-related subjects, as well as for practical things like learning to sing the words of songs in languages you don't know. And the IPA is certainly better than the various clunky alternatives for (symbolic) dictionary pronunciation fields. So I teach it in intro courses.

But there are two kinds of objections to it:

  1. Some of its representational choices are weird or worse: reserving /a/ for a vowel category that's rare in the world's languages, while assigning /ɐ/ to what is probably the commonest vowel category of all; insisting that affricates must be a sequence of two symbols (like /tʃ/), even though they function as single phonemes in many languages; and so on.
  2. It encourages the fiction that phonetic interpretation should be represented symbolically, rather than as a mapping from symbols to signals.

The first problem has resulted in some systematic rebellions, like the preference of Americanists for affricate symbols like /č/. [Update — See Sally Thomason's post "Why I don't love the International Phonetic Alphabet", 1/2/2008, as she reminds us in the comments…]

The second problem causes significant confusion in the phonological and sociolinguistic literature. For some motivation, see "On beyond the (International Phonetic) Alphabet", 5/19/2018, and "Farther on beyond the IPA", 1/18/2020. For a deeper dive, see my 2018 book chapter "Towards Progress in Theories of Language Sound Structure".

But yesterday, as I revised some of the lecture notes for my intro course, I was surprised, and amused, to find that the Google search I suggest there, for "IPA tutorial", now yields pages of references to "Ingenuity Pathway Analysis", where Pathway Analysis is "the term from molecular biology for a curated schematic representation of a well characterized segment of the molecular physiological machinery", and Ingenuity Pathway Analysis is proprietary software from the company formerly known as Ingenuity (now QIAGEN).

It's incidentally interesting how many random choices from the set of (26^3 = 17,576) 3-letter sequences are internet-meaningful, often in several ways. Here are five (literally) random examples: DNU, GWI, YTF, AZO, GTO. I haven't checked the whole set, but there may well not be any orphan members.

[If you want to continue the exploration yourself, point this program at a file with one letter per line. Or this R sequence:

Alphabet <- strsplit("abcdefghijklmnopqrstuvwxyz", "")[[1]]  # split into single letters
sample(Alphabet, size = 3, replace = TRUE)  # draw three letters, with replacement

]
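
To make the arithmetic above concrete, here is a minimal R sketch that builds the whole set of three-letter sequences, confirms the count of 26^3 = 17,576, and draws five at random. The names grid and tla are mine, chosen for illustration, and this is not necessarily how the five examples quoted above were generated.

```r
# All 26^3 three-letter sequences, built from R's built-in `letters` vector.
# `grid` and `tla` are illustrative names, not anything from the post.
grid <- expand.grid(first = letters, second = letters, third = letters,
                    stringsAsFactors = FALSE)
tla <- toupper(paste0(grid$first, grid$second, grid$third))

length(tla)         # 17576
anyDuplicated(tla)  # 0, i.e. no sequence appears twice

# Draw five sequences uniformly at random (results differ from run to run).
sample(tla, size = 5)
```

Because the draw is random, each run gives a different five; calling set.seed() beforehand would make a particular draw repeatable.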



33 Comments

  1. Dick Margulis said,

    September 28, 2022 @ 8:46 am

    I expected a disquisition on India pale ale, which has evolved into a euphemism for "I can drink bitterer swill than you can" and has nothing whatsoever to do with India, but I digress.

    [(myl) The connection has been noticed in an early LLOG post, of course. And (as I'm sure you know) there's a historical connection with India.

    But "swill"?]

  2. ycx said,

    September 28, 2022 @ 9:20 am

    Before I learned about India pale ale (mentioned above), my default thought when seeing IPA was either the linguistics one in the post, or isopropyl alcohol.

    [(myl) Neither of those common interpretations previously interfered with the link in my lecture notes, since tutorials for India Pale Ale and/or isopropyl alcohol are not widely popular.]

  3. Tim Leonard said,

    September 28, 2022 @ 9:27 am

    It's worth reading the Wikipedia entry for TLA (Three-letter acronym). https://en.wikipedia.org/wiki/Three-letter_acronym

  4. Jonathan Downie said,

    September 28, 2022 @ 10:56 am

    There are not one but TWO hilarious SpecGram posts, one on the incipient colonialism of IPA: https://specgram.com/CLXXXVII.4/05.workaday.ipa.html and one on the difference between IPA and beer: https://specgram.com/CLXXXVIII.1/03.nuz.baits28.html

  5. Coby said,

    September 28, 2022 @ 11:18 am

    I believe that the representation of affricates as stop-fricative sequences is due to the IPA's French origin. The IPA is, after all, originally the creation of a French association of foreign-language teachers intended to help French students acquire a working knowledge of foreign languages. Since French lost its affricates in the transition from Old to Middle French (English words like judge or chase come from Old French), a stop-fricative sequence is the best that French-speakers can do. They would be very unlikely to catch the difference between "catch it" and "cat shit"; just listen to the way they say words like tsar, tchèque, pizza or jazz.

    And while the IPA allows ligatures that help mark the difference, for some reason English-language lexicographers who use IPA don't take advantage of this option.

  6. Chris Button said,

    September 28, 2022 @ 11:24 am

    I wish they had ligatures for affricates.

    I also miss a dedicated symbol for a pharyngeal approximant. I appreciate that fricative/approximant differences get murky back there, but it is a hugely influential component of (Pulleyblank's) Middle Chinese and indeed his analysis of Mandarin phonology.

  7. Sally Thomason said,

    September 28, 2022 @ 12:03 pm

    Boy, do I agree with you on your point #1! For reasons different from the slight weirdness: see my fieldworker's lament:
    http://itre.cis.upenn.edu/~myl/languagelog/archives/005287.html

  8. Chris Button said,

    September 28, 2022 @ 12:51 pm

    Frankly I struggle with the IPA vowel chart in general. There is so much that doesn't really hang together in terms of what constitutes a cardinal point, why, and what the relationship between them is.

  9. Daphne Preston-Kendal said,

    September 28, 2022 @ 1:38 pm

    To point 1: the IPA as defined is a phonetic, not a phonemic alphabet. The use of a particular symbol for a particular phoneme can be (and usually is) thus based on factors like familiarity of a symbol and how likely a native speaker of language A is to produce a sound recognizable to native speakers of the language they are learning. The use of as a phonemic symbol for what in many languages might phonetically be [ɐ] (or more likely something more central and lower) is thus entirely allowable. A more infamous case, which bugs pedants no end, is the customary use of for the rhotic consonant in languages where it’s almost never realized as an alveolar trill, like English, but especially French and German which use an uvular trill, if a trill at all.

    To point 2: You’re perfectly at liberty to use a tie bar over affricates to emphasize their connectedness, like . The other way to distinguish an affricate from a stop plus fricative is to put a syllable dot between them if appropriate: .

  10. Daphne Preston-Kendal said,

    September 28, 2022 @ 1:43 pm

    My ASCII angle brackets got misinterpreted as HTML. Let me try again with Unicode ones:

    To point 1: the IPA as defined is a phonetic, not a phonemic alphabet. The use of a particular symbol for a particular phoneme can be (and usually is) thus based on factors like familiarity of a symbol and how likely a native speaker of language A is to produce a sound recognizable to native speakers of the language they are learning. The use of ⟨a⟩ as a phonemic symbol for what in many languages might phonetically be [ɐ] (or more likely something more central and lower) is thus entirely allowable. A more infamous case, which bugs pedants no end, is the customary use of ⟨r⟩ for the rhotic consonant in languages where it’s almost never realized as an alveolar trill, like English, but especially French and German, which use a uvular trill, if a trill at all.

    To point 2: You’re perfectly at liberty to use a tie bar over affricates to emphasize their connectedness, like ⟨t͡ʃ⟩. The other way to distinguish an affricate from a stop plus fricative is to put a syllable dot between them if appropriate: ⟨t.ʃ⟩.

  11. Coby said,

    September 28, 2022 @ 2:25 pm

    Daphne Preston-Kendal: The IPA was defined as "phonetic" at a time when the word "phonemic" didn't exist, but it seems clear from your comment that you regard it as phonemic, and I agree. But I also think that it's neither "international" nor an "alphabet". For one thing, it does not relate to nations but to languages (or language varieties), and the link between "nation" and "language" is tenuous at best. For another, alphabets are usually finite sets of symbols arranged in… well, alphabetic order, while the IPA is neither; it should really be called a code. The entity also known as IPA (International Phonetic Association) is maintaining obsolete 19th-century terminology.

  12. Paul Garrett said,

    September 28, 2022 @ 5:20 pm

    Definitely "India Pale Ale"… was my first reaction. The rest was boring. :)

  13. Bob Ladd said,

    September 29, 2022 @ 1:20 am

    How many of the 17,576 three-letter sequences refer to airports (in addition to whatever else they may mean)?

  14. Philip Taylor said,

    September 29, 2022 @ 2:08 am

    Possibly all of them, since (according to the Central Intelligence Agency), there are over 41,700 airports all over the world. Of course, it is quite probable that not all of these will have TLAs, but nonetheless …

  15. Antonio L. Banderas said,

    September 29, 2022 @ 3:48 am

    @Sally Thomason

    How come you did not use a háček to spell the words with háčky?

  16. Kirit said,

    September 29, 2022 @ 5:16 am

    There's the "GPLed TLA FAQ" (or GTF), which is a database of TLAs: https://jtv.home.xs4all.nl/gtf/.

    As for the airports thing, the three letter codes are used for ticket bookings only (IATA codes) and aviation uses 4 letter ICAO codes.

    Not all of the IATA codes are actual airports: for example, LON can be any London airport (useful when you want to book a ticket there). The codes also get reassigned as airports close or open: BKK used to refer to Don Mueang airport, which became DMK (ICAO: VTBD), and now refers to Suvarnabhumi (ICAO: VTBS).

  17. Brian said,

    September 29, 2022 @ 6:42 am

    I recall taking an entry-level Linguistics MOOC (think Linguistics 101). IPA was never discussed but, a few weeks into the course, it was used liberally with no explanation or introduction. I dropped that course immediately. Perhaps there is a straightforward intro to IPA with sound?

  18. Francisco said,

    September 29, 2022 @ 1:51 pm

    Bob and Philip: according to the IATA website more than 11,000 three-letter identifiers (of the 17,576 possible) have already been assigned. There are far more airports and aerodromes in the world than there are three-letter combinations, but the IATA codes pertain to the air transportation business rather than to the aeronautical infrastructure per se. For the latter there are the four-character ICAO location identifiers. Every airport, aerodrome and airfield has its unique ICAO code (except in the US, where some aerodromes may have an FAA identifier instead).

  19. Philip Taylor said,

    September 29, 2022 @ 3:26 pm

    I wonder whether the IATA will avoid assigning TLAs with particularly worrying meanings — I am thinking of, for example, QAK ("Is there any risk of collision ?").

  20. Chas Belov said,

    September 30, 2022 @ 12:48 am

    @Daphne Preston-Kendal re:

    "You’re perfectly at liberty to use a tie bar over affricates to emphasize their connectedness, like ⟨t͡ʃ⟩. The other way to distinguish an affricate from a stop plus fricative is to put a syllable dot between them if appropriate: ⟨t.ʃ⟩."

    The tie bar is rendering in my browser to connect the space before the opening angle bracket with the t, not the t with the ʃ.

    "If appropriate" is ambiguous to me. ¿Would the following correctly reflect what you are saying? "The other way to distinguish an affricate from a stop plus fricative is to put a syllable dot between them if indicating they are separate: ⟨t.ʃ⟩."

  21. Philip Taylor said,

    September 30, 2022 @ 12:59 am

    The incorrect placement of the tie bar in ¿most? browsers appears to be a result of the styling information associated with the page. Turn off page styling and it appears correctly in my browser.

  22. ~flow said,

    September 30, 2022 @ 7:03 am

    I've always found it unfortunate that the IPA uses symbols like /a/ and /r/ to refer to very specific variants of these 'sound classes' (low vowels and rhotics, or whatchamaycallem) because often what you want is a rather broad transcription and can forego finer details for the sake of simplicity. One could make the case that *no* letter of the Latin alphabet should be used for a single, specific, narrowly defined sound but rather for a class of sounds (like alveolar and dental /n/ and so on), but I've never explored what that could look like.

  23. Chas Belov said,

    September 30, 2022 @ 6:51 pm

    @Philip Taylor: Thank you for the suggestion. Alas, displaying the page without page styling had no restorative effect (or indeed any effect). It looks like a browser issue: fails in Firefox, works in Safari. Guess I have a bug to report to Mozilla.

  24. Chas Belov said,

    September 30, 2022 @ 7:28 pm

    The issue was filed with Mozilla 11 years ago and has had minimal activity in the last several years. I've added a note to it, which hopefully will prompt follow-up.

    Here's the bug, for those who are into open source:
    Bug 543200: font fallback should try to use the same font for a complete character cluster or word

    And, based on the existing bug notes, one of the following might work if Language Log's comment system allows the HTML:
    ⟨t͡ʃ⟩ or
    ⟨t͡ʃ⟩

  25. Chas Belov said,

    September 30, 2022 @ 7:29 pm

    And it doesn't.

  26. Chas Belov said,

    September 30, 2022 @ 7:37 pm

    It's odd. Verdana works in my Mac's TextEdit program and displays the cluster correctly. The CSS for Language Log is odd, however. font-family is given as "sans-serif, Verdana, Arial, Helvetica" where I would have expected "Verdana, Arial, Helvetica,sans-serif" as "sans-serif" is not a font, it's a fallback, so it's kind of potluck as to what the site visitor gets based on their browser's defaults.

    I respectfully suggest that whoever is maintaining Language Log's style sheet replace:

    font-family: sans-serif, Verdana, Arial, Helvetica;

    with

    font-family: Verdana, Arial, Helvetica,sans-serif

    Since Verdana supposedly works for that cluster, I would expect it to fix the issue. If making that change in the site CSS doesn't fix the issue, then I would need to add a note to that effect to a different Firefox bug, Bug 1500531: many combining underlines and diacritical marks rendered wrong, especially in monospace fonts.

  27. Chas Belov said,

    September 30, 2022 @ 7:41 pm

    ¡Oops! I mean:

    font-family: Verdana,Arial,Helvetica,sans-serif;

    Gotta have that trailing semicolon.

  28. Philip Taylor said,

    October 1, 2022 @ 2:11 am

    Chas — " Alas, displaying the page without page styling had no restorative effect (or indeed any effect). It looks like a browser issue: fails in Firefox, works in Safari. Guess I have a bug to report to Mozilla".

    Odd. My browser is also Mozilla-based (Seamonkey 2.53.14), and turning off page styles via View / Use Style / causes the over-tie to appear correctly. In Firefox I can find no way to turn off page styling, but telling Firefox not to allow sites to override my choice of fonts does have the desired effect.

  29. Chas Belov said,

    October 1, 2022 @ 2:45 am

    @Philip Taylor: Thank you for the tip. I checked my preferences and I see that I already don't allow websites to override my font selection in Firefox. I previously had fonts set to Arial Unicode MS and switched it to Verdana, to no avail. So this mystery remains.

  30. Philip Taylor said,

    October 1, 2022 @ 4:50 am

    OK, here I see :

    Fonts for Latin : Times New Roman / Arial / Consolas. For "Other writing systems", the same. Default is Serif. Having turned off "Allow sites to override", and using the DOM inspector, I see that that particular stretch of text is an a element, with styles

    font-size: 1em; line-height: 1.5em; margin: 1.2em 0; list-style-type: none; text-align: left; (for too many --wp--presets to report); font-family: Verdana, Arial (underlined), Helvetica, sans-serif; color: #333;

    and that's about it.

  31. Philip Taylor said,

    October 1, 2022 @ 4:59 am

    […] is a <p> element …

  32. Martha said,

    October 1, 2022 @ 10:43 am

    As an ESOL teacher, depending on the situation/students/textbook, I will sometimes introduce IPA in pronunciation instruction. Having something like /tʃ/ rather than a single symbol helps. You can identify the two parts and have the students mash them together.

    When I first learned IPA in college, we were taught to use tie bars with affricates and diphthongs. I can't say for sure because I don't have one in front of me, but I'm fairly certain I've used textbooks that also use them.

  33. Jonathan Badger said,

    October 3, 2022 @ 2:28 pm

    It's interesting that you chose R as the language for your code snippet as R is commonly used in bioinformatics where the "wrong" meaning of IPA in the pathway sense would be common. Is R commonly used in linguistics as well?
