On the difficulty of saying what a word is

Sophie MacDonald asks:

I have been an on-and-off reader of Language Log for several years, and have always enjoyed your contributions, though I’m not a linguist. I do work on formal language theory sometimes, but very much within mathematics and computer science, not linguistics.

Recently, a music theorist colleague asked me for help with a question. She is engaging with the body of literature that applies linguistic ideas and methods to the study of music, and she is in particular working with the idea that it is hard to give a definition of a chord or a melodic phrase that actually makes sense within musical practice. She was asking for linguistic sources indicating the difficulty of saying what a word is, which might be useful for the point she is making.

I remember that you have written on LLog before about some Chinese students’ difficulties in distinguishing between words and characters, so you were the first person I thought of. If you have any sources on this idea—the ambiguity of what a word is—or any ideas of where I could look, I would be very grateful.

Of course, we could take the easy way out and give an encyclopedia or dictionary definition for "chord":

A chord, in music, is any harmonic set of pitches / frequencies consisting of multiple notes (also called "pitches") that are heard as if sounding simultaneously.


And for "melodic phrase":

a succession of notes forming a distinctive sequence. synonyms: air, line, melodic line, melody, strain, tune.


I sense that what Sophie and her colleague are after is something that is much greater and deeper than a dictionary definition.

As we might say in a Chinese sort of way, after all (dàodǐ 到底) what IS a chord?  What IS a musical phrase?

The ambiguity of words.  The fuzziness of reality.


  1. Mike Grubb said,

    January 20, 2023 @ 9:57 am

    Oh boy. Is what is of interest for this discussion "word" as a discrete phonological unit, a discrete graphical representation, a constitutive element of discrete semantic units, a discrete grammatical unit, or a Platonic concept? Or, for that matter, a representation of a theological First Cause? Or the constitutive elements of performative speech acts? Or artifacts of our social species' capacity for localized consensus-building? Or something phenomenologically experienced, like pornography, that we struggle to define but we know it when we "see" it? I think I need some more caffeine.

  2. Victor Mair said,

    January 20, 2023 @ 11:32 am

    Merci beaucoup!

  3. Olaf Zimmermann said,

    January 20, 2023 @ 1:12 pm

    Suprasegmental phonology, anyone? (Just asking … twinkle twinkle little bat …) D'ailleurs, en quoi se distinguent les calembours/puns/Wortspiele? Quoth John Lennon: "Mersey beaucoup!"

    There's a reason why computer[s]ized speech processing is a flop, in spite of all the current hype. Re-reading Boileau and Goethe's musings about language, there are many gems (as well as much rubbish – but you soon find how to pick out the raisins) to be found.

  4. Victor Mair said,

    January 20, 2023 @ 2:35 pm

    Merveilleuse, Olaf Zimmermann! Wunderbar!

    "Words in Mandarin: twin kle twin kle lit tle star" (8/14/12)

  5. cameron said,

    January 20, 2023 @ 2:40 pm

    it's a question of realism or nominalism. modern philosophers of science usually refer to realism vs anti-realism.

    words are like atoms, neutrons, or positrons. are they real things in the world? or are they abstractions that are part of an explanatory system that models the world?

    writing systems are sets of conventions that act as mechanisms for modeling natural languages. a "word" is an abstraction that's part of many writing systems. a word, in an alphabetic language, is a group of characters that are not separated by whitespace on the page. sometimes where the whitespace goes seems "natural", but that's because we're so used to it. why, for instance, do we write "wouldn't", rather than "would n't" – there's no reason, that's just the convention. we write it as one "word" rather than two because of the accidents of history.

    in Chinese the word isn't an abstraction that's traditionally part of how they think about their writing system. it's not part of the set of conventions they've adopted for writing the language. their writing system is at root a syllabary, and so they think of the character/syllable as the basic unit of writing, not the letter or the word; they don't have the former, and so haven't needed the latter.

    linguists tend to prefer to discuss spoken language, rather than written language. hence the set of abstractions they favor runs along the lines of phonemes and morphemes, and suchlike. there's no need to define a "word", which is such an arbitrary convention of the written language.

    pop (rock, jazz, and country) musicians often think of music in terms of chords. classical musicians think of music in terms of the notes on the page, written in standard notation. the chord sequence isn't something a classical performer needs to think about. they just play the notes as written. to the classical performer the chord sequence, if there is one, is part of a musicological analysis – an abstraction irrelevant to performance or enjoyment of the music.

    of course, classical composers and arrangers often very much think of chord sequences, because they're the ones writing down the notes for the performers to play, and they're dealing with the music in the abstract, not actually playing it.

    you can see the difference in the notation used. classical music is generally just written with notes and rests on the staff – no chord symbols. pop music is often written down as a piano arrangement with chord symbols written above the staff. that's what you get if you buy the published sheet music for a song. sometimes it's stripped down to what's called "fake book" notation – just a single-note melody line with chord symbols above the staff.

    arrangers will often give some players charts with just chord symbols, no melody at all. studio musicians often use what's called "Nashville notation" where the chords aren't spelled with their letter names, just with their numbers relative to the key. that's another level of abstraction, a high-level model of an arrangement that's valid whatever key the singer ends up being comfortable in.
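The number-based chart described in the comment above can be illustrated with a short sketch. It is only an illustration under simplifying assumptions: the function name `realize_chart`, the "6m" spelling for a minor chord, major keys only, and the flats-only note spelling are all choices made here, not part of any standard.

```python
# Hypothetical sketch: expand number-style chord symbols into letter names.
# Assumptions: major keys only, flats-only spelling, single-digit scale degrees.
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]  # semitones above the tonic for degrees 1-7


def realize_chart(chart, key):
    """Turn a chart like ["1", "6m", "4", "5"] into chord names in a major key."""
    tonic = NOTE_NAMES.index(key)
    chords = []
    for symbol in chart:
        degree = int(symbol[0])   # scale degree, 1 through 7
        quality = symbol[1:]      # anything after the digit, e.g. "m" or "7"
        root = NOTE_NAMES[(tonic + MAJOR_SCALE_STEPS[degree - 1]) % 12]
        chords.append(root + quality)
    return chords


chart = ["1", "6m", "4", "5"]
print(realize_chart(chart, "C"))  # ['C', 'Am', 'F', 'G']
print(realize_chart(chart, "G"))  # ['G', 'Em', 'C', 'D']
```

The same chart yields different letter names in each key, which is the sense in which the numbers are a higher-level model of the arrangement.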

  6. Jonathan Ginzburg said,

    January 20, 2023 @ 6:22 pm

    It's also useful to note that, as far as spoken language goes, the notion of a word is indeed an abstraction. B, wandering in the kitchen looking for the scissors, can produce the utterance "Where are the sci…" stopping in "mid word" when they locate the scissors. This is a normal (and self-attested) utterance, additional variants of which can be found in Brennan and Schober 2001 (where they occurred under experimental conditions, e.g., Move to the yel- purple square or Move to the yel- uh purple square.) Thus, self-repair, which is (on at least one view) a component of grammatical competence (and incontrovertibly an intrinsic feature of real language use), does not respect "word integrity".

  7. JPL said,

    January 20, 2023 @ 8:27 pm

    For the notion of 'word' as a technical term used in linguistics, a good place to start might be RMW Dixon and Alexandra Aikhenvald (eds.), Word: A Cross-Linguistic Typology, 2002, published I think by Cambridge Univ. Press.

    The locus classicus for a definition is probably Bloomfield's "postulates" article (1926), where he defines 'word' as referring to a "minimal free form", a "free form" being a form that can stand alone as a meaningful utterance. The difficult cases are provided by the Native American languages of the Pacific Northwest, in which the minimal free forms can have the internal structure of a full sentence. (And so the forms expressing the major components of the proposition (e.g., nominal and verbal) are bound forms, and yet would still be regarded as instances of lexemes.) Linguists also make the distinction between words, which are what appear in texts, and which are instances of categorical structures belonging to the system; and lexemes, which are the general categorical structures and belong to the lexicon (in the language system). The term 'form' (in Bloomfield's account) refers to the part of a Saussurean sign (as an element of a text) that is the perceivable indicator of sameness and difference. The term 'word' used to refer, e.g., in Bloomfield's time, to just the perceivable part of the sign, but nowadays and in more general usage, it seems to refer to the whole Saussurean sign, including the "meaning" part. This is just a basic start, since a linguist will regard the phenomenon of the "minimal free form" as an open-ended problem.

  8. David Deden said,

    January 20, 2023 @ 8:45 pm

    Words are like atoms? No, no, no. They are molecules. Yes. They are made up of atoms, but they are not atoms. Words began as utterances, differentiated by pauses, pitches, some associated primal meaning. They were just noise 99% of the time, but because that 1% gave advantage to those who could decipher associations, it became a functional tool of sociality, just as random sparkings became a functional tool. I think.

  9. Jerry Packard said,

    January 20, 2023 @ 8:47 pm

    Pages 7-14 of my book ‘The Morphology of Chinese’ give a summary of how words are defined in natural language (with focus on Chinese but more broadly applicable to language in general) including lexical, orthographic, sociological, phonological, syntactic, semantic and psycholinguistic. The most common definition is the syntactic definition, namely, a free morpheme – aka a minimal free form. That definition is syntactic because it is a minimal free form that can fill a terminal syntactic slot. In reality, as Mike Grubb, cameron and Jonathan Ginzburg point out, it is an abstraction.

  10. Antonio L. Banderas said,

    January 20, 2023 @ 9:45 pm

    Is it of any pedagogic relevance for Second Language Acquisition?

  11. Jerry Packard said,

    January 20, 2023 @ 10:20 pm

    It is, because second language acquirers import the concept from their first language, and because the second language being acquired usually uses ‘word’ as a primary operative concept.

  12. David Marjanović said,

    January 21, 2023 @ 8:01 am

    words are like atoms, neutrons, or positrons. are they real things in the world? or are they abstractions that are part of an explanatory system that models the world?

    Let me just mention that this is 2023, not 1923. Atoms, neutrons and positrons are real things in the world, they are not abstractions; I can bring you an atom and a neutron on a plate (except the neutron has a measured mean lifetime of a bit less than a quarter of an hour) and a positron in a magnetic trap.

    I like the analogy of words to molecules: they seem obvious at first, but then you get into complex cases with different kinds of bonds…

  13. Terry Hunt said,

    January 21, 2023 @ 4:13 pm

    To illustrate one problem of defining a chord, some musicians insist that it has to contain at least three different notes, while others (particularly electric guitarists) make use of "power chords" which contain only a root note, a note that is an interval of a fifth above the root, and notes that are an octave above or below these two so do not count as 'different' from them. The first-mentioned musicians view this as a mere interval between two notes rather than a chord.

    Such a "power chord" is neither major nor minor, which can be musically useful. Moreover on a guitar – usually tuned to 'equal' temperament, a compromise that makes most intervals very slightly 'out of tune' in order to be approximately right in any key – intervals of a fifth are usually closer to exact or 'just' temperament than any others, so power chords sound a little purer and more forceful than other chords with less closely matching harmonics.
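The tuning claim in the comment above can be checked with a little arithmetic. The sketch below compares equal-tempered intervals with their simple-ratio ("just") counterparts, measured in cents (1200 cents per octave). The choice of intervals and the function names are assumptions made for illustration.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents = one octave)."""
    return 1200 * math.log2(ratio)


# Interval name -> (semitones in equal temperament, just-intonation ratio)
INTERVALS = {
    "perfect fifth": (7, 3 / 2),
    "major third": (4, 5 / 4),
    "minor third": (3, 6 / 5),
}


def deviation_in_cents(semitones, just_ratio):
    """How far the equal-tempered interval lies from the just ratio, in cents."""
    equal_ratio = 2 ** (semitones / 12)
    return cents(equal_ratio) - cents(just_ratio)


for name, (semitones, just_ratio) in INTERVALS.items():
    print(f"{name}: {deviation_in_cents(semitones, just_ratio):+.2f} cents")
```

Under these definitions the equal-tempered fifth is about 2 cents narrower than 3:2, while the major third is roughly 14 cents wider than 5:4 and the minor third roughly 16 cents narrower than 6:5, which is consistent with the point about fifths matching more closely than thirds.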

  14. JPL said,

    January 21, 2023 @ 6:30 pm

    Another kind of question that may be relevant is, e.g., are the forms that express what are called "inflectional" distinctions (e.g., case, aspect, tense, number, etc.) bound or free? If they are free forms, they can be regarded as words, even though in other more familiar languages such distinctions are expressed by bound forms. The question of whether given forms are free or bound is decidable with relation to the empirical facts for that language, including phonological variations and how speakers of the language view them. (Of course a minimal free form can be internally complex. The term for a minimal form is 'morph'; Bloomfield used the term 'morpheme' for this, but, in keeping with the system-text distinction, the term 'morpheme' properly applies to the element of the categorical system, not the instance in a text. But these days (as opposed to the more Machian "Structuralist" period), the terms 'morph', 'word', 'sentence', etc. seem to be used for expressions, i.e., Saussurean signs; if you want to refer to only the perceivable index, one could still use the term 'Bloomfieldian form'.)

  15. Sophie MacDonald said,

    January 26, 2023 @ 4:45 pm

    Thanks everyone for your suggestions! My friend and colleague (Alison Stevens, in Edinburgh, https://www.alisonnicole.com) and I have a number of things to look up now.

    A good example of the kind of thing that Alison is interested in is the argument sometimes made, that "today's music" is worse, specifically less interesting and less expressive, than music of an earlier time because today's music uses fewer chords. She points out that even to assign meaning to such an assertion, you need a way of counting the distinct chords in a piece of music, which requires you to say what a chord is and what a piece of music is. For example, how many distinct chords are in the rhythm changes? It depends very much on the performance and on the analysis.

    The analogy to language here is the way in which people sometimes try to draw inferences, and make value judgments, about different languages based on assertions of the form "language X has more words than language Y". To do anything with that kind of assertion, you need a way of counting words in both languages, which requires you to say what a word is in each language.
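A small sketch may make the counting problem concrete. It counts the "words" in one English sentence under two definitions: the whitespace convention, and a variant that treats the contracted "n't" as a word of its own (the "wouldn't" versus "would n't" case raised in an earlier comment). Both functions are hypothetical illustrations, not a claim about how any linguist or tokenizer actually defines a word.

```python
import re


def whitespace_words(text):
    """Treat every run of characters between whitespace as one word."""
    return text.split()


def clitic_split_words(text):
    """Like whitespace_words, but split a trailing "n't" off as its own word."""
    words = []
    for token in text.split():
        match = re.fullmatch(r"(.+)(n't)", token)
        if match:
            words.extend(match.groups())  # e.g. "wouldn't" -> "would", "n't"
        else:
            words.append(token)
    return words


sentence = "she wouldn't say what a word is"
print(len(whitespace_words(sentence)))    # 7
print(len(clitic_split_words(sentence)))  # 8
```

The sentence has not changed, but its word count has, so any comparison of word counts across languages depends on which definition was applied to each.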

  16. Jackson Holiday Wheeler said,

    January 26, 2023 @ 5:04 pm

    Interesting choice of "chord" vs "phrase".

    Even though a chord is usually defined in beginner texts as a harmonic relationship, I hear lots of musicians discussing chords both in a harmonic sense (multiple notes played simultaneously) and in a melodic sense (multiple notes played sequentially).

    This can be seen especially when speaking of an arpeggio, which is sometimes defined as a "broken chord" — a chord's notes played in sequence. However, the harmonic relationship of the chord is still maintained, because they are played one after the other (and often other notes being played at the same time would reinforce the underlying harmonies).

    For example, Charles Cornell does this quite often in this video about chords in the Interstellar score: https://www.youtube.com/watch?v=6AI6PG7oj7E

    From my (admittedly amateur) point of view, a chord is essentially a relationship of notes, regardless of whether they are played simultaneously to create harmony or sequentially to create melody.

    This all reminds me of composer Alexander Scriabin's quote, "Melody is unfurled harmony, harmony is furled melody".
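One way to make "a chord is a relationship of notes" concrete is to reduce any group of notes to its set of pitch classes, ignoring octave, order, and timing. The sketch below is one possible formalization, assumed here for illustration; the names `pitch_class` and `as_chord` are hypothetical, and music theorists may well prefer other definitions.

```python
# Semitone offsets of the natural notes within an octave.
NOTE_TO_PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def pitch_class(note):
    """Map a note name such as "C4" or "F#3" to a pitch class 0-11, ignoring octave."""
    letter = note[0]
    accidentals = note[1:].rstrip("0123456789")  # drop the octave number
    value = NOTE_TO_PITCH_CLASS[letter]
    value += accidentals.count("#") - accidentals.count("b")
    return value % 12


def as_chord(notes):
    """Reduce any collection of notes to its set of pitch classes."""
    return frozenset(pitch_class(note) for note in notes)


block_chord = ["C4", "E4", "G4"]     # sounded together
arpeggio = ["C3", "G3", "E4", "C5"]  # sounded one after another
power_chord = ["E2", "B2", "E3"]     # root, fifth, octave

print(as_chord(block_chord) == as_chord(arpeggio))  # True
print(len(as_chord(power_chord)))                   # 2
```

Under this definition the block chord and the arpeggio are the same chord, and the power chord discussed in an earlier comment contains only two distinct notes. A definition that kept octaves or timing would give different answers, which is exactly why counting the chords in a piece depends on the analysis.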

