Zipf genius
I have always been deeply intrigued by George Kingsley Zipf (1902-1950), but Mark's recent "Dynamic Philology" (5/24/25) rekindled my interest.
Put simply, he is the eponym of Zipf's law, which states that while only a few words are used very often, many or most are used rarely:
Pn ∼ 1/n^a
where Pn is the frequency of the word ranked nth and the exponent a is almost 1. This means that the second item occurs approximately 1/2 as often as the first, the third item 1/3 as often as the first, and so on. Zipf's discovery of this law in 1935 was one of the first academic studies of word frequency.
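For readers who like to see the arithmetic, here is a minimal Python sketch of the "1/2 as often, 1/3 as often" pattern, taking the idealized form Pn = P1 / n^a with a = 1. The function names, the figure of 1200 occurrences for the top-ranked word, and the toy sentence are all made up for illustration; nothing here comes from Zipf's own data.

```python
from collections import Counter


def zipf_frequencies(top_frequency, n_ranks, a=1.0):
    """Idealized Zipf frequencies: P_n = P_1 / n**a for ranks 1..n_ranks."""
    return [top_frequency / rank ** a for rank in range(1, n_ranks + 1)]


def rank_frequency(text):
    """Return (rank, word, count) triples, most frequent word first."""
    counts = Counter(text.lower().split())
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


# Hypothetical top word seen 1200 times; the next ranks follow 1/2, 1/3, 1/4.
print(zipf_frequencies(top_frequency=1200, n_ranks=4))
# [1200.0, 600.0, 400.0, 300.0]

# Toy text, far too small to show the law; it only demonstrates the ranking step.
print(rank_frequency("the cat the dog the cat"))
# [(1, 'the', 3), (2, 'cat', 2), (3, 'dog', 1)]
```

To try this on a real text, pass the full text to rank_frequency and compare the counts with zipf_frequencies seeded by the count of the top word. Real corpora only approximate the ideal curve, and the splitting on whitespace here is a crude stand-in for proper tokenization.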
Although he originally intended it as a model for linguistics, Zipf later generalized his law to other disciplines. In particular, he observed that the rank vs. frequency distribution of individual incomes in a unified nation approximates this law, and in his 1941 book, "National Unity and Disunity", he theorized that breaks in this "normal curve of income distribution" portend social pressure for change or revolution.
Because of its applicability to other types of data than purely linguistic ones, I sometimes feel that Zipf unlocked a secret key to the universe, which is truly humbling. What is even more astonishing is that Zipf did not like mathematics, whereas mathematics-physics is usually thought of as the ultimate approach to Unified Field Theory. It would seem that Zipf discovered a strictly empirically based approach to cosmology.
BTW, I have habitually pronounced his striking surname as it is spelled, accounting for all four letters, but Wikipedia gives it as /ˈzɪf/ ZIFF; German pronunciation: [tsɪpf]. Zipf is "from late Middle High German zipf zipfel ‘point tip corner’ hence a topographic name for someone who occupied a narrow corner of land as for example between converging channels of a stream; or a nickname for someone who wore a pointed garment like a long hood."
Source: Dictionary of American Family Names 2nd edition, 2022, as cited here.
In Bavarian and Austrian German, Zipf m (strong, genitive Zipfes or Zipfs, plural Zipfe): "tip, peak, corner".
Reminds me of the Cantonese geonym zeoi2 咀 ("spit [narrow neck of land projecting into a body of water]", etc.), to be distinguished from the homonym-homophone zeoi2 咀 ("chew, masticate").
Selected readings
- "Zipf's demon" (10/25/24)
- "Dynamic Philology" (5/24/25)
- "Why estimating vocabulary size by counting words is (nearly) impossible" (12/8/15)
- "Dictionary-sampling estimates of vocabulary knowledge: No Zipf problems" (12/9/15)
- "One law to rule them all?" (6/2/19)
- "Lexical display rates in novels" (4/18/20)
- "The indecipherability of the Voynich manuscript" (12/11/19)
- "A different perspective on family name distributions" (1/19/20)
- "Macroscopic bosons among us" (7/5/12)
- "Zipf and the general theory of wrinkling" (11/15/03)
- "Moby Zipf" (6/1/19): rich assemblage of bibliographical references to studies of Zipf's work as applied to a wide variety of fields, with the first comment by VHM:
George Kingsley Zipf seems to have been an incredibly brilliant person. In addition to being Chairman of the German Department at Harvard, he was University Lecturer, a rare honor which meant that he could teach any subject he wanted. He died on September 25, 1950 at the age of 48 after a three-month illness. Yet, within that short life, not only did he discover Zipf's law, which has such important implications for linguistics, he also applied similar models to human behavior (the principle of least effort), to the frequency distribution of individual incomes and its implications for national unity and disunity, and to other vital fields. It is said that his statistical insights can explain properties of the internet, even though he arrived at them before it was invented.
I'm especially intrigued to learn that he worked with Chinese and wonder what he focused on in that regard.
All in all, a fascinating person. If there's not a biography of Zipf, he's ripe for one.
Scott P. said,
June 18, 2025 @ 8:14 am
I wonder if he ever had a collaboration with Hermann Zapf.
Matt McIrvin said,
June 18, 2025 @ 10:19 am
Cosma Shalizi has persuasively argued in recent years that a lot of the supposed power laws out there aren't; they're some similar-looking kind of distribution like a log-normal one. The details matter out on the far tail.
Gregory Kusnick said,
June 18, 2025 @ 11:05 am
I'm struggling to make sense of this claim. Zipf was in his teens when Hubble published his empirically-based cosmological law. And of course the notion of mathematical laws derived from empirical observation goes back much further, to Galileo, Kepler, Newton, and others. So I'm not clear on what original insight Zipf is being credited with here, beyond his eponymous power law.
david said,
June 18, 2025 @ 1:42 pm
1/f noise has a long history among electrical engineers starting with the invention of the vacuum tube. https://arxiv.org/pdf/physics/0204033
That arXiv paper is a little technical, but it cites a 1979 Martin Gardner Scientific American article [9] popularizing Clarke and Voss's work [10] showing that speech and music tend to have 1/f spectra.
unekdoud said,
June 18, 2025 @ 1:49 pm
Is it weird that I visualize "corner" as a right angle? From the etymology it's basically a horn, right?