Statistically Significant Other


The most recent xkcd:

This is such a good joke, and in retrospect such an obvious one, that it's hard to believe that Randall Munroe was the first one to tell it — but I can't find any precedents. Of course, "significant other" has only been in common use since the 1970s.

In a scientific context, plain "significant" is usually interpreted to mean "statistically significant" — this might be one of the most important lexicographical developments of the 20th century. I believe that the credit, or blame, goes to the public-relations genius of Ronald Fisher (1890-1962), exemplified in his brief popular article, "Mathematics of a Lady Tasting Tea". (For some relevant discussion in earlier LL posts, see "Listening to Prozac, hearing effect sizes", where "statistically significant" is contrasted with "clinically significant"; or "The secret sins of academics", which includes some fun quotes from Deirdre McCloskey on the perils of significance testing without a loss function; or the series of posts beginning with "The 'Happiness Gap' and the rhetoric of statistics" and ending with "The 'Gender Happiness Gap': statistical, practical and rhetorical significance".)

But the female lead in this xkcd strip makes her point, not with a t-test or some other method from inferential statistics, but rather with a boxplot, and the common-sense assertion that "you spend twice as much time with me as with anyone else". Boxplots (R help page here) are a method of "exploratory data analysis" (EDA), invented by another great public-relations genius of 20th-century statistics, John Tukey (1915-2000).

Alas, even a combination of inferential and exploratory statistics with clever wordplay is not enough to guarantee happiness: the strip's mouseover title is "… okay, but because you said that, we're breaking up." Here's hoping that the male character gets over his (feigned?) commitment issues.

Among John Tukey's best-known coinages, outside of the realm of EDA, are the terms "bit" and "software". My personal favorite, though, is his term for the Fourier transform of the log of the squared magnitude of the Fourier transform of a signal: since this is the spectrum of the spectrum, in some sense, he suggested that it should be called the "cepstrum". And since a Fourier transform takes us from time to frequency, a second one obviously takes us from the frequency domain to the quefrency domain. And since a filter enhances or attenuates frequency components, an operation that enhances or attenuates quefrency components must be a lifter.
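
For readers who like to see the recipe spelled out, here is a minimal sketch of that definition in plain Python. The naive `dft` helper, the division by the frame length, and the click-plus-echo test signal are all assumptions made for illustration; a signal whose spectrum contains exact zeros would also need a small floor before taking the log.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform; fine for short examples."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def power_cepstrum(signal):
    """Transform of the log of the squared magnitude of the transform."""
    n = len(signal)
    log_power = [math.log(abs(c) ** 2) for c in dft(signal)]
    # The log power spectrum of a real signal is even, so the second
    # transform is real; dividing by n makes it match the inverse DFT.
    return [c.real / n for c in dft(log_power)]

# Hypothetical test signal: a unit click plus a half-strength echo
# arriving 5 samples later.
click_with_echo = [0.0] * 32
click_with_echo[0] = 1.0
click_with_echo[5] = 0.5

cepstrum = power_cepstrum(click_with_echo)
# Look only at quefrencies 1..16; the upper half mirrors the lower half.
echo_delay = max(range(1, 17), key=lambda q: cepstrum[q])
print(echo_delay)
```

Under these assumptions the largest cepstral value in the lower half should land at quefrency 5, the delay of the echo.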

This is the only example that I can think of where scientific or mathematical terminology was created by spoonerism.

For various interesting reasons (for example, when two signals are convolved, their cepstra are added), the cepstrum is in fairly wide use — the standard acoustic parameters for speech recognition are mel-frequency cepstral coefficients (MFCCs), for example.
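
The additivity claim can be checked numerically. The sketch below assumes circular convolution (the kind the discrete Fourier transform turns into multiplication) and two made-up short signals whose spectra have no zeros; the helper names are illustrative, not taken from any library.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform; fine for short examples."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def power_cepstrum(signal):
    """Transform of the log of the squared magnitude of the transform."""
    n = len(signal)
    log_power = [math.log(abs(c) ** 2) for c in dft(signal)]
    return [c.real / n for c in dft(log_power)]

def circular_convolve(a, b):
    """Circular convolution of two equal-length sequences."""
    n = len(a)
    return [sum(a[m] * b[(t - m) % n] for m in range(n)) for t in range(n)]

# Two hypothetical short signals, zero-padded to the same length.
source = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
channel = [1.0, 0.0, -0.3, 0.0, 0.0, 0.0, 0.0, 0.0]

combined = power_cepstrum(circular_convolve(source, channel))
summed = [s + c for s, c in zip(power_cepstrum(source), power_cepstrum(channel))]
largest_gap = max(abs(x - y) for x, y in zip(combined, summed))
print(largest_gap)  # expected to be at rounding-error level
```

Convolution multiplies the spectra, the log turns that product into a sum, and the second transform is linear, which is why the two lists should agree.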

[Update: in addition to the 3/7/2008 webcomic precedent found by atta in the comments, Victor Steinbok sent email citing a Yelp page from December 2007, an ask.metafilter.com page from January 2006, and a blog post from 2/1/2002, which all deploy the "statistically significant other" phrase in one way or another. (The 2002 blog post: "My statistically significant other comes in with p < .001, so I think I'll keep him.") This is the result of Victor's exhaustive search of the Google hits for the phrase. Considering the reflex association significant → statistically significant, this is a remarkably small number of precedents; I guess it reflects exactly the compartmentalization of mathematics and romance that xkcd so strikingly overcomes.

Victor also notes that

… "insignificant other" has earned an entry in the Urban Dictionary. It's also a part of a title of a movie, two books, at least one song and a YouTube video. There is a cafepress "license plate frame" and an IO abbreviation is listed in TheFreeDictionary.com.

There's no report yet on "clinically significant other"… ]



27 Comments

  1. atta said,

    February 4, 2009 @ 9:16 am

    I couldn't find any precedents either, until I searched for "my statistically significant other", then you do get just a few earlier occurrences. Including, in fact, a webcomic: http://www.bitstrips.com/user/296/read.php?comic_id=1813&subsection=1
    But xkcd does the joke far better (surprise).

  2. greg said,

    February 4, 2009 @ 9:16 am

    Alas, even a combination of inferential and exploratory statistics with clever wordplay is not enough to guarantee happiness: the strip's mouseover title is "… okay, but because you said that, we're breaking up." Here's hoping that he gets over it.

    Odd, I read the mouseover as being his words, not hers: he accepts her reasoning, but because of her using such reasoning, she has proven herself unsuitable for dating.

    I have to admit though, as an undergraduate in physics, while we used Fourier transforms, I never encountered cepstrums, or quefrency domains. How do you even pronounce the latter? Is the que hardened like cue/queue? Or does it retain the pronunciation it has in the middle of frequency?

  3. Mark Liberman said,

    February 4, 2009 @ 9:27 am

    greg: …I read the mouseover as being his words, not hers: he accepts her reasoning, but because of her using such reasoning, she has proven herself unsuitable for dating.

    Me too. I meant that I hope he gets over his commitment issues — potential partners with such a good command of statistics and wordplay are rare.

    More precisely, I took the mouseover comment to be an attribution to the strip's character of a humorous imitation of someone with commitment issues; and my comment was an attempt, obviously somewhat forced, to continue the joke. Anyhow, I changed the wording slightly, in a way that I hope will make misunderstanding less likely.

  4. Mark Liberman said,

    February 4, 2009 @ 9:37 am

    greg: I have to admit though, as an undergraduate in physics, while we used Fourier transforms, I never encountered cepstrums, or quefrency domains. How do you even pronounce the latter?

    In my experience with it, the cepstrum is most useful in signal-analysis circumstances where you'd like convolution of time-functions to turn into addition of single coefficients, more or less. The Fourier transform does some of this, since it turns convolution into element-wise multiplication, which a log transform turns into addition. But the FT of a buzz (a regular sequence of events in the time domain) is still a regular series of overtones in the frequency domain, whereas its cepstrum is dominated by a single component corresponding to the spacing of the overtones, i.e. the fundamental. Since convolution is (a linear model of) the effect of (for example) combining voice-source characteristics (pitch, voice quality etc.) with vocal-tract filtering, or microphone effects, or etc., this kind of decomposition is a good bet for many speech applications.

    But it's used for all kinds of other stuff as well.

    I've always heard "quefrency" pronounced (in pseudo-phonetic spelling) "KWEE-fren-see". For speech-recognition applications, you generally want to ignore the low quefrencies (which correspond to overall spectral tilt and are heavily influenced by transmission-channel differences) and the high quefrencies (which are mostly due to pitch), and focus on the mid quefrencies. Simple statistical models — in effect, one sort of regression or another — will do this for you very nicely, when things combine additively.
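
    The buzz example above can be sketched numerically. The decaying pulse train, the frame length of 64, and the helper names below are assumptions chosen so that the spectrum has no exact zeros; this is an illustration, not a speech front end.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform; fine for short examples."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def power_cepstrum(signal):
    """Transform of the log of the squared magnitude of the transform."""
    n = len(signal)
    log_power = [math.log(abs(c) ** 2) for c in dft(signal)]
    return [c.real / n for c in dft(log_power)]

# Hypothetical "buzz": a decaying pulse every 4 samples in a 64-sample frame.
period = 4
frame = [0.8 ** (t // period) if t % period == 0 else 0.0 for t in range(64)]

magnitudes = [abs(c) for c in dft(frame)]
# Overtones: bins where the magnitude spectrum has a (circular) local peak.
harmonics = [k for k in range(64)
             if magnitudes[k] > magnitudes[k - 1]
             and magnitudes[k] > magnitudes[(k + 1) % 64]]

cepstrum = power_cepstrum(frame)
# Search quefrencies 1..32; the upper half mirrors the lower half.
dominant_quefrency = max(range(1, 33), key=lambda q: cepstrum[q])
print(harmonics, dominant_quefrency)
```

    In this toy case the spectrum shows a regular series of peaks 16 bins apart, while the largest cepstral value should sit at quefrency 4, the pulse period.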

  5. greg said,

    February 4, 2009 @ 10:04 am

    Me too. I meant that I hope he gets over his commitment issues — potential partners with such a good command of statistics and wordplay are rare.

    Aha. I agree entirely. Thanks for the clarification, and explanation of cepstrum/quefrency. I can see where cepstrum would be useful in various applications of physics, but it never really poked its head up as an undergrad. Doing a search for 'physics quefrency', it seems that most of the use is in sonic physics, though there are occasional references in astronomy as well.

  6. Michael said,

    February 4, 2009 @ 10:15 am

    let's not forget mho, the reciprocal of ohm…

  7. Fluxor said,

    February 4, 2009 @ 10:40 am

    I'm surprised that cepstrum retained its hard 'c' pronunciation from spectrum rather than being a soft 'c' as is the norm when a 'ce' appears at the start of a word.

    As a spoonerism, it's very clever since it embeds the underlying mathematics into the new spelling. The original coinage by Tukey et al. was about the analysis of echoes. The underlying mathematics of the Fourier transform involve convolution, which requires the flipping of a signal backwards and then sliding it across another signal. The new words take the original spelling and flip it, resembling what would be done mathematically during a convolution. Quefrency resembles either an echo or the sliding of signals.

  8. Statistically Significant « Panther Red said,

    February 4, 2009 @ 11:15 am

    […] Posted by acilius under Comics | Tags: Language Log |   Thanks to Language Log for this comic […]

  9. Nathan said,

    February 4, 2009 @ 11:46 am

    I was a bit surprised to see "KWEE-fren-see" above. Don't you use IPA around here?

    [(myl) Well, we use it, even if we don't always love it, but we realize that many readers don't know it. So for all you IPA lovers, that's [ˈkwi.fɹən.si]. ]

  10. Statistically Significant « 360 said,

    February 4, 2009 @ 12:16 pm

    […] but the comic also has some linguistic interest.  Language Log (a longtime fan of the strip) has a great post that starts by discussing the phrase "statistically significant", and ends up using the […]

  11. Sili said,

    February 4, 2009 @ 12:42 pm

    Oh, do let's forget the mho, please. It's so terribly unfair to Siemens.

    Is this really a form of Spoonerism? I don't recall ever seeing it used within words – always across sentences: "A toast! To the queer old dean."

    For what it's worth boxplots have become part of the 6th form curriculum in maths (at least at the A-level) here in Denmark in the past ten years. I don't recall seeing them, but they were on the exam last Winter when I was proctoring (and bored enough to rifle through the problems). It has to be said, though, that there may still be some lack of understanding. One student said she had no idea what it was about, she just knew how to put the data into her TI89 and get the result out. I'm just old enough that I've never learnt to use a graphing calculator. Tried one the second year at uni, but returned. Made do with my TI86 until it gave out after about eight years or so. (And never could figure out how to use HP and Casio.)

    This talk of quefrency (which I read /kwefrensi/) makes me want to reread my analysis texts to internalise Fourier transforms.

    [(myl) Here's an alternative that might (?) be simpler — lecture notes on "Impulse Response and Convolution" and "Towards the Discrete Fourier Transform" from a Mathematical Foundations course for cogsci students that I teach. This approach does everything as applied linear algebra, with Matlab (or Octave) examples — whether that makes it easier or harder will depend on what you know and how you prefer to learn and think, I guess. ]

  12. Nigel Greenwood said,

    February 4, 2009 @ 12:58 pm

    John Tukey also devised one of the simplest possible 2-sample tests (ie to test whether the samples are, well, significantly different). Graphically, the test consists in plotting the sample values on a common scale & adding the number of As larger than the largest B to the number of Bs smaller than the smallest A. In other words count the values that "stick out" at either end of the plot. Under a wide range of sample sizes (& assuming the 2 samples are roughly equal in size), a count of 7 is significant at the 5% level, & a count of 10 at the 1% level.

    The test is, to use the jargon, nonparametric: the critical values are based on purely combinatorial considerations. So no calculations of means, medians, quartiles, etc are necessary.
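
    The counting rule described above can be sketched in a few lines. The function name and the sample values are hypothetical, ties are not handled, and the sketch assumes the usual convention that the count is zero when one sample contains both the overall largest and the overall smallest value.

```python
def tukey_quick_count(a, b):
    """End count for the quick two-sample test described above (no tie handling)."""
    if max(a) < max(b):
        a, b = b, a  # make `a` the sample holding the overall maximum
    if min(a) <= min(b):
        return 0  # `a` also holds the overall minimum: nothing "sticks out"
    largest_b = max(b)
    smallest_a = min(a)
    high = sum(1 for x in a if x > largest_b)   # As above every B
    low = sum(1 for x in b if x < smallest_a)   # Bs below every A
    return high + low

# Made-up samples of equal size, for illustration only.
sample_a = [14, 15, 16, 17, 18, 19, 20, 21]
sample_b = [10, 11, 12, 13, 14.5, 15.5, 16.5, 17.5]

count = tukey_quick_count(sample_a, sample_b)
# Critical values as quoted in the comment: 7 for the 5% level, 10 for 1%.
significant_at_5 = count >= 7
significant_at_1 = count >= 10
print(count, significant_at_5, significant_at_1)
```

    With these made-up numbers, four As exceed the largest B and four Bs fall below the smallest A, so the count is 8: past the quoted 5% threshold but short of the 1% one.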

  13. Nathan Myers said,

    February 4, 2009 @ 5:06 pm

    I took his remark "because you said that" not as an objection to her knock-down argument style, but rather just to register an objection to her punning. We see another recent example of punishment for ill wit in http://partialyclips.com/pclipslite.php?id=1591 .

  14. Nathan Myers said,

    February 4, 2009 @ 5:07 pm

    Sorry, that's http://partiallyclips.com/pclipslite.php?id=1591

  15. Rubrick said,

    February 4, 2009 @ 7:16 pm

    Since this post mentions both xkcd and Fourier transforms, I feel justified in linking to this very early (and one of my all time favorite) xkcds.

  16. Alec said,

    February 4, 2009 @ 10:40 pm

    I'm sure there are more spoonerisms in botanical or zoological nomenclature – the only one I can call to mind right now is Meclatis, which is a subgenus of Clematis.

  17. Gwillim Law said,

    February 4, 2009 @ 11:02 pm

    scientific or mathematical terminology was created by spoonerism

    … when R.W. Gosper devised a plane-filling curve with a roughly hexagonal shape and called it a flowsnake.

  18. Florence said,

    February 5, 2009 @ 4:31 am

    "You spend twice as much time with me as with anyone else". I have always found this expression mildly intriguing, and for some reason I get this urge to stop and think about the math involved. I wonder if this is due to the fact that I am a non-native English speaker (as far as I know, there is no French equivalent to this expression) or that I am no good at maths. Or both?

  19. Robert said,

    February 5, 2009 @ 6:25 am

    Quefrency is an odd notion, since the Fourier transform is its own inverse, apart from some trivial factors. The domain of the function goes from time to frequency, and back to time.
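
    A small numeric check of this point, using a naive DFT helper and a made-up signal (both assumptions for illustration): applying the discrete transform twice should return the original samples scaled by the frame length and reversed in time. In the cepstrum, the log-magnitude step sits between the two transforms, so the second one does not simply undo the first.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform; fine for short examples."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

signal = [1.0, 2.0, 3.0, 4.0, 5.0, 0.0, 0.0, 0.0]
n = len(signal)

# Transform twice, then remove the factor of n picked up along the way.
twice = [c.real / n for c in dft(dft(signal))]

# The original samples with the time index negated (modulo the frame length).
reversed_in_time = [signal[(-t) % n] for t in range(n)]

largest_gap = max(abs(x - y) for x, y in zip(twice, reversed_in_time))
print(largest_gap)  # expected to be at rounding-error level
```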

    Wordplay is a pretty common source of technical names though. There's the sine-Gordon equation, a modification of the Klein-Gordon equation, for example, and quantum physicists will often talk about bras and kets, so named because they're the two halves of a bracket.

  20. Nigel Greenwood said,

    February 5, 2009 @ 6:38 am

    @ Florence: t(me) >= 2*max[sub i]{t(person[sub i])}

    Somebody with more patience than I might feel like producing this in a nicer format!

  21. Nigel Greenwood said,

    February 5, 2009 @ 6:43 am

    Slightly OT, I know, but this might be the place to bring up those strangest of pronunciations: /ʃaɪn/ or /sɪnʃ/ for sinh, the hyperbolic sine function.

  22. Florence said,

    February 5, 2009 @ 9:51 am

    @Nigel: thank you! I think I prefer the first version though, much easier to pronounce ;-)

  23. vaardvark said,

    February 5, 2009 @ 11:02 am

    @Nathan Myers
    …register an objection to her punning. We see another recent example of punishment for ill wit…

    You mean pun-nishment?

  24. Laurent C said,

    February 5, 2009 @ 11:31 am

    Florence, "You spend twice as much time with me…" -> "Tu passes deux fois plus de temps avec moi…". This is indeed not phrased exactly the same.

  25. Tim said,

    February 5, 2009 @ 3:26 pm

    I interpreted the mouseover text to mean that the male character thought the pun was terrible, and refused to date someone who would say something like that. (Or, at least, that he was making a joke to that effect.)

  26. statistically insignificant « Andrew’s Blog said,

    February 5, 2009 @ 4:26 pm

    […] Click for an interesting article on the term. […]

  27. Aaron Davies said,

    February 9, 2009 @ 6:08 pm

    ditto wrt to the mouseover and the pun, though of course it has to be a joke on his part–the typical xkcd character's true reaction to such masterful wordplay would be to immediately propose
