Language Log

Chi and squares and contingencies

June 24, 2013 @ 7:03 am · Filed by Mark Liberman under Words words words

Sybil Shaver writes:

Reading Stephen White's novel Line of Fire I encountered the following: (in the middle of a discussion of a death which is either accidental or suicide, p. 51 of the hardcover)

"What do you mean 'if she intends to die'? Isn't dying always intent?"

I shook my head. "It helps to think about suicidal behavior having two pairs of defining variables. Picture a simple chi square – a two-by-two graph. On one axis is the dichotomy of intent – the person intends either to die or to survive. On the other axis is the dichotomy of lethality – the person chooses either a method of high lethality or one of low lethality.

"The two-by-two chi square allows for four possible combinations." I turned over our grocery list and sketched a chi-square with four boxes. "People with low intent sometimes choose methods of high lethality. They can end up dying, almost by accident, because death wasn't what they were seeking. The opposite is people who intended to die, but they chose a low-lethality method. They're the ones who believed that five aspirin and two shots of vodka would kill them. But they end up surviving, again, almost by accident."

"You drew four boxes. What are the other two?"

I squeezed water from a rag to use to wipe the counter. "I described low intent/high lethality, and high intent/low lethality. The other two are low intent/low lethality, and high intent/high lethality. People in both those categories get the outcome they intended. Low intent/low lethality is the classic 'cry for help' suicide attempt – someone who intends to survive but is eager for someone else to know about the gesture. That person doesn't wish to die, and she chooses a method that makes death unlikely. High intent/high lethality is the guy who puts a shotgun barrel in his mouth and pulls the trigger with his toes. He intends to die and chooses a method that is damn near certain to do it."

The first-person narrator uses "a chi-square" to refer to what I have always called "a contingency table". [In fact, the description of the four possibilities is very close to the way I describe the four possible results of doing a classical hypothesis test: two are errors, of different natures, and two give correct results.]

The narrator is Alan Gregory, a clinical psychologist (Ph. D.) and presumably at least a partial alter-ego for the author, who is also a clinical psychologist.

There is certainly some overlap between the usage of contingency tables and the usage of Chi-square tests, but I've never seen or heard of a contingency table being referred to as "a chi square" before. So is this an idiosyncrasy of Stephen White (or possibly, of his teacher whom he thanks in the acknowledgments) or is it a common usage in some circles?

I don't intend to embarrass anyone by this question. (I'm quite sure that 30 years after my last heard lecture in PDEs, I'd badly mangle the terminology today.) But I'm curious. I hope you can find a way to put this out to the LL commenters.

I agree with Sybil — a matrix of (e.g.) intents and choices is called a "contingency table", while a "Chi square(d) distribution" is "the distribution of a sum of the squares of k independent standard normal random variables", used in various sorts of "Chi-square(d) test".
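
As a rough illustration of that definition, here is a minimal simulation sketch. The function names, the seed, and the number of draws are my own choices for the example, not anything from the novel or from Sybil's note. It sums the squares of k standard normal draws and checks that the average comes out near k, which is the theoretical mean of a chi-squared distribution with k degrees of freedom.

```python
import random


def chi_square_draw(k, rng):
    """One draw: the sum of squares of k independent standard normal variables."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def sample_mean(k, n_draws=20000, seed=0):
    """Average of n_draws simulated draws with k degrees of freedom."""
    rng = random.Random(seed)  # fixed seed so that reruns give the same number
    return sum(chi_square_draw(k, rng) for _ in range(n_draws)) / n_draws


if __name__ == "__main__":
    for k in (1, 2, 5):
        # The simulated mean should land close to k.
        print(k, round(sample_mean(k), 2))
```

Nothing in this involves a table with boxes, which is the point: the distribution and the table are different kinds of object.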

It's true that the contingency table described in White's novel is a square matrix (with the same number of rows and columns), and it's also true that the Χ² distribution plays a role in the analysis of contingency tables, e.g. via Cramér's V. So it's plausible that someone might get confused, fictionally or in reality, and misunderstand or misremember a square contingency table as a "Chi square".
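
To show how the two notions connect, here is a short sketch that starts from a 2×2 contingency table of counts and computes Pearson's chi-squared statistic and Cramér's V from it. The counts are invented for illustration (the novel's sketch has only labeled boxes, with no frequencies in them), and the function names are hypothetical.

```python
import math


def pearson_chi_square(table):
    """Sum over cells of (observed - expected)^2 / expected,
    where expected = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return math.sqrt(pearson_chi_square(table) / (n * (k - 1)))


# Invented counts: rows = intent (low, high), columns = lethality (low, high).
counts = [[30, 10],
          [10, 30]]

if __name__ == "__main__":
    print(pearson_chi_square(counts))  # 20.0
    print(cramers_v(counts))           # 0.5
```

The table is the data; the chi-squared statistic is a number computed from it.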

On the other hand, maybe some sub-tribe of statisticians has taken this terminological nexus as the basis for reducing the six syllables of "contingency table" down to the two syllables of "Chi square". Comments?

27 Comments

Pete said,

June 24, 2013 @ 7:53 am

What stands out for me is the stupidity of the question "You drew four boxes. What are the other two?"
Matt Whyndham said,

June 24, 2013 @ 7:55 am

So our contingency table could include the axes "statisticians are lax with their own terminology" and "the author Stephen White doesn't understand his own research".
Rod Johnson said,

June 24, 2013 @ 8:11 am

Not to be argumentative, but that question doesn't seem particularly stupid. It's the kind of thing an intrigued, intelligent layman might ask.
NW said,

June 24, 2013 @ 8:20 am

'You drew four boxes, and have already explained the two more difficult of them clearly. But I can't guess what the other two are.'
Jens Fiederer said,

June 24, 2013 @ 8:23 am

I think "What are the other two?" is meant to ask not the definition of those boxes, which would be obvious even to the layman, but just shorthand for "What about the other two, do you have anything to say about those?"
Sybil said,

June 24, 2013 @ 8:51 am

Agreeing with Jens Fiederer. I don't intend to cast nasturtiums on psychologists, or anyone else. I liked this novel. A lot.
Martin J Ball said,

June 24, 2013 @ 8:52 am

Sorry, Jens. That interpretation is not possible for me (as a NS) for that sentence…
Dick Margulis said,

June 24, 2013 @ 9:06 am

The copyeditor damn well should have queried that. Editors unsure of mathematical terms and too embarrassed to query the author send such questions to copyediting-l many times each week and receive courteous and clear explanations, along with suggestions for how best to query the author. Of course, it's possible this was queried and the author stetted it (happens all the time with certain authors) or that the publisher was too cheap to engage a competent copyeditor in the first place (happens all the time with certain publishers). We don't have enough information to determine fault, but this should never have made it to print in any case.
nicoleandmaggie said,

June 24, 2013 @ 9:14 am

In my field we don't often refer to them as contingency tables. We sometimes refer to them as chi square tables when they have the chi square results included showing whether the actual distribution is different from what we would expect randomly (in fact, I think my Stata textbook does that), but it sounds like the person thinks the "square" refers to the boxes and not the mathematical term that shows significance. And it doesn't sound like there are chi square results in the table drawn up anyway. How odd.
Zubon said,

June 24, 2013 @ 9:21 am

On the stupidity of the follow-up question: the trope you want here is "Viewers are Morons." You do not actually need to believe that most of your readers are morons, but if you shoot for the median reader, half of them will have no idea what you are talking about. Under the opposite trope, "Viewers are Geniuses," the full explanation could be to imagine a table with axes of intention and lethality; explanation ends, the implications are obvious to a certain audience. The paragraph with examples gets you to the median reader. Explaining things that are obvious to your genius readers lets everyone follow along. You bore some readers for a paragraph but avoid completely losing other readers.

Blindsight by Peter Watts does this in-story. Speaking to another character, the protagonist explains a similar two-by-two table of a Prisoner's Dilemma-style payoff, going through all four boxes and the implications of each. In his role as narrator, he then explains to the reader that he did not really go through that full explanation, because the person he is talking to is a genius with an in-skull computer who did not need it all spelled out, but he is unpacking that conversation for our benefit.
Sybil said,

June 24, 2013 @ 9:33 am

On the original question; what's your field, nicoleandmaggie?
prasad said,

June 24, 2013 @ 9:58 am

I agree with Pete and NW. No sensible person could possibly understand the off-diagonal terms but then need patient hand-holding to understand the simpler diagonal ones. The strange thing is that the infodump could have been made much less silly/conspicuous with so little change. The speaker first explains 'cry for help' and shotgun suicide. *Then* the listener asks for an elaboration of the interesting cases, where intent and lethality don't go together.
Gene Callahan said,

June 24, 2013 @ 11:07 am

prasad ftw.
Howard Oakley said,

June 24, 2013 @ 11:59 am

"a two-by-two graph"?
It all seems seriously jumbled to me.
Howard.
Jeff DeMarco said,

June 24, 2013 @ 12:38 pm

In math education it is often referred to as a "generic" square or rectangle. In genetics, one special case is known as the "Punnett" square, and this terminology is sometimes used in other circumstances.
Jon Weinberg said,

June 24, 2013 @ 12:50 pm

The Wikipedia explanation of "contingency table," which ML links to, describes a table displaying frequency distribution of variables, i.e., not the sketch in the novel. In my idiolect, the thing in the novel is a "2×2 matrix".
marie-lucie said,

June 24, 2013 @ 2:43 pm

The explanation of the extra boxes will be obvious to a lot of people, especially those used to interpreting such diagrams, who can "see" the solution at a glance, but many people, including some considered brilliant, have a poor sense of the logic of visually presented information and prefer long verbal explanations which would drive visually oriented people crazy.
leoboiko said,

June 24, 2013 @ 3:31 pm

As someone with a background in computer science, I think of those as "two bits", or

0 0
0 1
1 0
1 1
Rubrick said,

June 24, 2013 @ 4:16 pm

The phenomenon of "I came across this term and didn't really understand it, but it sounded cool and so I decided to use it anyway" is painfully common, but it is a little odd (or perhaps merely sad) that someone with a Psychology Ph.D would make this particular mistake.
Sili said,

June 24, 2013 @ 6:33 pm

Interesting. I try teaching chi-square tests to high schoolers, and I'd never considered the "square" could be interpreted that way (the association is luckily less salient in Danish). On the other hand, far from all our contingency tables are square – is that different in psychology? Always only two options?
Sybil said,

June 24, 2013 @ 9:55 pm

Marie-lucie said what I would add: there is no such thing as a concept that is so obvious that it doesn't need to be said explicitly. In fact, that the other speaker realized that there were two squares of the table which had not been discussed: even that realization does not go without saying, in my experience.

And I'd always said "chi-squared", as MYL indicates, but the "d" is a little hard to hear.

On the other hand,

1) These tables are not always, or even mostly, square. But some people seem to use "square" to mean "rectangular", so who knows.
2) Why not just call them "Tables"? That seemed to be the most common name given them when I tried to Google "a chi square".

Another thing: the first time I read this passage, I initially read "chi square" as Qi Square, and thought it had something to do with the TCM concept of Qi. That resulted in a moment or two of disorientation (!) later in the sentence.
Sybil said,

June 24, 2013 @ 10:02 pm

Ah, #1 as Sili said. Yes.
Joseph Bottum said,

June 24, 2013 @ 10:55 pm

The passage brings to mind a basic rule for editors: If the metaphor is more complicated than the metaphorand, the explanation more confusing than the thing to be explained, the writer needs to rewrite.

But maybe the over-explanation and even the odd use of "chi square" are intended by the author to illustrate the character of the "Alan Gregory" narrator.
AntC said,

June 25, 2013 @ 3:40 am

Coming from a Management Science background, I'd call that a magic quadrant http://en.wikipedia.org/wiki/Magic_Quadrant
But I guess "magic" would not be an appropriate term when discussing suicide behaviour.
Michael said,

June 25, 2013 @ 5:48 am

In the behavioral sciences everyone would immediately understand this. A contingency table (of any size and dimension, i.e. not only nxm, but also nxmxp…) is analyzable by the chi-square non-parametric statistic. Same statistic is used for other circumstances, as well. Obviously unclear to lay readers.
iching said,

June 25, 2013 @ 5:53 am

As a statistician I take the point made by Sybil and Mark that the normal term is 2×2 contingency table. Chi-square is sometimes used as shorthand for the test of statistical significance for association (or non-independence) of the 2 variables represented by the rows and columns. (BTW @leoboiko the variables are often coded 0/1 as you suggest). The chi-square test statistic is usually Pearson's but could be the likelihood ratio statistic which has a different formula but is supposed to conform slightly better to the chi-square distribution on which both tests are based. Cramér's V on the other hand is a measure of the strength of association, not statistical significance. The magnitude of the association can be high, but not significant because of small sample size.
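
A minimal sketch of that last point, using invented counts and hypothetical function names: two 2×2 tables with the same proportions have the same Cramér's V, but only the larger one is significant at the conventional 0.05 level. The p-value helper relies on the fact that a chi-square variable with 1 degree of freedom is a squared standard normal, so it applies to the 2×2 case only. Expected counts as small as those in the first table strain the chi-square approximation, so treat the numbers as illustrative.

```python
import math


def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return [[r * c / n for c in cols] for r in rows]


def cell_pairs(table):
    """Flat list of (observed, expected) pairs, one per cell."""
    exp = expected_counts(table)
    return [(o, e) for orow, erow in zip(table, exp) for o, e in zip(orow, erow)]


def pearson(table):
    """Pearson's statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in cell_pairs(table))


def likelihood_ratio(table):
    """Likelihood ratio statistic G = 2 * sum(O * ln(O / E)); empty cells add 0."""
    return 2 * sum(o * math.log(o / e) for o, e in cell_pairs(table) if o > 0)


def p_value_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


def cramers_v_2x2(table):
    """Cramér's V for a 2x2 table: sqrt(chi2 / n)."""
    n = sum(sum(row) for row in table)
    return math.sqrt(pearson(table) / n)


small = [[3, 1], [1, 3]]      # invented counts, n = 8
large = [[30, 10], [10, 30]]  # same proportions, n = 80

if __name__ == "__main__":
    for name, table in (("small", small), ("large", large)):
        print(name,
              "Pearson:", round(pearson(table), 3),
              "G:", round(likelihood_ratio(table), 3),
              "V:", round(cramers_v_2x2(table), 3),
              "p:", p_value_1df(pearson(table)))
```

Both tables give V = 0.5; the small one has a Pearson statistic of 2.0 (p about 0.16), the large one 20.0 (p far below 0.001).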

However I don't find the novel's use of "2×2 chi-square" too odd. I imagine anyone who has done an introductory statistics course and not gone any further would immediately know what is meant. It's a novel not a textbook.
JS said,

July 12, 2013 @ 8:35 am

Yet another abuse of null hypothesis significance testing in the psychological literature.
