The Wason selection test
This post follows up on two earlier posts ("'Unable to understand basic sentences?'", 7/9/2010; "More on basic sentence interpretation", 7/12/2010), which discussed some experiments by Dabrowska and Street showing that "a significant proportion of native English speakers are unable to understand some basic sentences". I mentioned several times that these results, though new in detail, echo in many ways the results of research by Peter Wason that is nearly half a century old. The discussion below is based on some of my lecture notes for an experimental course taught back in 1999.
Karl Popper postulated that science is based on hypothetico-deductive reasoning, in which the key step is the search for counter-examples, that is, for evidence contradicting a given hypothesis. Peter Wason wanted to explore the possibility that learning in ordinary life is really science in embryo: the formation of hypotheses and the search for evidence to contradict them. The Wason selection test, developed in the mid-1960s, therefore evaluates subjects' ability to find facts that violate a hypothesis, specifically a conditional hypothesis of the form If P then Q.
In Wason's test, four "facts" are presented in the form of cards. Each card has one piece of information on one side, and another piece of information on the other side. The "conditional hypothesis" to be evaluated has to do with the relationship between the information on the two sides of the cards. The subject is shown four cards with one side up and the other side down; the task is to decide which cards should be turned over to evaluate the hypothesis.
For example: the hypothesis might be "Assume cards have a letter on one side and a number on the other. If a card has D on one side, then it must have 3 on the other side."
Here's a schematic presentation of one trial:
Assume each card has a letter on one side, and a number on the other.
Rule: If a card has a D on one side, then it has a 7 on the other.
You can see four cards (one side of each):
D | F | 7 | 5 |
Card #1 | Card #2 | Card #3 | Card #4 |
In order to check whether the rule is true of these cards, which cards do you need to turn over?
Let's see how this situation fits Wason's schema. This is the "rule", expressed as a simple English sentence:
if | a card has a D on one side | then | it has a 7 on the other |
if | P | then | Q |
Here are the known facts for this trial:
D | F | 7 | 5 |
P | not P | Q | not Q |
Card #1 | Card #2 | Card #3 | Card #4 |
Given the truth table for "if . . . then . . ."
| Q is true | Q is false |
P is true | "If P then Q" is true | "If P then Q" is false |
P is false | "If P then Q" is true | "If P then Q" is true |
a proposition of the form If P then Q is falsified if and only if P is true and Q is false.
Since we are looking for falsifying instances — which are cases of P and not-Q — we need to check anything that is P (to see if it might also be not-Q), and anything that is not-Q (to see if it might also be P). Things that are not-P and things that are Q are irrelevant.
Therefore, the correct answer, in the Wason trial above, is:
"Card #1 and card #4"
— because this corresponds to the instance of P (card #1) and the instance of not-Q (card #4).
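The selection logic can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not anything from Wason's own materials: the names `cards`, `classify`, and `must_turn` are invented here, and the sketch assumes (as the trial states) that a visible letter hides a number and a visible number hides a letter.

```python
# Visible faces from the trial above, keyed by card number (illustrative data).
cards = {1: "D", 2: "F", 3: "7", 4: "5"}

def classify(face: str) -> str:
    """Label a visible face relative to the rule 'if D then 7'."""
    if face.isalpha():
        # A letter tells us whether P ("has a D") is true or false.
        return "P" if face == "D" else "not P"
    # A number tells us whether Q ("has a 7") is true or false.
    return "Q" if face == "7" else "not Q"

def must_turn(face: str) -> bool:
    """Only P cards and not-Q cards can hide a falsifying P-and-not-Q case."""
    return classify(face) in {"P", "not Q"}

to_turn = [number for number, face in cards.items() if must_turn(face)]
print(to_turn)  # [1, 4]
```

The F card and the 7 card are skipped because nothing on their hidden sides could produce a case of P together with not-Q.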
(Of course, in real experiments there are many trials, and the order of the cards is varied).
When I presented this task in class, about a third of the (roughly 100) Penn undergraduates present got the right answer. This proportion may have been inflated a bit by students who read ahead in the course pack. In thousands of replications over the 44 years since Peter Wason introduced this task, it's been shown that most people are really bad at it. For abstract and unfamiliar P and Q — numbers, letters, colors, and so on — less than a quarter of the subjects consistently give the correct answer, even if the subjects are Ivy League undergraduates. The commonest responses are just the P card, or the P card combined with the Q card (and these were also the commonest mistakes in class). Few people see the relevance of the not Q card.
This suggested to Peter Wason that scientific reasoning is not much like reasoning in everyday life: the basic mode of scientific reasoning is completely alien to most people. But one reason that there have been so many replications is that performance depends in complicated ways on the content as well as the form of the task.
For example, consider this alternative version.
These cards represent drinkers at a frat party. Assume each card has a beverage on one side, and the drinker's age on the other.
Rule: If someone drinks beer, then (s)he is 21 or older.
You can see four cards (one side of each):
beer | diet coke | 23 years old | 19 years old |
Card #1 | Card #2 | Card #3 | Card #4 |
In order to check whether the rule is true of these cards, which cards do you need to turn over?
In my class demos, every single student always gave the logically correct answer to trials of this type.
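One way to confirm that the drinking-age trial has the same logical structure as the letter-and-number trial is to check, by brute force, which cards could possibly hide a violation. The sketch below is an illustration under stated assumptions: the list of beverages and the range of ages are hypothetical stand-ins for "whatever might be on the hidden side", and the function names are invented for this example.

```python
# Hypothetical possibilities for a hidden side (assumptions for illustration).
BEVERAGES = ["beer", "diet coke"]
AGES = range(16, 31)

def rule_holds(beverage: str, age: int) -> bool:
    """'If someone drinks beer, then (s)he is 21 or older.'"""
    return beverage != "beer" or age >= 21

def could_falsify(visible) -> bool:
    """True if some possible hidden side would violate the rule."""
    if isinstance(visible, str):
        # Beverage is showing; try every possible hidden age.
        return any(not rule_holds(visible, age) for age in AGES)
    # Age is showing; try every possible hidden beverage.
    return any(not rule_holds(beverage, visible) for beverage in BEVERAGES)

cards = {1: "beer", 2: "diet coke", 3: 23, 4: 19}
to_check = [number for number, face in cards.items() if could_falsify(face)]
print(to_check)  # [1, 4]
```

The answer is again card #1 (the instance of P) and card #4 (the instance of not-Q), so only the content of the problem differs from the abstract version, not its form.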
There are several differences here. The drinking-age problem is what Wason and Shapiro 1971 call "thematic" — it represents a concrete and coherent situation of a type that's familiar from everyday life, with a rule that's connected in a plausible way to the circumstantial details of the trial. But in the 1980s, Leda Cosmides pointed out another important difference — the drinking-age rule is an instance of a social norm. Cosmides argued that "the human cognitive architecture contains a number of [evolved] domain-specific representation and inference systems, such as social contract algorithms and hazard management systems". Cosmides' idea of an evolved psychological module for "cheater detection" was one of the early and effective arguments in the revival of evolutionary psychology.
My impression is that it remains a matter of debate whether the learning in question takes place in the genome, over tens of millennia, or in the brain, over tens of years. And there are some explanations for the task effects that frame the different tasks as different forms of reasoning in purely logical (as opposed to psychological) terms. But everyone agrees that all humans are really bad at abstract, decontextualized versions of the Wason selection task. And this task — exactly like Dabrowska's Q-is and Q-has sentences — involves evaluating the logical relationship between simple sentences ("If a card has D on one side, then it has 7 on the other" vs. "Every basket has a dog in it") and simple pictures of arrays of objects.
As in the Street and Dabrowska paper, Peter Wason and Diana Shapiro showed ("Natural and contrived experience in a reasoning problem", Quarterly Journal of Experimental Psychology 23: 63–71, 1971) that systematic training improves performance. And Mark Chapel and Willis Overton ("Development of Logical Reasoning and the School Performance of African American Adolescents in Relation to Socioeconomic Status, Ethnic Identity, and Self-Esteem", Journal of Black Psychology 28(4): 295–317, 2002) showed that "high SES students outscored low SES students" on the Wason selection task, suggesting that similar differences would be found in Dabrowska's comparison of graduate students to local "shelf-stackers, packers, assemblers, or clerical workers".
And note that you could use Street & Dabrowska's sentence forms in a Wason test: "Every D has a 7 on the other side of it", etc.
A couple of more recent papers on some of the arguments about the evolutionary-psychology explanation for variation in the Wason selection task:
Jerry Fodor, "Why we are so good at catching cheaters", Cognition 75: 29–32, 2000.
Laurence Fiddick, Leda Cosmides, and John Tooby, "No interpretation without representation: the role of domain-specific representations and inferences in the Wason selection task", Cognition 77:1-79, 2000.
Laurence Fiddick and Nicole Erlich, "Giving it all away: altruism and answers to the Wason selection task", Evolution and Human Behavior 31(2):131-140, 2010.
And here's an interesting paper on the neurology of the task:
Vinod Goel et al., "Asymmetrical involvement of frontal lobes in social reasoning", Brain 127(4):783-790, 2004.
D.O. said,
July 15, 2010 @ 10:36 am
What about other operations from the propositional calculus? Are there tests about how people try to prove the validity of and, or, equivalence?
[(myl) The most relevant work that I know of is the effort, starting with H.P. Grice's William James Lectures, to explain the many complex real-world meanings of words like and, or, if as the corresponding logical constants plus "conversational implicatures" arising from the circumstances and conventions of communication. But this is philosophy, not psychology, and I don't know of experimental work exploring the interpretation of other propositional-calculus constructions. There's quite a bit of experimental work on quantifier interpretation, though…]
C Thornett said,
July 15, 2010 @ 11:13 am
Were the subjects told the P/not P information, or only D F 7 5? This isn't immediately evident to me from the way it is set out here.
Ran Ari-Gur said,
July 15, 2010 @ 12:02 pm
@C Thornett: the entire indented deep-red section (from "Assume each card" to "turn over?") was presented to the subjects.
Brett said,
July 15, 2010 @ 12:21 pm
We were given this task in the introductory psychology class at MIT, and more than 90% of the class got the correct answer to the problem as it was posed to us. However, the lecturer was flummoxed, since nearly the entire class gave the seemingly wrong answer that we had to turn over every card except #3. (That is not a very common wrong answer.) It turned out that he had forgotten to specify that the cards had a letter on one side and a number on the other.
blahedo said,
July 15, 2010 @ 1:30 pm
When I've taught the basic logical operators as truth tables, it goes over reasonably well for the symmetric ones (AND, OR, etc), but IF (and ONLYIF) can be a big sticking point. If I then explain it in terms of English-language if-then propositions and ask "how could this possibly be *contradicted*?", the vast majority of students then see why the truth table for IF is set up as it is. (At least in principle. They still sometimes get it wrong if they're not focusing on it; but they don't *object* to it.)
The relationship I see to the Wason test is that even if students are bad at intuitively knowing which cases can form the contradiction, they are *ready* to understand it, and I suspect that this idea—how to contradict—is just the right way to teach these sorts of logical intuitions.
Grep Agni said,
July 15, 2010 @ 2:04 pm
I've known about his research for years, but I've never heard Wason's name pronounced. Is it WAY-son approximately?
[(myl) That's how I've heard it pronounced.]
stephen said,
July 15, 2010 @ 3:01 pm
This intersects with etiquette as well. What do we say when others say something misleading or confusing? How do we correct their errors of vocabulary or grammar?
Why do people say "that's okay" for something which is not okay? How do we find out what they really mean?
Why do people say "that's all right…"
I'm sometimes flexible with regard to "okay"; I sometimes use it to mean I understand, agree, or am disinclined to argue. But the difference is that I'm not confusing other people when I say "okay". But it might not be tactful to say that their mistakes are worse than mine…what am I supposed to say?
Paging Letitia Baldrige…paging Letitia Baldrige…
Thanks.
Rubrick said,
July 15, 2010 @ 3:06 pm
I would be interested to know the results of a Dabrowska/Street-style (or for that matter a Wason) experiment done under severe time constraints. I hypothesize that the highly educated perform better on such tasks because they have time to apply their (slow) trained abstract reasoning skills, and that robbed of such time they might do as poorly as the "shelf-stackers, packers, assemblers, or clerical workers", who are presumably relying on a more instinctual assessment (whatever that means).
A human performing abstract reasoning is like the proverbial dog on its hind legs: he doesn't do it well, but it's impressive that he can do it at all.
Stephen Jones said,
July 15, 2010 @ 4:01 pm
I had to spell it out verbally in slow motion to get it.
One other thing I've noticed is that very few people are capable of reading an instruction manual (I am which is why I have gained the reputation of being incredibly technologically savvy). I think it is the same thing. A power of abstraction is required.
S.Norman said,
July 15, 2010 @ 4:05 pm
I have to admit I gave the typical wrong answer of card #1 (D) in the first test. For some reason when I read the rule "If a card has a D on one side…" I just don't see that every 7 card HAS to have a 'D' on its reverse. To me, the rule doesn't make any statement about the reverse side of a 7 card (other than it may or may not have a 'D'). But with the drinking one it's obvious. Am I an alcoholic?
C Thornett said,
July 15, 2010 @ 4:23 pm
I must have forgotten all of the symbolic logic course I took 20-odd years ago, but unless the reverse of card 4 were a D, if it were M for example, how could it prove or disprove the rule that D = 7? This is why I wondered if some additional piece of information were missing from the post, such as all letters and numbers having the same relationship to each other, for example B = 5.
Faldone said,
July 15, 2010 @ 4:35 pm
@S.Norman That's why turning over the 7 card is not something you need to do to see if the Rule is true. You need to turn over the 5 card to see if there is a D on the other side. If there is it disproves the Rule.
Peter said,
July 15, 2010 @ 9:15 pm
@C Thornett: The rule being tested is not D = 7, it is D -> 7 (i.e., having a D on one side implies that there is a 7 on the other side). If we turn over card #4 and it shows a D we have disproved the rule (because we have a D on one side and not a 7 on the other). If we turn over card #1 and it shows anything other than a 7 we have disproved the rule (by the same logic). If neither of these tests disprove the rule then the rule is valid for the given data.
Mark said,
July 15, 2010 @ 10:06 pm
The evolutionary story seems to be:
If there is an implied social contract, then the wason task is easier [than its less grounded version].
One would then want to know whether the wason task is also easier under other conditions (e.g., ones that create a more coherent discourse structure over which to reason) that do not involve an implied social contract. Such conditions exist, as I understand the literature. Therefore, either the evolutionary account only applies to a subset of cases, or there is some other factor or set of factors that explains the full range of phenomena.
elinar said,
July 16, 2010 @ 7:42 am
The Wason and S&D experiments tie in with the view (held e.g. by cognitive linguists and construction grammarians) that language is largely formulaic, as opposed to being an abstract rule-governed system of arbitrary symbols.
Speakers use formulaic sequences to achieve a variety of socio-interactional goals (commands, requests, rituals, hedges, etc. etc.). The use of such sequences reduces the processing effort in both production and comprehension.
All speakers employ both holistic and analytic processing strategies, but the former tends to be the default strategy, while the ability to process language analytically (e.g. to interpret novel sentences out of context) is a complex learnt skill that individuals master to varying degrees.
This is one explanation for the difference between S&D’s LAA and HAA groups.
chris said,
July 16, 2010 @ 10:09 am
ISTM that what Wason shows is that Aristotelian if is not the most intuitive meaning of if. People are subconsciously applying some other interpretation(s) of if to reach their other answers.
The alcohol example is an exception because the answerer knows enough context to realize that a 21+ year old can drink apple juice if they feel like it, etc. and thus select the correct meaning of if for the situation.
People who have extensive experience with Aristotelian logic and the formal meaning of "if" are more likely to understand that this is a test and they should use the formal meaning.
Mary Bull said,
July 16, 2010 @ 11:59 am
Quoting:Peter: "@C Thornett: The rule being tested is not D = 7, it is D -> 7 (i.e., having a D on one side implies that there is a 7 on the other side). If we turn over card #4 and it shows a D we have disproved the rule (because we have a D on one side and not a 7 on the other). If we turn over card #1 and it shows anything other than a 7 we have disproved the rule (by the same logic). If neither of these tests disprove the rule then the rule is valid for the given data."
Suppose that a naive, poor-at-abstract-reasoning person like me turned over cards #1 and #4 and found the rule valid. So then, I turn over card #3 (which displays a 7 on the side that is shown to me) and find an F on the back — does that not invalidate the rule?
[(myl) No — for the same reason that a 30-year-old drinking orange juice doesn't disprove the rule that if someone is drinking beer, then they are at least 21. The rule goes only in one direction ("if D, then 7"). In the other direction, there's no implication — a card with 7 on the number side can have any letter at all on the letter side. Similarly, the drinking-age rule doesn't force someone 21 or older to drink beer.]
Ben said,
July 17, 2010 @ 12:19 am
I completely agree with chris about the natural language meaning of "if". Frequently the term, naturally spoken, actually does mean that there is either an equivalence or a one-to-one correspondence, and not a simple declaration of implication. The formal logic meaning is a special secondary meaning that is only used in particular cases in natural language.
It's actually kind of a shame that formal logic has co-opted a term whose primary meaning means something different in natural language. I got the answer to the Wason test right, but I daresay that's only because my training in formal logic made that normally secondary meaning more primary in my mind. That, and the fact that when taking a formalized quiz, I know that I'm supposed to apply formal logic, and not what my language tells me I should apply.
Also, I really fail to see how this test in any way relates to evolutionary psychology. Granted, I haven't read any of the cited papers or even done a basic internet search, so I'm saying this from a standpoint of complete ignorance. But still.
Separately speaking, Grep Agni said, "I've known about his research for years, but I've never heard Wason's name pronounced. Is it WAY-son approximately?".
This comment sort of broke my thinking for a few moments. I've only known about Wason's research for a few days (and only from Language Log) but I admit that until I read that comment I was reading and internally pronouncing his name as "Watson". When I read that comment, my first thought was "How could Watson possibly be pronounced WAY-son?". Then it struck me.
blahedo said,
July 17, 2010 @ 2:44 am
@Ben "It's actually kind of a shame that formal logic has co-opted a term whose primary meaning means something different in natural language.":
But it's not clear that it actually does mean something different in natural language. Consider a dialogue such as this:
"If the kids are under ten, give them plastic cups." "What if they're ten or over?" "Then it doesn't matter." "Can I give them plastic cups too?" "Sure, whatever."
This is a perfectly normal use of an "if" construction, and it is *not* consistent with a one-to-one or if-and-only-if interpretation.
Ben said,
July 17, 2010 @ 2:54 am
@blahedo, I didn't say that this meaning was not consistent with natural language. It certainly is. I just believe it isn't the only meaning, and it seems to me that it is also not the "default" meaning of "if" when encountered in unfamiliar contexts.
Obviously, I don't have any real data to support my statement or I would present it. But in any case it strikes me as a distinct possibility that the problem with interpreting this puzzle is an informal meaning of "if".
Ben said,
July 17, 2010 @ 2:58 am
Also, I apologize for the hand-waviness of my answer — I would like to do some research to see if my claim can be substantiated at all or if I'm just making things up, but I'm already awake too late tonight. Maybe tomorrow.
Mary Bull said,
July 17, 2010 @ 12:42 pm
@myl "[ … The rule goes only in one direction ("if D, then 7"). In the other direction, there's no implication — a card with 7 on the number side can have any letter at all on the letter side. Similarly, the drinking-age rule doesn't force someone 21 or older to drink beer.]"
Thanks for stating the rule so clearly. I got myself confused about its directionality as I read through all the interesting comments.
RadarLake » You were wrong said,
January 6, 2011 @ 7:57 am
[…] same question but instead of numbers and letters you use a concept within a social context–like being over or under 18 years old and drinking beer–people get much better at picking the right […]
Marcial Fonseca said,
July 27, 2011 @ 2:56 am
Hi there, a little late, but I have a serious doubt. I accept that if P is true and Q is false, the proposition is false. But Card 1 has P true on one side and either Q true or Q false on the other; therefore, I do not see why we must turn two cards. When I turn Card 1, I have P and Q or not Q.
Ibjaw said,
January 16, 2014 @ 6:55 pm
To a layman like me, it's not very clear from the language of Wason's test that the rule being tested is D->7 and NOT 7->D. The fact that the flip side of 7 (Card#3) is irrelevant is not obvious. I picked Card#1, Card#4 and Card#3 in that order which is obviously wrong. Card#3 was picked to verify 7->D.