"Protester dressed as Boris Johnson scales Big Ben"


Sometimes it's hard for us humans to see the intended meaning of an ambiguous phrase, like "Hospitals named after sandwiches kill five". But in other cases, the intended structure comes easily to us, and we have a hard time seeing the alternative, as in the case of "Extinction rebellion protester dressed as Boris Johnson scales Big Ben".

These two examples have essentially the same structure. There's a word that might be construed as a preposition linking a verb to a nominal argument ("named after sandwiches", "dressed as Boris Johnson"), or alternatively as a complementizer introducing a subordinate clause ("after sandwiches kill five", "as Boris Johnson scales Big Ben"). In the first example, the complementizer reading is the one the author intended, while in the second example, it's the preposition. But in both cases, most of us go for the preposition, presumably because "named after X" and "dressed as Y" are common constructions.

Interestingly, some commonly used parsers have more or less the opposite prejudice. Thus the Berkeley parser:

In the second example, the Berkeley parser analyzes scales as a plural noun, but still places it in the structure appropriate for a verb:

If we substitute climbed for scales, the part of speech problem is fixed, but the structure is still the wrong one:

The Stanford parser acts in a similarly inhuman way in the first example:

(ROOT
  (S
    (NP (NNS Hospitals))
    (VP (VBD named)
      (SBAR (IN after)
        (S
          (NP (NNS sandwiches))
          (VP (VBP kill)
            (NP (CD five))))))
    (. .)))

In the second example, it makes a slightly different choice, deciding that scales is a proper noun, and that "Boris Johnson scales Big Ben" is a stacked-up noun phrase like "North Dallas tornados property damage".

(ROOT
  (S
    (NP (NNP Extinction) (NN rebellion) (NN protester))
    (VP (VBD dressed)
      (PP (IN as)
        (NP (NNP Boris) (NNP Johnson) (NNP scales) (NNP Big) (NNP Ben))))
    (. .)))

If we prevent this error by substituting climbed for scales, we're back with the complementizer reading:

(ROOT
  (S
    (NP (NNP Extinction) (NN rebellion) (NN protester))
    (VP (VBD dressed)
      (SBAR (IN as)
        (S
          (NP (NNP Boris) (NNP Johnson))
          (VP (VBD climbed)
            (NP (NNP Big) (NNP Ben))))))
    (. .)))
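
The difference between the two readings comes down to the label on the node that dominates the (IN as) leaf: PP for the preposition reading, SBAR for the complementizer reading. Here is a minimal sketch that reads a bracketed tree like the ones above and reports which reading it encodes. The function names `parse_tree` and `reading_of` are my own invention for illustration, not part of the Stanford or Berkeley parser, and the code only inspects parser output that has already been produced.

```python
import re

def parse_tree(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())   # nested constituent
            else:
                children.append(tokens[pos])  # a word
                pos += 1
        pos += 1  # consume the closing parenthesis
        return (label, children)

    return read()

def reading_of(tree, word):
    """Classify the first (IN word) leaf found by the label of its parent.

    Returns 'preposition' if the parent is PP, 'complementizer' if it is
    SBAR, and None if the leaf is absent or sits under some other label.
    """
    label, children = tree
    for child in children:
        if isinstance(child, tuple):
            if child == ("IN", [word]):
                return {"PP": "preposition", "SBAR": "complementizer"}.get(label)
            found = reading_of(child, word)
            if found:
                return found
    return None

# The two Stanford parses of the protester headline shown above.
PP_TREE = """
(ROOT
  (S
    (NP (NNP Extinction) (NN rebellion) (NN protester))
    (VP (VBD dressed)
      (PP (IN as)
        (NP (NNP Boris) (NNP Johnson) (NNP scales) (NNP Big) (NNP Ben))))
    (. .)))
"""

SBAR_TREE = """
(ROOT
  (S
    (NP (NNP Extinction) (NN rebellion) (NN protester))
    (VP (VBD dressed)
      (SBAR (IN as)
        (S
          (NP (NNP Boris) (NNP Johnson))
          (VP (VBD climbed)
            (NP (NNP Big) (NNP Ben))))))
    (. .)))
"""

print(reading_of(parse_tree(PP_TREE), "as"))
print(reading_of(parse_tree(SBAR_TREE), "as"))
```

Note that the PP label in the first tree does not make that parse the intended one: "dressed" is still analyzed as the main verb, and the whole string "Boris Johnson scales Big Ben" is treated as the object of "as".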

Our intuition — mine, anyhow — is that our analysis is guided by a combination of pattern frequency and common sense. Thus we have trouble with the first example, because "Hospitals named after sandwiches" fits our "X named after Y" pattern well enough to lock in that reading — but the result makes no sense. And we do the right thing with the second example, because "protester dressed as Boris Johnson" fits the "X dressed as Y" pattern, and this time the result works out.

The parsers apparently don't have, or don't use, those patterns. Which is ironic, since such parsers approximate the concept "makes sense" in terms of lexical co-occurrence rather than conceptual coherence. More modern NLP systems have more elaborately trained expectations about lexical co-occurrences. But conceptual coherence is still a problem, as underlined by the Winograd Schema Challenge results.

 



18 Comments

  1. Cervantes said,

    October 22, 2019 @ 10:22 am

    Well, one thing here is that if you misinterpret the sentence, the tenses don't match up. "Extinction rebellion protester dressed as Boris Johnson scaled Big Ben" would make sense as saying that the person put on clothes while Boris Johnson was scaling Big Ben. (Although that is actually wrong since Big Ben is the clock, not the tower. But I digress.) But "A got dressed as B scales" doesn't work.

  2. Greg said,

    October 22, 2019 @ 11:11 am

    @Cervantes: that's not really true: the tenses match if you read it as "Extinction protester [is] dressed as Boris Johnson scales Big Ben".

  3. Cervantes said,

    October 22, 2019 @ 11:16 am

    Well yeah, but the elision wouldn't likely come to mind unless, as you did, someone points it out. The question is why we easily read the sentence correctly.

  4. Sniffnoy said,

    October 22, 2019 @ 11:50 am

    Well I certainly hope he wasn't trying to make a point on Wikipedia this way…

  5. Amanda Adams said,

    October 22, 2019 @ 12:50 pm

    I wanted Boris to be using a fisherman's knife. It wouldn't alter the pattern, but the mental image…

  6. Vincent G. said,

    October 22, 2019 @ 1:05 pm

    @Cervantes

    I don't think the tense "mismatch" is actually an issue—the present tense in the second part would simply mean that the event was happening at the time and still ongoing, i.e. Boris Johnson was scaling Big Ben then and is still scaling it now. Compare hypothetical examples like "Students injured as protests spread", "Murderer named as Oriental Express reaches destination", etc.

  7. Andy Stow said,

    October 22, 2019 @ 1:34 pm

    @Cervantes: Big Ben is the bell behind the clock.

  8. cervantes said,

    October 22, 2019 @ 2:19 pm

    Well we don't have an imperfect tense in English, but it seems to me that Boris scaling the tower would be a fairly time limited event. That doesn't seem like a natural way of speaking in this instance. Again, it's not a question of whether it's possible to justify the construction, it's a question of what people are likely to understand at first glance.

  9. ktschwarz said,

    October 22, 2019 @ 3:26 pm

    @cervantes, how is a computer going to learn that Boris scaling the tower is a time-limited event? That's the question that Prof. Liberman wants to highlight, I think.

    I presume the Berkeley and Stanford parsers actually find lots of alternatives and these demos just show the top-ranked one?

    CMU's Link Parser found 12 parses for the protester sentence. The top one is similar to Berkeley's and Stanford's, with "dressed" as the main verb:

    +—–Cs—-+ +—-Os—+
    +–Ds–+—-Ss—-+–MVs–+ +—G–+—Ss—+ +-G-+
    | | | | | | | | |
    a protester.n dressed.v as.p Boris Johnson scales.v Big Ben

    Constituent tree:

    (S (NP A protester)
    (VP dressed)
    (SBAR as
    (S (NP Boris Johnson)
    (VP scales
    (NP Big Ben)))))

    (This one isn't meant for headlines; it wouldn't accept the sentence starting with "protester" until I added a determiner, and it interprets "dressed" as active and past tense, but I think it still illustrates the ambiguity.) The second-ranked parse is the intended one, with "scales" as the main verb. One of the other parses implies that the protester dressed Big Ben as Boris Johnson's scales—perhaps Boris Johnson is a fish.

  10. ktschwarz said,

    October 22, 2019 @ 3:31 pm

    Sorry, looks like WordPress stripped out the <pre> tags around the LinkParser output in my comment. Any suggestions on how to get it to display?

  11. Andrew (not the same one) said,

    October 22, 2019 @ 3:52 pm

    It seems to me that the elision of 'is' in 'is dressed' is perfectly normal in headlinese – where the word 'is' tends to be avoided if at all possible. So that reading is not at all unnatural, though less salient in this case, because the other reading makes more sense.

  12. Andrew Usher said,

    October 22, 2019 @ 5:23 pm

    Well, yes, but 'be dressed' is less common (esp. in the news) than 'be named'; the fact that it would be elided if used doesn't change that. Let me first note that the sentences aren't exactly parallel, because 'dressed' has an active reading – noted by the first commenter – while 'named' only has passive ones, because active 'named' requires an object (or two).

    This question doesn't seem hard to me. In the protester headline – besides figuring out just what an 'extinction rebellion protester' might be protesting – we have no trouble with the syntax, because the most obvious reading also makes the most sense. The 'hospitals named' sentence is far from that; even after you've rejected (or not considered) 'named after sandwiches', it's still not obvious that 'named' doesn't mean 'given a name' since the context isn't apparent – on the other hand, 'hospitals named in sandwich poisoning cases' would not present that difficulty (and would have been preferable).

    k_over_hbarc at yahoo dot com

  13. Jenny Chu said,

    October 22, 2019 @ 7:52 pm

    There is a third interpretation …

    According to conspiracy theory king David Icke, the Queen and other aristocracy are actually shape-shifting reptiles: https://www.express.co.uk/news/weird/768800/David-Icke-queen-shape-shifting-lizard

    Is Boris Johnson also a member of this dread clan? "Boris Johnson Scales" — is the headline a veiled reference to Boris Johnson's scaly exterior?!

    !!!

    It's up to Language Log to find out the truth …

  14. PFD said,

    October 23, 2019 @ 8:24 am

    @Vincent
    "Compare hypothetical examples like "Students injured as protests spread", "Murderer named as Oriental Express reaches destination", etc."

    Or the more precisely parallel construction (also made up btw) "Killer identified as Marlon Prchawsky dies in hospital." I think it's pretty much impossible to identify the intended meaning to that without additional info.

  15. milu said,

    October 27, 2019 @ 9:32 am

    Wait, but isn't it possible that our brains parse both sentences similarly (spontaneously favouring prepositions over complementizers) and get the Boris Johnson one right because it's the only one where the assumed preposition is actually intended as a preposition?

    But in any case, I suspect Andrew Usher is right, the confusion comes from a failure to adequately evaluate the amount of context available to readers. "As" in "protester dressed as Boris Johnson" is immediately recognized as a preposition because we know that the focus of the news item is very unlikely to be the fact that a particular protester did, in fact, dress.

  16. Andrew Usher said,

    October 28, 2019 @ 10:35 pm

    A tangential point: the grammar I have learned would describe 'as' as a preposition either way, whether it governs 'Boris Johnson' alone or the whole phrase 'Boris Johnson scales Big Ben'. But I may have learned/remembered wrong.

    The same for the hospitals sentence, though there one could argue that 'named after' is a phrasal verb and 'after' not a real preposition there. Cf. the synonymous 'named for'; 'after' and 'for' hardly overlap in their prepositional meanings.

    But the meaning is clear, whatever terminology you use.

  17. ktschwarz said,

    November 4, 2019 @ 10:39 am

    I'll try using the <code> tag on the LinkParser constituent tree. Does this work?

    (S (NP A protester)
    (VP dressed)
    (SBAR as
    (S (NP Boris Johnson)
    (VP scales
    (NP Big Ben)))))

  18. ktschwarz said,

    November 4, 2019 @ 10:51 am

    One more try:

    (S (NP A protester)
       (VP dressed)
       (SBAR as
             (S (NP Boris Johnson)
                (VP scales
                    (NP Big Ben)))))
