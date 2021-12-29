"…attacking members of the public found dead"
A striking example of the post-modifier attachment ambiguity: "Police officer jailed for attacking members of the public found dead", The Guardian 12/29/2021.
Bob Ladd, who sent in the link, spent "quite a few hundred milliseconds" puzzling about why the police officer had attacked dead people.
The Berkeley parser gets the attachment even more wrong, construing the headline to refer to "members of [the public found dead]":
I've labelled the three NPs that could be post-modified by "found dead" — and the correct answer would have been #3, the police officer.
The Stanford dependency parser decides on #2 as the modified NP:
And spaCy moves the attachment further to the left, hanging "found dead" on the verb "jailed" as an adverbial clause rather than on any of the three NPs, so it still gets the parse wrong, construing "jailed" rather than "found" as the main verb:
Dep tree             Token     Dep type Lemma   POS
──────────────────── ───────── ──────── ─────── ──────────
                 ┌─► Police    compound police  NOUN
              ┌─►└── officer   nsubj    officer NOUN
┌┬┬───────────┴───── jailed    ROOT     jail    VERB
││└─►┌────────────── for       prep     for     ADP
││   └─►┌─────────── attacking pcomp    attack  VERB
││      └─►┌──────── members   dobj     member  NOUN
││         └─►┌───── of        prep     of      ADP
││            │  ┌─► the       det      the     DET
││            └─►└── public    pobj     public  NOUN
│└──────────────►┌── found     advcl    find    VERB
│                └─► dead      oprd     dead    ADJ
└──────────────────► .         punct    .       PUNCT
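For readers who want to check the attachment without tracing the arrows, here is a minimal sketch that encodes the spaCy table above as plain data and looks up heads. It does not call spaCy: the rows are transcribed by hand from the table, and the names `PARSE`, `root`, `head_of` and `path_to_root` are made up for this illustration. Marking the root by having it point to itself follows spaCy's own convention.

```python
# Hand-transcribed from the spaCy table above (an assumption of this sketch,
# not fresh parser output): (token, dependency label, index of head token).
# The ROOT token points to itself.
PARSE = [
    ("Police", "compound", 1),
    ("officer", "nsubj", 2),
    ("jailed", "ROOT", 2),
    ("for", "prep", 2),
    ("attacking", "pcomp", 3),
    ("members", "dobj", 4),
    ("of", "prep", 5),
    ("the", "det", 8),
    ("public", "pobj", 6),
    ("found", "advcl", 2),
    ("dead", "oprd", 9),
    (".", "punct", 2),
]


def root(parse):
    """Return the token that is its own head, i.e. the main verb of the parse."""
    return next(tok for i, (tok, _dep, head) in enumerate(parse) if head == i)


def head_of(parse, word):
    """Return (head token, dependency label) for the first token equal to `word`."""
    for tok, dep, head in parse:
        if tok == word:
            return parse[head][0], dep
    raise ValueError(f"{word!r} is not in the parse")


def path_to_root(parse, word):
    """Follow head links from `word` up to the root, returning the tokens visited."""
    idx = next(i for i, row in enumerate(parse) if row[0] == word)
    path = [parse[idx][0]]
    while parse[idx][2] != idx:
        idx = parse[idx][2]
        path.append(parse[idx][0])
    return path


if __name__ == "__main__":
    print(root(PARSE))                    # main verb according to this parse
    print(head_of(PARSE, "found"))        # what "found dead" hangs on
    print(path_to_root(PARSE, "public"))  # chain from "public" up to the root
```

In the intended reading, "found" would be the root, with "officer" as its subject and "jailed" as a modifier of "officer". In the table it is the other way around: "jailed" is the root and "found" hangs off it.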
The obligatory screenshot:
Lydia W. said,
December 29, 2021 @ 5:42 pm
I have to say that this one didn't really confuse me upon first reading. Perhaps it's because of a tendency to perform a "greedy match", taking as many words as possible to be part of one noun phrase?
Either that, or it's due to the two types of elision here; we can complete the sentence as "(A) police officer (has been) jailed for attacking members of the public (who were) found dead" or "(A) police officer (who was) jailed for attacking members of the public (has been) found dead", and I suspect that the relative clause omission is more common, though this might depend on one's level of exposure to headlinese in particular.
David Marjanović said,
December 29, 2021 @ 5:50 pm
Interestingly, the subtitle shows the headline is shortened too much – he wasn't in jail when he was found dead, he "is believed to have just left prison", so he had been jailed earlier.