The Scunthorpe effect rides again


Alex Hern, "Anti-porn filters stop Dominic Cummings trending on Twitter", The Guardian 5/27/2020:

Twitter’s anti-porn filters have blocked Dominic Cummings’ name from its list of trending topics despite Boris Johnson’s chief adviser dominating British political news for almost a week, the Guardian can reveal.

As a result of the filtering, trending topics over the past five days have instead included a variety of misspellings of his name, including #cummnings, #dominiccummigs and #sackcummimgs, as well as his first name on its own, the hashtag #sackdom, and the place names Durham, County Durham and Barnard Castle.

The filter also affects suggested hashtags, meaning users who tried to type #dominiccummings were instead presented with one of the misspelled variations to auto-complete, helping them trend instead.

This sort of accidental filtering has gained a name in computer science: the Scunthorpe problem, so-called because of the Lincolnshire town’s regular issues with such censorship.

This reminds me of the days I spent in the early 1980s, working on the installation of a text-to-speech demo in AT&T's EPCOT exhibit. Getting the computer set up and running was the easy part — the hard part, which we hadn't thought about until the Disney people clued us in, was identifying all the taboo words, and especially all the creative re-spellings of taboo words, and replacing their pronunciations with a cough sound.

For some background on earlier amusing failures of this type, see "C*m sancto spiritu", 8/7/2006, and "Taboo toponymy", 1/24/2009.
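
For readers who want to see the mechanism, here is a minimal sketch of how this kind of false positive can arise. It is not Twitter's filter, whose implementation is not public; the blocklist, the function names, and the test strings are all assumptions chosen for illustration, borrowed from the milder examples in the comments below.

```python
import re

# Hypothetical blocklist for illustration only; real filters use much longer lists.
BLOCKLIST = ["cialis"]


def naive_is_blocked(text: str) -> bool:
    """Flag text if any blocklisted term appears anywhere, even inside a longer word."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)


def word_is_blocked(text: str) -> bool:
    """Flag text only if a blocklisted term appears as a whole word."""
    lowered = text.lower()
    return any(
        re.search(rf"\b{re.escape(term)}\b", lowered) is not None
        for term in BLOCKLIST
    )


def naive_replace(text: str, term: str, substitute: str) -> str:
    """Blindly swap every occurrence of term, including inside other words."""
    return text.replace(term, substitute)


if __name__ == "__main__":
    for sample in ["socialism", "specialist", "buy cialis now"]:
        print(sample, naive_is_blocked(sample), word_is_blocked(sample))
    print(naive_replace("classic", "ass", "butt"))
```

The substring check flags "socialism" and "specialist" along with the spam; the whole-word check lets the first two through. Whole-word matching is only a partial remedy, though: a hashtag that runs several words together has no internal word boundaries, so a filter that wants to catch taboo words inside hashtags is pushed back toward substring matching, with the results described above.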

 



15 Comments

  1. Charles in Toronto said,

    May 28, 2020 @ 10:14 am

    Another clbuttic example.

  2. Philip Taylor said,

    May 28, 2020 @ 11:12 am

    I cannot help but feel that when the unfortunate Mr Cummings finally realises that, post Barnard Castle, his position is no longer tenable, at least one of the UK tabloid media will greet his departure by publishing a suitably customised version of the well-known limerick "There was a young man from Tashkent".

  3. Coby Lubliner said,

    May 28, 2020 @ 11:52 am

    Philip Taylor: Is that the one where the rhymes are "bent" and "went"?

  4. Philip Taylor said,

    May 28, 2020 @ 12:00 pm

    I regret to say that it is, Sir.

  5. Chris Buckey said,

    May 28, 2020 @ 10:08 pm

    What must these sorts of filters make of former Labour bigwig Ed Balls?

  6. Jenny Chu said,

    May 28, 2020 @ 10:21 pm

    This reminds me of the case where the American Family Association's auto-correct widget generated a surprising headline about the Olympic trials of sprinter Tyson Gay …

    https://www.theguardian.com/technology/blog/2008/jun/30/computerautocorrectssurname

    … as well as my own experience 20+ years ago on the BBS "NetNam" in Hanoi, where a friend tried and failed to post something about wine, and was baffled when it was blocked for being offensive (the culprit? Chardonnay).

  7. Lugubert said,

    May 29, 2020 @ 5:09 am

    On a language forum, I once saw that a post of mine had been changed to refer to Rembrandt's The Nigh****ch. There was also a mangled example that I don't quite remember, where I originally had made a reference to a feline.

  8. Ralph Hickok said,

    May 29, 2020 @ 5:45 am

    The National Football League ran into a similar censorship problem with its refusal to sell jerseys with the name "Gay" in 2004, even though the New England Patriots had a starting defensive back named Randall Gay on their Super Bowl championship team.

  9. Rose Eneri said,

    May 29, 2020 @ 8:19 am

    @Ralph – The great Marvin Gaye added the e to his last name, in part to dispel rumors about his sexuality, but also to distance himself from his abusive father.

  10. Cervantes said,

    May 29, 2020 @ 1:22 pm

    It used to be you couldn't write the word socialism in blog comments because it contains cialis. The plague of spam marketing of that particular product has abated, fortunately.

  11. carla said,

    May 29, 2020 @ 2:32 pm

    @Lugubert: My favorite example of ironic automated censorship on a forum was in a post about Jews of Eastern European descent that referred to them as Ashke****.

  12. Roscoe said,

    May 29, 2020 @ 4:49 pm

    "On a language forum, I once saw that a post of mine had been changed to refer to Rembrandt's The Nigh****ch."

    And yet the BBC never censored Neddie Seagoon's "Whatwhatwhatwhatwhat?"

  13. David Morris said,

    May 30, 2020 @ 6:05 pm

    'Specialist' also fell foul of filters for 'cialis'.

    One forum I was part of sent 'snigger' to moderation. We all learned to write 'snicker' instead.

  14. Michèle Sharik Pituley said,

    June 1, 2020 @ 11:50 am

    Microsoft's email platform is called Exchange. The forum for discussion about this product was msexchange (dot com). That URL got flagged in a lot of anti-porn filters unless explicitly whitelisted.

    Re snicker: AmE spells it with the ck instead of gg, and I was raised to never ever say the N-word, so reading BrE books containing the gg version still causes me to cringe a bit internally.

  15. Vulcan With a Mullet said,

    June 3, 2020 @ 5:10 pm

    We have a suburban city near Atlanta called Cumming. I am not aware of it being inadvertently censored, but it probably has been! It's such a familiar name heard growing up on the north side of the Atl metro area, I hardly ever even register the "naughty" connotation, but people from out of town who are unfamiliar with it almost always chuckle at it.
