Clueless Microsoft language processing
A rather poetic and imaginative abstract I received in my email this morning (it's about a talk on computational aids for composers) contains the following sentence:
We will metaphorically drop in on Wolfgang composing at home in the morning, at an orchestra rehearsal in the afternoon, and find him unwinding in the evening playing a spot of the new game Piano Hero which is (in my fictional narrative) all the rage in the Viennese coffee shops.
There's nothing wrong with the sentence. What makes me bring it to your notice is the extraordinary modification that my Microsoft mail system performed on it. I wonder if you can see the part of the message that it felt it should mess with, in a vain and unwanted effort at helping me do my job more efficiently?
Here is the message as it actually appeared on my screen:
We will metaphorically drop in on Wolfgang composing at home in the morning, at an orchestra rehearsal in the afternoon, and find him unwinding in the evening playing a spot of the new game Piano Hero which is (in my fictional narrative) all the rage in the Viennese coffee shops.
The Microsoft Office 365 system and its Outlook email manager (which my university has ordained that we shall all use) decided that a certain word sequence in the message might denote an event that I would want to enter on my calendar. It therefore linked the text to a popup calendar dialog box and marked it up in underlined blue to let me know it had done this (without my permission or consent).
Office 365 is the crappiest, slowest, most annoying email system I have ever encountered, and that is really saying something. I could write reams for you about its stupidities and detrimental effects on my productivity. Its attempts at showing intelligence are perhaps its worst feature. I have no idea what kind of natural language processing botch could possibly be implicated in generating the hypothesis that "morning, at an orchestra rehearsal in the afternoon," might denote an event (would the event be in the morning, or the afternoon?), but clearly there is no syntax involved in hypothesizing it, and even less semantics. It looks as if the program simply spots words denoting time points or intervals, like "morning" or "afternoon", and makes a guess at the boundaries of the containing constituent. (It often assumes the whole sentence or paragraph is relevant, but here it just took a nonconstituent word sequence.)
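To make that guess concrete, here is a minimal Python sketch of the kind of keyword spotting described above. It is purely illustrative: nothing here is known about how Outlook actually works, and the word list, the `guess_event_span` function, and the `max_gap` parameter are all hypothetical. Stopping at the second time word is an arbitrary choice, made only so that the result resembles the span that got linked. The test sentence is a shortened copy of the one from the abstract.

```python
import re

# Hypothetical vocabulary of time expressions; an assumption, not
# anything taken from a real product.
TIME_WORDS = ("morning", "afternoon", "evening", "night")
TIME_RE = re.compile(r"\b(?:" + "|".join(TIME_WORDS) + r")\b", re.IGNORECASE)


def guess_event_span(text, max_gap=40):
    """Guess an 'event' span from time words alone, with no parsing.

    Starts at the first time word and stretches to the next one if it
    begins within max_gap characters. Returns the substring, or None.
    """
    matches = list(TIME_RE.finditer(text))
    if not matches:
        return None
    start, end = matches[0].span()
    if len(matches) > 1 and matches[1].start() - end <= max_gap:
        end = matches[1].end()
    return text[start:end]


sentence = (
    "We will metaphorically drop in on Wolfgang composing at home in the "
    "morning, at an orchestra rehearsal in the afternoon, and find him "
    "unwinding in the evening playing a spot of the new game Piano Hero."
)
print(guess_event_span(sentence))
```

Because the function looks only at character offsets, it has no way of knowing that the span it returns cuts across two separate adjuncts and is not a phrase at all.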
I'm not particularly interested in the mechanisms that add these links, except perhaps as an example to illustrate points I have made elsewhere (namely that the state of the art in computer handling of human language is so dire that something needs to be done, though nothing will be done, because nobody cares). I am, however, profoundly interested in the question of how to switch this and other putatively smart features off. But there are no signs of any such way. I have examined all the settings panels minutely. There is no way to stop the system guessing wrongly at event references (and references to other entities like dates and addresses) and linking parts of the text to the Office 365 online calendar system (which I do not use). Just to make things a little worse, the box that pops up contains no "Dismiss" button: it expects you to act on it and modify your calendar. I found I could get rid of it by hitting Escape, but only after wasting a minute trying other options, and only as a guess.
There is likewise (in the installation at my university) no way to stop it underlining every word that is not spelled according to American English conventions: though sold in and configured for the UK, this horrible product is on the lookout for aluminium, centre, defence, colour, licence, marvellous, medalled, realise, rigour, signalling, theatre, etc., and marks them up as errors. As a sophisticated British-born bidialectal user, I expect that when I send an email to a British recipient and choose to use British spelling conventions, my decision to employ British norms will be respected. (Notice that it would be perfectly within the powers of current computational linguistics to recognize consistency in British or American spelling practices within a message.) I do not want to be graded and judged and edited by a clueless piece of misfeatured junkware. [Update: The American spell-checking was ultimately, and silently, switched off; it was apparently under the control of universitywide system authorities who did not respond to my questioning about this, and what they did was not to switch it to a British default but just to stop having highlighted spell-checking of any sort. Individual users do not get the option either of choosing their dialect or of turning the spell-checker off. It must be done by The Authorities.]
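As a rough illustration of how little it would take to notice a consistent spelling convention within a message, here is a minimal Python sketch. It is an assumption-laden toy, not a description of any real spell-checker: the `spelling_convention` function is hypothetical, and the lexicon is just a handful of the words listed above paired with their usual American spellings. It also ignores subtleties such as words whose spelling depends on part of speech.

```python
import re

# British -> American pairs, based on words mentioned in this post.
# A real checker would need a far larger lexicon.
BRITISH_TO_AMERICAN = {
    "aluminium": "aluminum",
    "centre": "center",
    "defence": "defense",
    "colour": "color",
    "marvellous": "marvelous",
    "rigour": "rigor",
    "signalling": "signaling",
    "theatre": "theater",
}
AMERICAN = set(BRITISH_TO_AMERICAN.values())


def spelling_convention(text):
    """Classify a message as 'british', 'american', 'mixed', or 'unknown'."""
    words = re.findall(r"[a-z]+", text.lower())
    british = sum(w in BRITISH_TO_AMERICAN for w in words)
    american = sum(w in AMERICAN for w in words)
    if british and american:
        return "mixed"
    if british:
        return "british"
    if american:
        return "american"
    return "unknown"


print(spelling_convention("The colour scheme at the theatre was marvellous."))
```

A checker that ran something like this first could leave a message alone whenever the result is "british" or "american", and complain only about genuinely mixed usage.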
It's typical Microsoft: a bloated program with ill-programmed fancy features you never asked for and don't want but can't turn off (or can only turn off with great difficulty after intensive searching through hundreds of preference panes). Recall Mark Liberman's post about the way Excel invents gene names by making incorrect guesses at dates that it imagines the user might have intended, and inserting its guesses into text fields, ignoring what was actually typed.
Microsoft markets software that is utter shit to begin with, which is already bad, but what's worse is that it is evolving shit, and new versions keep emerging with new layers of shit laid over the old shit, and it thinks that although it has shit for brains it is smarter than users like you and me.
The less I have to use Office 365 the less grouchy I will be, so don't email my academic account, OK?
Update, Thursday 6 October 2016: I cannot resist showing you another instance. Today I received another email with an abstract for a technical talk about morphology and syntax on Friday 14 October, and it contained this sentence:
The existence of the words "warmth" and "truth" do not imply the possibility of "coolth," but "warmness," "trueness," and "coolness" are all grammatical. And speakers of English will naturally drop the phrase "on the counter" from "John made dinner on the counter", but not from "John put dinner on the counter".
But here is how it appeared on my screen:
The existence of the words "warmth" and "truth" do not imply the possibility of "coolth," but "warmness," "trueness," and "coolness" are all grammatical. And speakers of English will naturally drop the phrase "on the counter" from "John made dinner on the counter", but not from "John put dinner on the counter".
Clicking on the blue part revealed a message beginning "We think we've found an event." Here's a screenshot:
That's right: Microsoft's Office 365 system thinks the message is suggesting a dinner between 6 and 6:30 p.m. this evening, October 6, eight days before the advertised talk even takes place. This is the kind of brain-dead dopiness that Microsoft thinks is worth marketing to the general public (notice, Office 365 is not in beta test: it is imagined to be fit for general use). I cannot switch this stupid content-triggered calendar-linking feature off — and I cannot even go to the dinner tonight to get free food, since it is a total fiction.
I have no idea what weird interaction between the calendar system, the mail system, and bad computational linguistics could have led the software to be this delusional. But in a way, I don't want to know. It's like when someone is hitting you over the head with a breadboard over and over again (you know how it is): I don't want to know how or why, I just want it to stop.