Enforced francophony from Microsoft


Microsoft Word has really done it to me this time. I need some expert help, Language Log readers. I have a perfectly ordinary file (a simple letter template showing my home address), created in Word on an American Macintosh Powerbook using an American-purchased copy of Word, and when I open it as a copy on my UK-purchased MacBook Pro (though not when I open it as the original) almost everything works except that the file is deranged, and thinks it is supposed to be in French.

Editing the file provokes enforcement of French spacing conventions (colons and semicolons are preceded by an extra inserted space that I do not type); the double quotation symbols (‘‘like this’’) appear as those funny French marks that look a bit like pairs of less-thans and greater-thans (sort of <<like this>>); and, weirdest of all, the spelling and checking of "grammaire et style" turn into French. Word works through the file checking every significant English word and rejecting it for insufficient francophonicity (with no suggestions for respelling), underlining them all in red, though most French words are accepted. The grammar check not only assumes that French is being checked but also reports its results and queries in French. Saving the file preserves the pseudo-Frenchness.

If you use Word, you will know why I am asking the Language Log community rather than Microsoft. The built-in help files in the program are worthless, telling you things you know and remaining silent on crucial misfeatures and problems; the online help is unusable time-wasting crap with similar properties; and there are no sensible ways of getting through to tech support without spending hours on the phone. If I did get someone at Microsoft to look at this they would probably be instructed to lie about it and then fix the bug quietly at the next upgrade; Microsoft likes to bury its failures secretly. I have examined all of the Preference settings, of course, but to no avail. The Language setting is "English (US)", in keeping with the origin of the document.

I only use the appalling, bloated, slow, clunky, maddening, useless piece of software excrement that is doing this to me because administrative staff and occasionally even scholars with no taste or discernment, or even scholarly journals, force me to use it on pain of not being able to work with them at all. If Nature insists on receiving a Word file, and you want to write for Nature, then you grin and bear it; you have to send them the vile format that they request. If my Head of School sends me a Word file to edit in my capacity as Head of Linguistics and English Language, I have no real choice but to fire up Word and edit the damn thing (though that will often fail because of special characters that do not survive the transfer between machines — remember the horror story in this post).

But I can't send out a file to a journal with French quotation marks and spacing, can I? What on earth can be the source of this bug? And is there any kind of workaround or fix, short of just retyping the content into a newly created file? Is there an exorcism ritual for this? How can the program report that my language setting is American English and at the same time offer me only French grammar checking?

If you want to experiment, a Word file with the property I am talking about is saved, as a .docx file, here.

I have verified that the effect remains when the file is opened on a different Mac, and in fact on that machine the Options choice under Tools | Spelling and Grammar showed that the document was being treated as French. The trouble is that changing this to British English in the drop-down made absolutely no difference: the spelling was still checked against French so nearly every major word in the document was marked as a misspelling, and the grammar check was still Grammaire et Style.



89 Comments

  1. Barry Deutsch said,

    December 4, 2010 @ 11:54 am

    Have you tried doing this?

    Also, there are options for you other than Microsoft Word. You could try using Open Office's word processor instead (Open Office is free); just choose "Save As" when you're ready to submit the file, and pick out "Microsoft Word" as the format to save in.

    Good luck!

  2. Birdwell said,

    December 4, 2010 @ 11:57 am

    Maybe try selecting all of the text, copying it to the clipboard, and pasting it into a new file?

    [I did say, "is there any kind of workaround or fix, short of just retyping the content into a newly created file". The awful thing about Word is that I have frequently seen users driven to doing ridiculous duplicative work like this just to get some ghastly feature to go away. If this is what it takes, the product is not fit for purpose. —GKP]

  3. Allan L. said,

    December 4, 2010 @ 12:01 pm

    I think it's a Sign.

  4. Luis said,

    December 4, 2010 @ 12:02 pm

    The problem may be in the definition of the style/s the text is linked to. This seems likely since the problem carries over from machine to machine. I'm not a Mac expert, but check this page and see if it makes sense to you:

    http://word.mvps.org/mac/spellcheck.html

    Luis

    [This link turned out to be useful; thank you, Luis. The crucial clue was that you have to highlight the entire text and then change the language using Tools | Language…. Who would have thought that the language setting would be a property of portions of text! —GKP]

  5. Helen said,

    December 4, 2010 @ 12:03 pm

    For some reason I couldn't open the file from the link, but if you haven't tried it already, select the whole text and then click at the very bottom of the page (some kind of bar, I guess, I have no idea what it's called), where it says 'French' (I suppose). You can select language there. I hope this helps!

    [The file is there, readable and executable by all; I don't know why you can't get to it. I can't either! Another mystery. —GKP]

  6. jsa said,

    December 4, 2010 @ 12:07 pm

    Instead of Word, download the Open Office package, free from http://www.openoffice.org. It looks very like an older version of Word but it will open and save to Word formats, so it deals with the colleague problem, and it is an almost complete substitute for MS Office–better in many ways.

  7. Steven said,

    December 4, 2010 @ 12:08 pm

    I think that language is a property of chunks of text, rather than of the whole document. Try select-all, then Tools->Language…->English (US).

    [This, indeed, seems to be the crucial piece of information. Thank you. One puzzle remains: how the file got into a state where it thought it was in French, when it never was, and I never typed any French or set the Language setting to French, ever… —GKP]

  8. Steven said,

    December 4, 2010 @ 12:08 pm

    (BTW, the file you linked to isn't there, so I couldn't try this on that one specifically.)

  9. JS Bangs said,

    December 4, 2010 @ 12:10 pm

    I would love to help (I work at MS), but your link is broken. Fix, please?

    [Thanks for your offer. If you are interested I can mail the file to you. But for some reason, although other files placed where it is are accessible, it is not. Probably because *.docx files behave so strangely (I have found students are mailing me .docx files that arrive as zipped archives of directories in which no file can be opened by any program I possess, so the entire enterprise of sending me the file is useless — yet another time-wasting IT disaster I didn't need). —GKP]

  10. HP said,

    December 4, 2010 @ 12:11 pm

    I get a 404 error when I click the linked .docx file.

    The .docx format is a compressed XML file. (Of course, it's in hideous, non-DITA-compliant Microsoft XML, because they can't leave any decent open standard alone.)

    If you change the .docx extension to .zip, you can crack the file open with any ZIP file utility, and read the underlying XML source. You might be able to find some sort of LANG=FR statement or something in there. Try deleting that or changing it to EN and renaming it back to .docx.

    I have no idea if that will work, but it's worth a try. Of course, it's a brute-force fix, and doesn't address how the weirdness happened in the first place.
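
    The inspection step described above can be sketched without renaming anything, since a .docx opens directly as a ZIP archive. This is only a rough illustration: the function name `count_locale_tags` is made up for this sketch, the archive below is a small in-memory stand-in rather than the real file, and a plain substring count says nothing about which XML attribute carries the locale.

```python
import io
import zipfile


def count_locale_tags(docx, locale="fr-FR"):
    """Return {part_name: hits} for every XML part that mentions `locale`."""
    counts = {}
    with zipfile.ZipFile(docx) as archive:
        for name in archive.namelist():
            if not name.endswith(".xml"):
                continue
            text = archive.read(name).decode("utf-8", errors="replace")
            hits = text.count(locale)
            if hits:
                counts[name] = hits
    return counts


# Stand-in archive built in memory (made-up contents, not the real frenchfile.docx).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("word/document.xml", '<w:lang w:val="fr-FR"/><w:lang w:val="fr-FR"/>')
    archive.writestr("word/settings.xml", '<w:activeWritingStyle w:lang="fr-FR"/>')
    archive.writestr("word/styles.xml", '<w:lang w:val="en-US"/>')

demo_counts = count_locale_tags(demo)
print(demo_counts)
```

    On a real file the same call would take a path, for example `count_locale_tags("frenchfile.docx")`, and report which parts are worth opening by hand.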

  11. Jason said,

    December 4, 2010 @ 12:14 pm

    Check the language bar in the Task Bar, and ensure that the domain for your computer hasn't somehow been changed to French.

    Have you tried opening a blank document, and just cutting and pasting the content into the new document?

    [I did say above that I realized I could paste into a new document and it would probably work. But really! I want to have a product that does what I tell it to, or at least, to understand why it won't, and know sensible ways of fixing it. I really won't accept that having me start over is a viable solution. —GKP]

  12. Owen Blacker said,

    December 4, 2010 @ 12:19 pm

    In the language selection dialog there's a setting reading "Detect language automatically". If you disable this, it might make a difference.

    Sometimes, Word is really good at detecting languages. Sometimes it really sucks. It's very useful to have Word and email realise I'm typing in French, Spanish or German when I am. But it's much less useful when it thinks my English is foreign.

    I'd try disabling that functionality, resetting the whole document to English and see if that makes a difference?

    [My version of Word doesn't have the "Detect language automatically" feature. Thank God. (If I could pay extra to get fewer features, I would.) —GKP]

  13. DM said,

    December 4, 2010 @ 12:24 pm

    Here's the correct URL to get the file (the linked version causes a 404): http://languagelog.ldc.upenn.edu/~pullum/frenchfile.docx

    [You're quite right; I had a mistaken extra segment in the URL. My bad. It is fixed. —GKP]

  14. DM said,

    December 4, 2010 @ 12:29 pm

    Bingo! You were absolutely right, HP; there is this in an XML tag in word/settings.xml (tag angle brackets removed to post on LL):
    w:activeWritingStyle w:appName="MSWord" w:lang="fr-FR" w:vendorID="64" w:dllVersion="131078" w:nlCheck="1" w:checkStyle="0"/
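
    For anyone who would rather read that attribute with an XML parser than by eye, here is a minimal sketch. It assumes the tag sits inside a `w:settings` root using the standard WordprocessingML namespace; the `SAMPLE` string is a reconstruction of the tag quoted above with its angle brackets restored, not a copy of the real settings.xml, and `writing_style_langs` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the "w:" prefix in Word's XML.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def writing_style_langs(settings_xml):
    """Return the w:lang value of every w:activeWritingStyle element."""
    root = ET.fromstring(settings_xml)
    return [
        element.get(f"{{{W_NS}}}lang")
        for element in root.iter(f"{{{W_NS}}}activeWritingStyle")
    ]


# Reconstructed sample: the quoted tag wrapped in an assumed w:settings root.
SAMPLE = (
    f'<w:settings xmlns:w="{W_NS}">'
    '<w:activeWritingStyle w:appName="MSWord" w:lang="fr-FR" w:vendorID="64" '
    'w:dllVersion="131078" w:nlCheck="1" w:checkStyle="0"/>'
    "</w:settings>"
)

sample_langs = writing_style_langs(SAMPLE)
print(sample_langs)  # ['fr-FR']
```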

  15. Tim Silverman said,

    December 4, 2010 @ 12:30 pm

    The correct link should be here.

  16. Nick A said,

    December 4, 2010 @ 12:30 pm

    I think the problem may be that a language can be associated with each piece of text, overriding more global settings. Some of the text in this document seems to have been set to French.
    If that is correct then you can fix it this way: Select all the text, then change the language associated with the text. In Mac Word 2008, that's done by choosing Tools>Language, then picking English (UK) or English (US) from the dropdown menu and clicking OK.
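
    To make the "language per piece of text" idea concrete, here is a rough sketch that lists the language attached to each run of text in a document.xml part. The sample XML is invented for illustration, and `run_languages` is a hypothetical helper. It only looks at language set directly on a run, so anything inherited from a style or from document defaults would show up as `None`.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"  # ElementTree spells namespaced names as "{uri}tag"


def run_languages(document_xml):
    """Return [(language_or_None, text), ...] for each w:r run in the XML."""
    root = ET.fromstring(document_xml)
    result = []
    for run in root.iter(W + "r"):
        lang_element = run.find(f"{W}rPr/{W}lang")
        lang = lang_element.get(W + "val") if lang_element is not None else None
        text = "".join(t.text or "" for t in run.iter(W + "t"))
        result.append((lang, text))
    return result


# Invented sample: one paragraph whose runs carry different language tags.
SAMPLE = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p>'
    '<w:r><w:rPr><w:lang w:val="fr-FR"/></w:rPr><w:t>Dear Sir: </w:t></w:r>'
    '<w:r><w:rPr><w:lang w:val="en-US"/></w:rPr><w:t>thank you</w:t></w:r>'
    "<w:r><w:t>.</w:t></w:r>"
    "</w:p></w:body></w:document>"
)

sample_runs = run_languages(SAMPLE)
for lang, text in sample_runs:
    print(lang, repr(text))
```

    Select-all followed by Tools>Language amounts to giving every run the same tag, which is presumably why it succeeds where a document-level setting does not.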

  17. fs said,

    December 4, 2010 @ 12:34 pm

    This doesn't really help you, but if I may ask, what document format do you use when you're not being driven mad by MS Word? LaTeX? TeXmacs? reStructuredText / markdown? … XML?

    [I use LaTeX; beautiful LaTeX. I edit with vi, and run pdflatex, and get beautiful typesetting in PDF format instantly, and nothing ever goes wrong: there are essentially no bugs whatsoever, and everything works two orders of magnitude faster than Word can manage. This isn't posturing: I honestly don't know how people can bear to live their lives using Word. —GKP]

  18. MKD said,

    December 4, 2010 @ 12:35 pm

    Under Tools, there's an option to choose language (three below spelling and grammar in my version of mac word) – have you tried changing the dictionary back over to English?
    That should partly fix the problem, but I'm not sure about the "French" spacing conventions and quotation quirks that it's added in.

  19. Helen said,

    December 4, 2010 @ 12:39 pm

    Well, I tried my method (see above) with the text now, via the link that Tim provided, and it works.

  20. ~flow said,

    December 4, 2010 @ 12:41 pm

    what i sometimes do in such cases is to (1) open the document in word (2) paste the text into a plain-text editor (3) select the text and copy it again (4) paste it back to word.

    of course, this procedure will erase any formatting. if you don't like the ways that word tries to outsmart users, why don't you just edit it in a text editor and paste it into word as a last step?

    as far as i can recall, it used to be the case that you can select text in word and then assign a language to it. no idea where that feature has landed now that m.s. has gone all ribbon.

  21. Tim Silverman said,

    December 4, 2010 @ 12:47 pm

    OK, the methods suggested by Nick A and by MKD appear to fix all the problems when I tried it—spell-check, colon spacing, guillemets, the lot.

  22. Jon said,

    December 4, 2010 @ 12:47 pm

    Sometimes saving the file as 'rich text file' (.rtf) will do the trick. Open the new file, then save again as docx.

    If instead you cut and paste to a new file, choose Paste special > Unformatted text.

    [Once again: it is staggering that people should think deformatting everything and pasting into a new document and starting again is a solution. I would call it an utter abdication, an admission that there is no solution. —GKP]

  23. Tim Silverman said,

    December 4, 2010 @ 12:51 pm

    @HP: I did find a couple of bits of xml containing something like lang=fr-FR (except not that exact expression), but changing that to en-GB just caused Word to report an error, and I didn't feel like wading through disproportionately large amounts of xml trying to work out the interdependencies.

  24. Luis said,

    December 4, 2010 @ 12:52 pm

    Nick A. and Helen — I agree — just tried it on my wife's Mac and Word:mac 2008: Select all (Cmd-A), Tools –> Language –> US English –> OK.

  25. Levi Montgomery said,

    December 4, 2010 @ 1:01 pm

    Late to the party, as usual, but I have to cast my vote for OpenOffice. What Word wishes it was.

  26. John Cowan said,

    December 4, 2010 @ 1:03 pm

    I too urge you to switch to OpenOffice.org at once.

  27. Ellen K. said,

    December 4, 2010 @ 1:06 pm

    I opened the document in OpenOffice, on my US Windows 7 computer, and then cut and pasted the text into a new document, and it was marked as in French, with the extra spaces showing in grey, and the French quote. Typing additional stuff got the French quote marks from the quote key. Switching the language to English fixed it for typing new text, but didn't change what was already typed.

  28. JS Bangs said,

    December 4, 2010 @ 1:13 pm

    Really? OpenOffice? Word is hands-down the best word processor around. This doesn't mean it's actually good, but it isn't as aggressively atrocious as OpenOffice.

  29. Steve T said,

    December 4, 2010 @ 1:15 pm

    I would actually use LibreOffice instead of Open Office. You can find it here: http://www.documentfoundation.org/download/
    for reasons stated here: http://www.timesoftheinternet.com/briefs/openoffice-forked-as-libreoffice/

  30. Dr. Jochen L. Leidner said,

    December 4, 2010 @ 1:21 pm

    Fixing a Word problem by sticking to Word is a bandaid rather than a solution.

    I'd recommend you design a nice LaTeX stylesheet e.g. by copying 'letter.cls' and personalizing it with your address. LaTeX is available for Mac OS X, Unix, Linux, and even for Windows – and it's free.

    You will never have that problem again, and your letters will be typeset beautifully forever after.

    [This is annoying. You haven't read what I wrote, or if you have, you seem to have ignored it. I said: "I only use the appalling, bloated, slow, clunky, maddening, useless piece of software excrement that is doing this to me because administrative staff and occasionally even scholars with no taste or discernment, or even scholarly journals, force me to use it on pain of not being able to work with them at all." Of course I use LaTeX. I'm not a fool. But the thing is, I sometimes have to work with people who can't, people who require that I use Word like they do. And I would like to be able to wrestle with the vile thing and make it do elementary tasks like one-page letters. But it's such a pig that it keeps defeating me, despite the fact that I have wasted hours trying to battle against its brain-dead, over-featured, unresponsive, secretive clunkiness. —GKP]

  31. Lance said,

    December 4, 2010 @ 1:27 pm

    I'm amazed at the number of people who think the correct answer to "MS Word thinks English is French" is "You wouldn't have this problem if you stopped using MS Word". That's honestly about as helpful as the advice no one has offered: "You wouldn't have this problem if you stopped using English and just typed the document in French". Even if OpenOffice were better than Word (IMHO, it's not) and LaTeX did everything you needed it to do (in my experience, it doesn't), it's not remotely a solution to Geoff's problem.

    [Thank God for someone talking sense. The number of people who think I should accept the program and change my life is just amazing. It matters very little whether they suggest I should (i) abandon the good-faith attempt to make this crock of a program work and install something else, or (ii) abandon the file and type it all again, or (iii) type in French. They are all missing the point. —GKP]

  32. HP said,

    December 4, 2010 @ 1:35 pm

    DM, Tim Silverman: Yeah, I wasn't sure if you could open the XML, edit it, and turn it back into a Word file — I never tried round-tripping it before, but it was all I could think of with no Word 2007 installed at home.

    I've round-tripped native Open Office docs before, and I figured it was a possibility.

    As an author of online Help myself, I used to think Word had the best commercial Help out there. Of course, that was a long time ago. I recently was "upgraded" to 2007 at work, and went looking for some way to make it quit doing "smart" things for me that I didn't want it to do. (Yes, I want to select only part of a word. No, I don't want to automatically select the paragraph mark.) Word's Help sent me to some topics with writing advice. Writing advice! Dear God.

    I do like Word's outlining functionality, but if there's an open-source dedicated outliner that's any good, I'd switch in a minute.

    (I document engineering software. I have to constantly remind my colleagues, "We don't teach engineering!" I mean, planes fall out of the sky when idiots use our software. Granted, lives aren't usually at stake when people create a Word doc, but it's just a freaking tool!)

  33. Nightstallion said,

    December 4, 2010 @ 1:48 pm

    "Even if OpenOffice were better than Word (IMHO, it's not) and LaTeX did everything you needed it to do (in my experience, it doesn't), it's not remotely a solution to Geoff's problem."

    It is and it does. ;p

  34. Darla-Jean Weaterford said,

    December 4, 2010 @ 1:50 pm

    The Review pane in Word (I'm still on Word 2007 here) has an option under Set Language to "Detect language automatically." If you don't use a lot of different languages, you might want to uncheck that. Won't fix this problem, but might prevent it in the future.

  35. michael farris said,

    December 4, 2010 @ 1:56 pm

    I hate Open Office for one good reason: you (or at least I) can't do shortcuts for typing in special characters, you have to go to the menu and drop each one in separately. Very frustrating for a linguist (without a whole bunch of different language keyboards installed).

  36. Dan K said,

    December 4, 2010 @ 1:57 pm

    At risk of being accused of hijacking this thread, let me disagree that Word is hands down the best word processor around. If it's best for you, then great. It's not best for me, and not even close. I used Word for about 15 years, I've used OpenOffice for about 10, and now LibreOffice (OpenOffice's probable successor) for a month or two. Frankly, they're all atrocious, but for me, OpenOffice/LibreOffice would be far preferable even if they were all free.

    Incidentally, I do work with a bloated bureaucracy (our University's bureaucracy rivals even the US government's in some respects) and I've submitted many journal articles (can't remember if Nature is on the list), as well as NIH grants. I collaborate exclusively with people who use Word and nothing else. None of our administrators have ever heard of OpenOffice. So I save files in doc format. Using OpenOffice has never caused a serious problem, although occasionally (not lately) I regret installing the latest beta. Occasionally I have to work around bugs or other issues. Word itself often caused problems when I used it.

  37. Roadrunner said,

    December 4, 2010 @ 2:02 pm

    I've had a similar, though even more infuriating, problem with Word! Specifically, I often have to use the proper noun "Acoma" in documents I create at work. "Acoma" is the name of an Indian tribe. It is not a word, outside of being the name of this particular tribe, in either English or Spanish. Nonetheless! Whenever I type "Acoma" as a word early in a new Word document, Word decides that I'm writing in Spanish, spell checks everything, and determines that everything, *including "Acoma"*, is misspelled!

    How does Word decide that I'm writing in a language that doesn't contain the word that triggers this? I have truly no idea. I haven't found a way to reverse it, either–I just start a new document, avoid the word until I've gotten most of the rest of the text written, then go back and fill it in later.

    (Perhaps it's relevant to note that I often have occasion to use Spanish words in English-language documents I write–common last names, common Spanish words used as place names, and loan words from Spanish that are common in my area, but uncommon in standard American English. Perhaps this has confused my computer far beyond what it can handle.)

  38. Morten Jonsson said,

    December 4, 2010 @ 2:33 pm

    Roadrunner–can't you just add Acoma and those common Spanish words to your spellcheck dictionary?

  39. Roadrunner said,

    December 4, 2010 @ 2:48 pm

    I have–doesn't make any difference. As long as Word recognizes I'm writing in English, it knows "Acoma" as a word that's in my dictionary. But if I use it early enough in the document, it switches over to Spanish, and sees "Acoma" as a misspelled word, since it's not in its Spanish dictionary. Word only switches languages on me if I use "Acoma" early in a new document. If I've well established that this is an English-language document, then "Acoma" doesn't trip up the spell check. But if I use it early in a document, it becomes a misspelled word in Spanish.

  40. deadbeef said,

    December 4, 2010 @ 2:50 pm

    Others have already beaten me to it, but if you unzip the docx file, there are 5 instances of in word/document.xml and one instance of w:lang="fr-FR" in word/settings.xml. I don't know what you have to do to prevent Word from doing that, but hopefully that is where it is picking the settings up from. Here's a Bourne shell script to convert your file (assuming you have sh, mktemp, and Info-ZIP):

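    #!/bin/sh
    # Unpack the .docx (a ZIP archive) into a scratch directory.
    tmp=$(mktemp -d)
    unzip -d "$tmp" frenchfile.docx
    # Swap every fr-FR language tag for en-GB in the two affected parts.
    for f in document.xml settings.xml; do
    ed -s "$tmp/word/$f" <<'EOF'
    1,$s/fr-FR/en-GB/g
    wq
    EOF
    done
    # Re-zip from inside the directory so paths stay relative to the archive root.
    (cd "$tmp" && zip -r englishfile.docx *)
    mv "$tmp/englishfile.docx" .
    rm -rf "$tmp"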

    I don't have MS Word or OpenOffice, but I was able to open englishfile.docx with Abiword (another crappy, but at least much lighter Word substitute).
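
    For machines without ed or Info-ZIP, roughly the same round trip can be sketched in Python. This is an illustration under assumptions, not a tested fix: `relabel_docx` is a made-up name, the archive below is an in-memory stand-in, and a blind text substitution may not leave a file that Word accepts (an earlier comment reports Word raising an error after a similar hand edit).

```python
import io
import zipfile


def relabel_docx(src, dst, old="fr-FR", new="en-GB",
                 parts=("word/document.xml", "word/settings.xml")):
    """Copy the archive `src` to `dst`, replacing `old` with `new` in `parts` only."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename in parts:
                data = data.replace(old.encode("utf-8"), new.encode("utf-8"))
            zout.writestr(info.filename, data)


# Stand-in archive built in memory (made-up contents, not the real file).
source = io.BytesIO()
with zipfile.ZipFile(source, "w") as archive:
    archive.writestr("word/document.xml", '<w:lang w:val="fr-FR"/>')
    archive.writestr("word/settings.xml", '<w:activeWritingStyle w:lang="fr-FR"/>')
    archive.writestr("docProps/app.xml", "fr-FR stays here")

target = io.BytesIO()
relabel_docx(source, target)

with zipfile.ZipFile(target) as archive:
    relabelled = {name: archive.read(name).decode("utf-8") for name in archive.namelist()}
print(relabelled)
```

    With real files the call would be `relabel_docx("frenchfile.docx", "englishfile.docx")`. Restricting the substitution to the two named parts mirrors the shell script and leaves everything else in the archive byte-for-byte as it was read.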

  41. HP said,

    December 4, 2010 @ 2:54 pm

    One of the really interesting things about Word is that many obsolete commands dating from Word 4.0 for DOS still work if you remember the default keyboard combination.

    Try Shift-Control-F9 in your latest bloated version of Word to select columns of characters. Still works!

    Shift-F3 still cycles case (lowercase, uppercase, all caps) when the cursor is within a word, but you won't find it in the documentation. I use that all the time.

    Got a bunch of items in a list? Shift-Ctrl-F9 to select the first letter of every word in the list, then Shift-F3 to change all list items to uppercase or lowercase.

    I am so old.

  42. Peter Taylor said,

    December 4, 2010 @ 3:49 pm

    If Nature insists on receiving a Word file, and you want to write for Nature, then you grin and bear it

    A useful tip which someone mentioned to me when I expressed my displeasure that recruitment agencies often insist on Word documents is that if you take an HTML document and rename it to end with .doc then Word will open it quite happily. This allows me to write my CV in a non-WYSIWYG editor which gives me full control and still have the recipient believe that I'm sending a Word document.

  43. Geraint Jennings said,

    December 4, 2010 @ 4:30 pm

    @michael farris – Which special characters can't you make shortcuts for in OpenOffice? Our bilingual office has opted for OpenOffice and we have installed standard shortcuts for accented characters. Admittedly, making a new dictionary file for a new language is too much like hard work, and occasionally we forget to convert our OpenOffice files to Word docs before emailing out to unsuspecting third parties, but… nobody's perfect.

  44. Ray Dillinger said,

    December 4, 2010 @ 5:12 pm

    I quit using Word years ago when it started doing things I didn't tell it to do, inserting characters I didn't type into my documents, screwing up ("smartening!") quotes in chunks of text that had to roundtrip to ASCII code, miscorrecting technical terms and variable names, flatly refusing to allow me to comply with the rigid indentation standards and list formats my department required because it had its own idea of what indentation standards and list formats were supposed to be, etc. Life is too short to deal with that kind of randomness.

    I find OpenOffice Writer has way more respect for the way I want to spell and format my documents, doesn't screw with the text unless I command it to, and can save in Word formats if and when I ever deal with backward people who still care about that particular kind of backward compatibility.

    [Much of what you mention here is a matter of switching off horrible factory defaults in the Preferences panes. My problem was that I had an indetectable and apparently almost undocumented feature buried in the file itself and no amount of inspection of Preferences panes gave any clue as to how to get back into the native language of the file. —GKP]

  45. Breffni said,

    December 4, 2010 @ 5:21 pm

    GKP:

    Who would have thought that the language setting would be a property of portions of text!

    This is a perfectly sensible behaviour when you think about it. Many people (e.g., language students, linguists) regularly create documents containing more than one language. You mark, say, the French parts as French, the English parts as English, and when you're running your spell-check, Word consults the correct dictionary. It would be far more inflexible if the designers assumed that entire documents can only be in one language.

    [OK, elegant idea. Now let's talk about unprompted shifting into forced French features even in input mode, even when typing English, and entire documents shifting silently into French proofing, only after the first modification of the file, under conditions where half an hour of experimentation by a reasonably intelligent user fails to identify any way of switching it off. Is that sounding reasonable? —GKP]

  46. Nathan Myers said,

    December 4, 2010 @ 6:19 pm

    Does "francophony" rhyme with "pinto pony" or with "cacophony"?

    [Cacophony. —GKP]

  47. John S. Wilkins said,

    December 4, 2010 @ 6:43 pm

    I've used Word since version 1a on DOS. I use it on a Mac now, and find it as annoying as Geoffrey says. But unwilling to learn Yet Another Markup Language, I use LaTeX only rarely. Instead I am finding Pages to be rather brilliant, especially in conjunction with Endnote.

    It saves relatively reliable Word files (which is to say, as reliable as the format permits), and I open the resulting files in a copy of Word to verify them before I send off the file, but Pages never renumbers, moves or changes my text the way Word does.

    It's not great for books, however. I can live with that.

  48. Mary Kuhner said,

    December 4, 2010 @ 6:51 pm

    OpenOffice has been a success for us (with a similar reason to need to write .doc and .docx) *except* for documents, such as grants, which are repeatedly passed back and forth with actual Word users. This does not quite work.

    The nadir of my Word/OO experience was receiving back from my co-PI a grant proposal document. We had a tight page limit and a lot of carefully placed graphics. I opened the document in OO, and for about half a second it looked fine. Then the graphics suddenly collapsed to a tangled heap at the bottom of each page (probably due to an auto-run repagination routine) and could not be restored. At that point I gave up and designated the Word user as primary editor: I sent him bits of text and graphics and he inserted them. We did get the grant in, but it was a trying experience.

    If only LaTeX could write .doc or .docx! I love LaTeX very much. It does not always do what I want but its behavior is very stable and predictable, unlike either Word or OO. If it does something once it will always do it. Maybe someone needs to write doctex, by analogy with pdflatex?

  49. Dan T. said,

    December 4, 2010 @ 7:10 pm

    An awful lot of people in office environments insist on sending M$ Word attachments for things (e.g., memos announcing building maintenance) that could have been done fine in a plain text message body.

    And then there's M$ Excel, which when given a perfectly reasonable set of CSV (comma separated values) data, often mangles it beyond recovery, stripping leading zeroes from zip codes and (even worse) changing such things as account numbers into exponential notation. There are some people at work who are absolutely impossible to pound a clue into their heads to the effect that they shouldn't open data files in Excel and save them out again and then expect me to deal with them as if they were untrammeled CSV data directly from the data source.
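
    The kind of damage described above comes from treating every field as a number. A small sketch of the difference, using invented sample data: Python's csv module hands every field back as text, so leading zeroes and long account numbers survive a round trip, while the numeric conversions shown here reproduce the mangling. This does not claim to be what Excel does internally, only an illustration of the effect.

```python
import csv
import io

# Invented sample data: a zip code with a leading zero and a 16-digit account number.
RAW = "name,zip,account\nAda,02139,1234567890123456\n"

rows = list(csv.DictReader(io.StringIO(RAW)))

zip_as_text = rows[0]["zip"]                            # kept as text: '02139'
zip_as_number = str(int(rows[0]["zip"]))                # leading zero lost: '2139'
account_as_float = f"{float(rows[0]['account']):.5E}"   # '1.23457E+15'

# Writing the untouched text fields back out reproduces the input exactly.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
round_trip = out.getvalue()

print(zip_as_text, zip_as_number, account_as_float)
print(round_trip == RAW)
```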

  50. grackle said,

    December 4, 2010 @ 7:33 pm

    If you go to Tools on the menu of Word, click on language/ set language, uncheck the automatic language recognition square and (most important) scroll in the language list to where it indicates that it is using French and change that to English, a box will appear (at least it does in my version) asking if you really want the language to be English. Answer yes. This ends the use of French and, I think, that unchecking the automatic language recognition will forestall future occurrences. You have to manually fix the existing problems.

    [Didn't work. No auto language recognition; and changing to English did not change anything. —GKP]

    I prefer to use Word Perfect for ordinary writing. It easily saves files in the Word format if necessary and is, to my mind, easy to work with.

    [Of course, WordPerfect is a vastly better product. But it only runs under Windows now, and the users are just a small tribe of hunted fugitives in the hills. —GKP]

  51. Maria Gouskova said,

    December 4, 2010 @ 7:44 pm

    I haven't used MS Word in a few years, but back when I was fluent in it, I created a handout on wrangling it. The instructions are probably partially obsolete for Word 2007, since menu items move around from version to version. Adjusting for version differences, the first step in making Word work for you has always been the same: turn off its auto-formatting function, especially the part where it wants to change things based on its best guess of your current needs. My guess is that the original creator of the problematic .docx file did not have this option turned off. So, GKP–I think that to diagnose the problem, it would help to know whether the original file was created by someone who occasionally authors documents in French.

  52. Maria Gouskova said,

    December 4, 2010 @ 7:49 pm

    Grackle–you beat me to it. It seems that this is a new feature that's analogous to defining styles based on formatting.

    Word Perfect stopped making a Mac version sometime in the early 2000's. It was actually a very nice alternative to Word, albeit completely useless for collaborating. While we are on the topic of WYSIWYG editors, I couldn't get OpenOffice to work for me, either (certain crucial subfield-specific features are missing). I opted for LaTeX in the end. I have played around with Pages.app, but it didn't impress me as a serious application either.

    One option that hasn't been mentioned is Google Docs. It allows you to view and edit .doc(x) files without ever firing up OpenOffice or similar, and it seems to handle Word's formatting quite well.

  53. Avi said,

    December 4, 2010 @ 8:33 pm

    If you like to use LaTeX to compose documents but you're required to submit Word-formatted files, you can use LaTeX2RTF to do the conversion:

    http://latex2rtf.sourceforge.net/

    There's a Mac shell available, too.

  54. Olof Hellman said,

    December 4, 2010 @ 9:53 pm

    Others have already pointed to the solution. I'll point to another helpful feature: If you turn on 'Reveal formatting' under the View menu, you can click on a selection to see the language setting of a selected range of text.

    However, I'm most curious about the statement "Who would have thought that the language setting would be a property of portions of text!". That was sarcasm, right? If one doesn't want auto-detection of language, is there really any other way to sensibly spellcheck a multi-lingual document?

    Disclosure: I'm a developer on the Office for Mac team.

  55. the other Mark P said,

    December 4, 2010 @ 10:03 pm

    One puzzle remains: how the file got into a state where it thought it was in French, when it never was, and I never typed any French or set the Language setting to French, ever…

    Cut and paste can have this effect, if the original words were set that way (which might not be obvious when you copied them). Certainly cutting out of Word into my HTML editor brings all sorts of rubbish like that.

    Another option is that someone has entered a "change to French" customised quick key in your settings. Then, say you accidentally hit the Control while typing Shift-F you could get the change.

    "Styles" can do this sort of thing, though a look at the embedded styles did not show that. You do have Eng-US for one paragraph style and Eng-UK for another though!

    I know your hurt, because I've been there too. In my case automatic numbering and section changes have been my bugbears. For many years all my documents had a Russian title in "properties", in Cyrillic, because somehow I had corrupted my Normal.dot template.

    I can't move to OpenOffice because I need Word's (fabulous) equation editor for work, and I can't get wild-cards to work properly on search and replace for OO at home.

  56. Jeff said,

    December 4, 2010 @ 10:55 pm

    Do you often write Word documents in French? If not, you can try uninstalling the French proofing tools by running the installation program, deselecting that option, and accepting the changes; the option lives under the "Office Shared Tools" node on the installation menu.

    It could be that this would just cause Word to prompt you to reinstall the French proofing tools the next time you try to open that template. I withdraw my suggestion and second the cut-and-paste-into-new-document-boogaloo.

  57. Anne Sturgeon said,

    December 5, 2010 @ 12:02 am

    That's happened to me before (Spanish, not French, I think). Sigh.

  58. Thomas Westgard said,

    December 5, 2010 @ 12:56 am

    Glad to hear you found a workaround. My iPhone has convinced itself that it should use the Spanish dictionary for autocorrect. This makes typing even slower, since I have to click away the suggestion for nearly every word, which is always in Spanish and therefore almost always wrong.

    This is for your amusement. I'm not asking for help. Unlike MS, which neatly shares its initials with a debilitating illness, Apple is a safe ingredient in a healthy life. For more chuckles of a linguistic nature, you will likely enjoy Damn You Autocorrect, if you haven't seen it already:

    http://damnyouautocorrect.com/

  59. Erik Zyman Carrasco said,

    December 5, 2010 @ 1:36 am

    @ Nathan Myers: I assume Francophony rhymes with cacophony—after all, they consist of essentially the same roots and affixes except for one. That's how I read it, and how I'd prefer to pronounce it.

  60. Aaron Davies said,

    December 5, 2010 @ 2:04 am

    i find it odd that you should equate "copy & paste the text into a new document" with "retype it". to me, "retyping" implies actually keying the entire text back in, one letter at a time, e.g. from a printed copy when no soft copy (or scanner) is available, and the time and labor involved is the main objection.

    you quite explicitly asked "And is there any kind of workaround or fix, short of just retyping the content into a newly created file?" and i would consider "copy & paste it" to be an entirely reasonable "workaround" to offer in answer to that question.

    [Look, pasting in from the clipboard is retyping; it's just faster. Every character goes into the buffer again. And remember, people above were saying you have to switch off smart pasting, and lose all the formatting! I refuse to accept that me starting the document again is an adequate answer to this issue. Like millions of other people, it seems, your instinct is to just tolerate the punishment and compensate for the failure of the tool by doing extra work yourself. How did we ever get so subservient? —GKP]

  61. Aaron Davies said,

    December 5, 2010 @ 2:05 am

    @HP: good grief, i think i remember something like those from word 4.0 for mac. (not precisely, as that was on a mac plus which didn't have F keys, but i think they had a standard way of forming equivalents when you didn't have the extended keyboards that were introduced iirc at some point in the mac ii series.)

    i also remember hand-compositing what's now unicode 27e6 and 7, double-vertical-barred left and right square brackets, ⟦ & ⟧, using a special interface that let me enter postscript commands directly to be used when printing on a laser printer. (i no longer remember precisely why….)

  62. Dylan Thurston said,

    December 5, 2010 @ 2:35 am

    If Nature insists on receiving a Word file, and you want to write for Nature, then you grin and bear it; you have to send them the vile format that they request.

    Curious here: Nature does claim to accept TeX submissions, although they do have the horrible taste to prefer Word. Do they not actually deal with TeX?

  63. A. Marina Fournier said,

    December 5, 2010 @ 2:36 am

    GKP, you have my complete sympathy–I too use Word only when forced. I use iWork's Pages WP program, which will kindly save docs into a Word doc. I hate the Word interface, and I don't trust MS products to keep nasties off of my machine.

    Allegedly they've closed the "back door" that viruses, Trojans, worms, etc enter through, but I'll believe that when hackers stop trying to get in that back door, or I make the Astronaut Corps, whichever comes first. Given that I'm short, 56, prone to motion sickness, and too round for one of the suits…

  64. Julie said,

    December 5, 2010 @ 3:53 am

    I was able to correct the file fairly easily in Word 2003 (Tools=>Language=>Set Language). Setting it to English (US) (with the entire file selected) made it act normally. Although the quotes and extra spaces still needed fixing, they were then fixable. But my seven-year-old version makes it somewhat of a moot point. If MS neglected to put that control where you can find it…that's their fault.

  65. michael farris said,

    December 5, 2010 @ 3:54 am

    "@michael farris – Which special characters can't you make shortcuts for in OpenOffice? Our bilingual office has opted for OpenOffice and we have installed standard shortcuts for accented characters."

    I have the Polish version in which anything that's not ASCII or Polish has to be dropped in one. character. at. a. time. Maybe the English version is more versatile.

    Word has its failings (don't get me started) but making it hard to type non-ASCII characters is not one of them.

  66. Spinoza said,

    December 5, 2010 @ 4:34 am

    It isn't my place to criticize, so pardon what follows.

    Isn't it just a bit inconsistent to say "How did we ever get so subservient?" to those who suggest ways to make that pile of excrement called Word work, and yet suggest to those who insist on using LaTeX that there are instances where using Word is forced? Isn't it subservience to accept the forced usage of any particular editing tool as well?

  67. Geraint Jennings said,

    December 5, 2010 @ 4:55 am

    @michael farris "I have the Polish version"

    I use both French and English versions (work and home) and have identical shortcuts for accented characters set up on both. Does the Polish version of Writer really not permit recording insertions of characters as macros and assignment of those macros to keyboard shortcuts? Bizarre and unhelpful.

  68. michael farris said,

    December 5, 2010 @ 5:15 am

    @geraint, we all have our blindspots, mine include macros (I fiddled around with them in open office for a while and never got anywhere, someone smarter might have).
    I also spent a long time searching around the help sections looking for ways of making key shortcuts and found nothing. I finally gave up and got Word 2007 (which I like much less than 2003 if that's possible). But I can work around weirdities in Word whereas I couldn't in Open Office (the special characters were not the only thing causing me problems though they were a big one).

  69. Dave R. said,

    December 5, 2010 @ 6:13 am

    @michael farris: if you're using Windows, you can use ALT+NNN where NNN is the number of the special character you want to insert (these are given in the special characters palette). Note: You MUST use the keypad to enter the numbers, NOT the number keys above the letter keys. Also, hold ALT down until you have entered the entire number.

    ————-

    I thought everyone knew to first select the text before changing language. Of course language is a property of characters/words, not documents: how else could you use multiple languages in the same document?

    Similarly, the language property of OpenOffice Writer is in the character formatting palette. Although you can set a document language in both programs, it is better in all cases to first select the text.

    Be thankful it isn't a PowerPoint file: you have to change the language element by element in PP.

  70. Breffni said,

    December 5, 2010 @ 7:41 am

    GKP:

    OK, elegant idea. Now let's talk about […] Is that sounding reasonable?

    No, but I wasn't talking about those other things. I was pointing out that this one factor in your problem – the fact that language is a text property, not a document property – is not perverse or inexplicable.

  71. Army1987 said,

    December 5, 2010 @ 8:08 am

    @Peter Taylor: You're a f***ing genius!

  72. TonyK said,

    December 5, 2010 @ 10:47 am

    I have written a number of documents which contain more than one language, so I knew at once what your problem was. As to its source, I suspect that you inadvertently changed the language of some of your text to French by clicking on the wrong menu item. I know, I know — the trouble with accusations like this is that you can't plausibly deny them, even if they're false. But it's the simplest explanation.

  73. Denise Wood said,

    December 5, 2010 @ 10:48 am

    @GKP

    If you didn't mind sharing your information with Google, they provide GoogleDocs for free. Platform-agnostic, all the simple formatting, no fancy (unnecessary) features. It's what I use when I'm prevented from using LaTeX.

  74. Johanne D said,

    December 5, 2010 @ 11:43 am

    Ha! It's like a white person walking around in blackface. Now you know what it feels like, GKP!!! ;-)

  75. daiyami said,

    December 5, 2010 @ 2:41 pm

    @Roadrunner: Have you sorted out that the "detect language automatically" setting (mentioned in earlier comments) is probably the cause of your Acoma problem? I didn't see anyone explicitly connect those.

    I originally came by just to make Breffni's point (and since I left my RSS reader, I don't want to waste the effort and must say something anyhow)—a linguist, of all people, should not be snarking about language being a text property rather than a document setting. I understand LaTeX has no spellcheck at all? I bet it would be a text-level setting if it did. But it's hard to appreciate Word for doing something right when it does so many things wrong.

    Tangent: MS Word is not going away. I was running seminar sections for 56 undergrad thesis writers this term. Of the 56, one used Open Office, and one was a math major who was going to be learning LaTeX to write the thesis, but did not yet know it and used Word (in Japanese). Everyone else used some version of Word. And students, who like free and don't do complicated collaboration, should be prime audiences for OpenOffice, etc. I've learned how to make it work for me, like Maria Gouskova's handout.

  76. Bruce said,

    December 5, 2010 @ 3:29 pm

    As you have discovered, MS Word allows for different portions of text to be in different languages. This is a very useful feature for quoting text in other languages. I also use it frequently for technical text (programming code) to prevent MS Word from applying spelling check to C source code (I set the text to "do not check").

    You can in fact modify a Formatting Style to include a specific language (like British English).

  77. the other Mark P said,

    December 5, 2010 @ 3:55 pm

    no fancy (unnecessary) features

    I know most users don't do a lot of indexing or writing of mathematical equations, but no feature MSWord has is "unnecessary".

    The best we can say is that some are little used.

    (I have used virtually every set of commands except "Mailings". I have no idea what they are for.)

  78. Read Weaver said,

    December 5, 2010 @ 6:21 pm

    A suggestion for the inevitable future problems (which someone has already mentioned, but I'm emphasizing). Saving as RTF and then closing and re-opening sometimes fixes inexplicable problems.

  79. Azimuth said,

    December 5, 2010 @ 9:18 pm

    When I engage a setting with nothing selected, I expect the change to be made document-wide. This often does not happen, which is a major reason I dislike word processors.

  80. a different Steve said,

    December 6, 2010 @ 6:36 am

    It looks like you've solved the problem, but if similar problems occur in future it's worth keeping Open Office in mind. I'm not suggesting abandoning Word in favour of Open Office: although you probably could, there is always the chance that OO's Word emulation might not be absolutely perfect when you need it to be, and therefore there's a small chance you could end up with a "word format" doc that won't open in Word. However, as this thread demonstrates, there's also a small chance that Word will create "word format" docs that don't open properly in Word.

    What I would say is that I've been able to rescue someone's dissertation before when Word refused to open the file properly because of an issue with table definitions, and the backup copy was similarly affected. The file opened perfectly in Open Office, and when saved in word format would then open in Word with no fuss.

  81. Rick S said,

    December 6, 2010 @ 5:16 pm

    Geoff, I suspect the problem is in your template file. Language should default to what is stored in the paragraph style for a new paragraph. The menu language setting would only be the default for newly created styles, but wouldn't affect styles or text already in existence.

    One way to test this easily is to shut down Word, rename Normal.dot (or your custom template) to something else, and restart Word. When it can't find Normal.dot, it recreates it from program defaults, and the language should then come from your computer's default language setting. If that doesn't work, you can always restore the original Normal.dot.

    Or, if your document was somehow created under a custom template (perhaps a French version of Normal.dot) but uses only built-in styles, you can switch it to an English Normal.dot, select everything, and reapply the styles. In any case, I think you'll have to edit the spaces and guillemets, but future documents should then be all-English.

  82. Rick S said,

    December 6, 2010 @ 5:22 pm

    Oh, one more thing. Word styles form a hierarchy, with the "Normal" style as the root. Any style element not overridden in a descendent inherits from its parent, so if your Normal style specifies French, chances are all your styles are inheriting from it. Try checking the Normal style first.
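    The inheritance rule described in this comment can be sketched as a short lookup. This is a toy model in Python, not Word's actual implementation: the style names, the `based_on` and `language` keys, and the function `resolve_language` are all hypothetical, chosen only to show how a French setting on the root style would reach every style that does not override it.

```python
def resolve_language(style_name, styles):
    """Walk up the 'based on' chain until some style sets a language."""
    seen = set()  # guards against an accidental cycle in the toy data
    while style_name is not None and style_name not in seen:
        seen.add(style_name)
        style = styles[style_name]
        if style.get("language") is not None:
            return style["language"]
        style_name = style.get("based_on")
    return None


# Hypothetical style sheet: only "Quote" overrides the root's language.
styles = {
    "Normal": {"based_on": None, "language": "fr-FR"},
    "Body Text": {"based_on": "Normal"},
    "Address": {"based_on": "Body Text"},
    "Quote": {"based_on": "Normal", "language": "en-US"},
}

for name in styles:
    print(name, "->", resolve_language(name, styles))
```

    In this model, changing the language on "Normal" alone changes the result for "Body Text" and "Address" as well, which is why checking the root style first is the cheapest test.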

  83. Jeffrey Westall said,

    December 6, 2010 @ 8:20 pm

    I'm glad someone has already pointed out the solution. I first ran into this problem 15 years ago when I was a lab monitor. Back then, there was actually a language "do not spell check", and a student had somehow set the language of her text to this "language", and when she tried to spell check the document, Word would skip to the end and offer the cryptic comment that "Text marked do not spell check skipped." It took me about 15 minutes to figure out how to fix it, and I was equally dumbfounded.

  84. Aaron Davies said,

    December 7, 2010 @ 2:56 am

    my instinct is to avoid microsoft products when possible and make myself their master when not. i've learned though that not many people have the patience or inclination to learn their software thoroughly enough to fully control anything from redmond, so i tend to offer the simplest possible solutions.

    i do the vast majority of my "writing" (coding, note-taking, and very basic record-keeping) in pure plain text. if i had to write any significant quantity of formatted text, i'd probably learn some TeX system (or possibly one of the XML authoring systems, like DocBook), and find and/or create a decent xtodoc or xtopdf workflow for dealing with people who couldn't handle my preferred formats.

  85. Jon said,

    December 7, 2010 @ 6:53 am

    Geoff, you waste your time and risk your sanity trying to get to the root of every problem. Those of us who have to work in an organisation where Microsoft Office is mandatory learn a few workarounds, like the rtf trick, that work on many different kinds of problem. It takes a few seconds, and preserves most kinds of formatting.
    Yes, we bellyache to our colleagues when a program does something particularly bizarre, but we'd rather get a quick effective fix than understand the causes. There will be a different problem tomorrow, with a different cause, but the same workaround will fix it.

  86. Tim said,

    December 7, 2010 @ 10:19 pm

    I get an odd bug myself: when typing in Chinese, Word often mismarks it as Hindi… No clue why, but it'll reset for me. Sorry I can't be of help, just sympathy here.

  87. Ben Hemmens said,

    December 8, 2010 @ 2:57 pm

    Commiserations on Word. We all suffer from it. Soon, I like to think, it will be gone.

    Here's a non-solution that works quite well: write the damn thing in Pages and export it as a .doc. Pages is really quite nice once you get into the logic of its little "inspectors"; a similar enough interface to Adobe PhotoShop, InDesign, etc., just less complicated.

    Unless you're dealing with serious Word freaks who need some of the "fancier" functions such as Tables of contents, tracking changes (my customers all want that and get fed up when I say "I've changed things everywhere"), fields, graphics etc., this should be fine.

  88. Keith said,

    December 9, 2010 @ 4:03 pm

    I fixed this in about five seconds flat, but then I've been fighting Word for about the last twenty years…

    Open the file, press CTRL and A to select all.

    Press SHIFT and F1 (or select the Format menu, then the Reveal Formatting option).

    A box appears, in which you will see in the Font section two entries for Language.

    Select the language (by clicking on the underlined blue word Language) to open the list of languages. Select English (U.K.).

    K.
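    For anyone who would rather look at the language marks than trust a dialog box, the steps above have a rough equivalent outside Word. The sketch below is in Python and assumes the file is saved as .docx, which is a zip archive whose main text lives in `word/document.xml`, with direct language marks stored as `w:lang` elements carrying a `w:val` attribute. The function names and the sample XML are hypothetical, and the count covers only marks in the main document part; languages set through styles are kept in `word/styles.xml` and are not counted here.

```python
import re
import zipfile
from collections import Counter

# Matches the w:val attribute of a w:lang element, e.g. <w:lang w:val="fr-FR"/>.
LANG_TAG = re.compile(r'<w:lang\b[^>]*\bw:val="([^"]+)"')


def count_language_tags(document_xml):
    """Count how often each language code is applied directly to text."""
    return Counter(LANG_TAG.findall(document_xml))


def count_language_tags_in_docx(path):
    """Same count, read from the main document part of a .docx file."""
    with zipfile.ZipFile(path) as archive:
        xml = archive.read("word/document.xml").decode("utf-8")
    return count_language_tags(xml)


# Hypothetical fragment standing in for a real document.xml.
sample_xml = (
    '<w:r><w:rPr><w:lang w:val="fr-FR"/></w:rPr><w:t>Dear</w:t></w:r>'
    '<w:r><w:rPr><w:lang w:val="en-US"/></w:rPr><w:t>Sir</w:t></w:r>'
    '<w:r><w:rPr><w:lang w:val="fr-FR" w:eastAsia="ja-JP"/></w:rPr></w:r>'
)
print(count_language_tags(sample_xml))
```

    A letter that should be entirely English but reports a large count for fr-FR has the problem discussed in this thread; fixing it is still best done inside Word with the whole text selected.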

  89. Qov said,

    December 9, 2010 @ 7:58 pm

    As for how it happened, I imagine you pasted text, or accidentally hit ctrl-V when you were reaching for shift-V, and the last thing in your paste buffer happened to be from a document by someone whose default language was French. A document of mine once decided it was in Swedish, after I had cut and pasted English text from an e-mail sent from Sweden.

    Software language handling still makes assumptions it shouldn't, such as when I'm travelling and open my own laptop, replete with language preference settings, and Google, Blogger and Facebook all come up in German or Cantonese. Even at home, I frequently get served pages in Tagalog because its language code is tl, and I requested tlh.
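    The tl/tlh mix-up is consistent with a server comparing language codes by shared prefix rather than by whole subtags, though that is a guess; nothing here is known about how any particular site does its matching. A minimal Python sketch of the difference, with hypothetical function names and a made-up list of available languages:

```python
def naive_match(requested, available):
    """Buggy: treats any available code that prefixes the request as a match."""
    for tag in available:
        if requested.lower().startswith(tag.lower()):
            return tag
    return None


def subtag_match(requested, available):
    """Compare whole hyphen-separated subtags, truncating the request
    one subtag at a time (a simplified lookup, not a full implementation
    of any standard)."""
    by_lower = {tag.lower(): tag for tag in available}
    parts = requested.lower().split("-")
    while parts:
        candidate = "-".join(parts)
        if candidate in by_lower:
            return by_lower[candidate]
        parts.pop()
    return None


available = ["en", "tl"]  # hypothetical set of languages a site can serve
print(naive_match("tlh", available))   # the prefix test wrongly picks "tl"
print(subtag_match("tlh", available))  # no whole-subtag match, so None
```

    With whole-subtag comparison a request for en-GB still falls back to en, but tlh no longer collapses into tl, and the site can fall back to its default language instead.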
