Finger spoonerisms and conservation of caps

Jennifer Ouellette, "The Higgs Boson May Have Five Faces", Discovery News, 6/21/2010:

And now the team is back with even more intriguing results to announce from their subsequent analysis, published on arVix.

The link will take you to Dobrescu, Fox, and Martin, "CP violation in B_s mixing from heavy Higgs exchange", arXiv:1006.4238. And the arXiv, as Wikipedia explains, is "pronounced 'archive', as if the 'X' were the Greek letter Chi, χ", and

was originally developed by Paul Ginsparg and started in 1991 as a repository for preprints in physics and later expanded to include astronomy, mathematics, computer science, nonlinear science, quantitative biology and, most recently, statistics. [...]

It was originally hosted at the Los Alamos National Laboratory (at, hence its former name, the LANL preprint archive) and is now hosted and operated by Cornell University, with mirrors around the world. [...]

Its existence was one of the precipitating factors that led to the current revolution in scientific publishing, known as the open access movement, with the possibility of the eventual replacement of traditional scientific journals.

Jennifer Ouellette may be "a recovering English major", but as associate editor of APS News, with writing credits from The Industrial Physicist magazine as well as Discover, I'm sure that she knows perfectly well what the arXiv is, and how it's pronounced and spelled.

The typo "arVix" for "arXiv" interchanges the two non-adjacent consonants 'X' and 'v', while maintaining the pattern of capitalization independent of the letters involved.
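
To make the distinction concrete, here is a minimal sketch of the two ways a letter swap could go. The function names and the index arguments are hypothetical, chosen only for illustration; this is not a model of how typing errors actually arise. A naive swap carries each letter's case along with it, while the case-conserving swap exchanges the letters but leaves the capitalization pattern attached to the positions.

```python
def swap_naive(word: str, i: int, j: int) -> str:
    """Swap the characters at positions i and j, case and all."""
    letters = list(word)
    letters[i], letters[j] = letters[j], letters[i]
    return "".join(letters)


def swap_keep_caps(word: str, i: int, j: int) -> str:
    """Swap the letters at positions i and j, but keep the original
    capitalization pattern tied to the positions, not the letters."""
    swapped = swap_naive(word, i, j)
    # Re-apply the case of each original position to whatever letter lands there.
    return "".join(
        ch.upper() if orig.isupper() else ch.lower()
        for ch, orig in zip(swapped, word)
    )


print(swap_naive("arXiv", 2, 4))      # arviX
print(swap_keep_caps("arXiv", 2, 4))  # arVix
```

Only the second version reproduces the attested "arVix"; the naive swap would have given "arviX".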

I've written before about the conservation of patterns of orthographic gemination (also here, here, here). But I don't think that we've previously noted a similar conservation of capitalization. Perhaps this is because capitals are usually found either initially (where letter metathesis is unlikely to go unnoticed) or throughout a word (where it wouldn't matter), or quasi-randomly arranged in the case of StudlyCaps (where it also wouldn't matter).

Here's the obligatory screenshot:

[For those who care about the science, the back-up evidence for this multiplication of Higgses seems to have been reduced from a 2-sigma event to a 0.8-sigma event.

And of course, no linguistic discussion of the arXiv should omit mention of the snarXiv, where particle physicists are replaced by a stochastic context-free grammar.]



  1. sarang said,

    June 23, 2010 @ 1:32 pm

    Bit of a solecism to say something is "published on arXiv" anyway. It's _preprints_ and they're _posted_ (or "put") on _the_ arXiv.

  2. mgh said,

    June 23, 2010 @ 1:41 pm

    she may have had interference from the roman numerals V and X — possibly enhanced by 5 being an important number in her story?

  3. Mark P said,

    June 23, 2010 @ 1:46 pm

    It's an interesting slip. The letters use different fingers on the keyboard, so it's not just a finger misplacement. I often mistype x and z, probably because they are used so infrequently that my fingers are not used to typing them, and the z requires the weakest finger I have. But that doesn't seem like the reason for this slip. The x and v are similar-looking letters (in fact, nearly identical in some sense), but surely that's not the reason. Anticipation?

  4. xfi said,

    June 23, 2010 @ 1:49 pm

    This could be treated very neatly in Autosegmental Orthography.

  5. Colin Reid said,

    June 23, 2010 @ 1:50 pm

    I don't have a source on this, but it seems quite likely that the 'X' in 'arXiv' is inspired by the 'X' in the name of the preferred typesetting format 'LaTeX', where the 'X' is likewise supposed to be a chi (and pronounced as /x/ or /k/ rather than /ks/). (The 'T', on the other hand, is an instance of CamelCase, because an earlier incarnation was called 'TeX'.)

  6. Brett said,

    June 23, 2010 @ 2:13 pm

    @Colin Reid— The X in "arXiv" was indeed partly inspired by the one in "LaTeX." It's an obvious inference to anyone who works with both, but there are more details about where the X came from on their What's New list from December 1998.

    As somebody who regularly uses both and, I find I have developed slightly different pronunciations for them. However, the differences would probably not be noticeable to anyone except myself (and perhaps my wife).

  7. Chris said,

    June 23, 2010 @ 2:33 pm

    I have heard the arXiv referred to as both "the arXiv" and merely "arXiv" (without a preceding "the"). I tend to say "the arXiv", and that's what seems to be most common (at least in my conversations with other mathematicians), but if you go to their main page, you will see they omit the article:

    arXiv is an e-print service in the fields of physics, mathematics, non-linear science, computer science, quantitative biology, quantitative finance and statistics. Submissions to arXiv must conform to Cornell University academic standards. arXiv is owned and operated by Cornell University, a private not-for-profit educational institution. arXiv is funded by Cornell University Library and by supporting user institutions.

  8. fs said,

    June 23, 2010 @ 2:45 pm

    @sarang: nevertheless, the preprints on arXiv are made publicly available to all, and could thus be considered "published". The requirement for a document needing to have been physically printed in a dead-tree journal before it can be considered truly "published" is becoming somewhat irrelevant these days, I think.

  9. fs said,

    June 23, 2010 @ 2:47 pm

    sorry, extra "needing" slipped in there.

  10. Jonathan Badger said,

    June 23, 2010 @ 3:32 pm

    Well, in biology, we have PLoS One, which like arXiv is on-line only and freely available. The difference is that unlike arXiv, the papers in PLoS One are peer reviewed. That latter bit is hardly irrelevant; while dropping the dead-trees may simply be a change in publishing technology (much like is happening with ebooks), without peer review something can't really be said to be published.

  11. sarang said,

    June 23, 2010 @ 3:44 pm

    I was just making a usage point: I've never heard anyone describe their preprints as having been "published" on the arxiv. Google search results aren't very conclusive on "arxiv" vs. "the arxiv" but in my experience physicists almost always use the article.

  12. sarang said,

    June 23, 2010 @ 3:47 pm

    FWIW the snarxiv takes a definite article.

  13. spv said,

    June 23, 2010 @ 3:49 pm

    Every time I see 'arXiv' I subconsciously want to read it as 'arvix' because that's the obvious, English-pronounceable ordering of those letters, especially if you don't know the history of that X; whereas in LaTeX it doesn't matter if you pronounce the 'X' as /ks/ or /k/ because both sound acceptable in context.

    I'm sure there's been research on this, but I assume it is human nature to want to make everything pronounceable, which relates to the seemingly inexplicable acronyms the US government comes up with for various bills and programs (e.g., why do we shorten 'North American Aerospace Defense Command' into NORAD, instead of NAADC?)

  14. Mark P said,

    June 23, 2010 @ 4:20 pm

    spv, people in the military always try to find acronyms that sound good. I think they have an Army command for the purpose of finding acronyms and changing them once everyone gets familiar with them.

  15. David Deterding said,

    June 23, 2010 @ 6:35 pm

    If you look at the previous article on 'Conservation of (Orthographic) Gemination' written by Mark Liberman on March 29, 2004 (following the link near the end of the current posting), you will find that the second paragraph ends with a mention of 'conversation of gemination'.

    This is an interesting instance of metathesis, especially given the topic under discussion. Or perhaps it was a deliberate little joke.

    [(myl) Clearly it was a joke, though on whom is less certain.]

  16. Jerry Friedman said,

    June 23, 2010 @ 11:20 pm

    I'd just like to report that today I produced my best typo in a long time, maybe ever: "I'll though" for "although". (Like a lot of Americans, I pronounce I'll much like the first syllable of although, in casual speech.)

  17. Jon Foote said,

    June 24, 2010 @ 12:22 am

    OK, here's a far-fetched hypothesis: it's the Cupertino effect from a spellchecker that uses bigrams or similar statistics to correct words not found in a dictionary. It might preserve the capitalization, rendering that moot, but the bigrams "RV", "VI" and "IX" seem much more probable than "RX", "XI", and "IV". Keyboard adjacency could be another factor for replacing the X with V.

  18. Army1987 said,

    June 25, 2010 @ 6:36 am

    One of the typos I make most often in English is "If i" for "if I".

  19. JimG said,

    June 26, 2010 @ 9:18 pm

    Semi-digression: Fifty-odd years ago, NORAD was the Northern Air Defense Force, established to protect us all from Russian bombers flying over the polar regions. Then the Eastern and Western forces were melded with it. I don't recall ever hearing or seeing NADF, even though EADF and WADF were used (by spelling them out.) "Air" was later changed to "aerospace" to include the missile threat as well as that from bomber planes. I *THINK* the unification of the forces came when SAGE, the Semi-Automatic Ground Environment, was built at the end of the 1950s, first computerizing the air defense system — a SAGE computer was a large, windowless concrete building with whole floors of racks of vacuum tubes, each with much less processing power than your computer of today.

  20. ignoramus said,

    June 27, 2010 @ 3:08 pm

    arXiv, I tort it meant R x 4.

    As the Angles like words of 1 sound, and the Saxons [Germanic] like to see how many sounds they can roll together before coming up for breather, we compromise and call for reducing all to one meaningful breath and use mnemonic

  21. Mariana said,

    July 31, 2010 @ 11:16 am

    If you scroll all the way down on the comments here, someone responds to a comment by NiamhAgain addressing her as MianhAgain. I'm assuming that this is the Irish name pronounced /niv/ and the responding commenter isn't familiar with it.

    The post and especially comments are quite funny and worth reading as well.

  22. Mariana said,

    July 31, 2010 @ 11:17 am

    It would help for me to post the link, wouldn't it?!

