Having it both ways

"Data is" or "data are"? "That data" or "those data"? Michael Calia, "EBay asks users to change passwords after cyberattack", WSJ 5/21/2014 [emphasis added]:

EBay Inc. on Wednesday asked the nearly 150 million active users of its namesake marketplace to change their passwords following a cyberattack that compromised a database containing encrypted passwords and other data.  

The database didn't include financial data, said the company, which owns its namesake marketplace business as well as online payments operation PayPal.  

The company said it had no evidence that personal or financial information for PayPal users was compromised. That data are stored on a separate, secure network.

The obligatory screenshot is here.

[Tip of the hat to Will Leben]


  1. Brett said,

    May 21, 2014 @ 11:59 am

    Looks like a copy editor changed "data is" to "data are" without fixing the rest.

    However, this usage doesn't sound that odd to me. I guess I'm just used to "data" behaving funny with respect to its number.

  2. Cass said,

    May 21, 2014 @ 12:19 pm

    "Data are" always sounds wrong to me. It's my belief that if more people understood the distinction between count nouns and mass nouns, this whole business of data as a supposed plural never would have gotten started. No one ever talks about a datum. This is a failure of teaching English grammar.

  3. Eric P Smith said,

    May 21, 2014 @ 12:43 pm

    I guess I'm stuffier than Brett, for “That data are” strikes me as absurd. Choose your number and stick to it. “That data is”, or “Those data are”.

  4. Nik Berry said,

    May 21, 2014 @ 1:00 pm

    @Cass: "No one ever talks about a datum." 105 million Google hits says they do.

  5. Ben Zimmer said,

    May 21, 2014 @ 1:10 pm

    A nice observation of the data/datum distinction: "When 'data journalism' becomes 'datum journalism'."

  6. J. W. Brewer said,

    May 21, 2014 @ 1:23 pm

    The google n-gram viewer (using the corpus that goes through 2008) shows "these data" as still more common than "this data," but "that data" as more common than "those data." My only theory as to why the "number" (or count v. mass) treatment should play out differently for those seemingly parallel minimal pairs is that the sequence "that data" can also occur for entirely different uses of "that," as seen e.g. in phrases like "the probability that data are missing," making the total number of "that data" hits an apples/oranges comparison to the "those data" hits.

  7. tapsa said,

    May 21, 2014 @ 1:25 pm

    So, if data should be prescriptively analyzed as a mass noun (or just analogous to 'media'), how about band names? Sure, The Beatles is nominally plural, but it's still a proper name referring to a singular entity (a group of people). 'Metallica are a thrash metal band' just sounds weird to my ESL ears. Most theatrical troupes and classical orchestras seem to require singular agreement.

    I'd still argue that 'data' and 'Metallica' are aggregate nouns similar to 'people' and 'bananas' since in both you can pick out individuals – rock bands are easily perceived as a collection of individuals since rock music puts so much emphasis on individual achievement unlike orchestras and theatre troupes which are more collectivistic, so to say. And data are a confederation of separate data points.

  8. tapsa said,

    May 21, 2014 @ 1:28 pm

    Oh, seems that media is a plural noun in English. The way people use "mainstream media" as a singular must've fooled me.

  9. Jonathon Owen said,

    May 21, 2014 @ 1:47 pm

    "No one ever talks about a datum." 105 million Google hits says they do.

    I'm not sure a Google search is the most useful source in this case; most of the hits are for products or companies named Datum or discussions of the word.

    Most of the actual uses of datum are in a specialized sense and are not the singular of data. The regular count noun datum (plural datums) is a technical term in surveying and related fields.

    Here's my own stab at the data is/data are problem. In a nutshell, data fails some basic tests for count nouns, and it seems that its use as a plural is mostly artificial.

  10. Levantine said,

    May 21, 2014 @ 2:01 pm

    tapsa, in British English at least, I think that "Metallica are a thrash metal band" would be far more usual than the alternative. And I can't imagine that any variety of English would favour "The Beatles is".

  11. Chris Waters said,

    May 21, 2014 @ 2:18 pm

    The feeling I get is that data is a mass noun that can be its own plural. E.g. "We received data from [source]. That data is [blah blah]. We also received data from several other sources. Those data are [blah blah]." Nothing in that series of sentences feels wrong to me, although the second part is a bit unusual.

    Thus even though I might use data in the plural in certain circumstances, that doesn't mean I think it's the correct (or even preferred) way to use the word.

  12. Chris Waters said,

    May 21, 2014 @ 2:21 pm

    ETA: of course, "that data are…" feels wrong no matter how I try to parse it.

  13. Andrew Bay said,

    May 21, 2014 @ 2:31 pm

    Similar to the band names thing, TV Shows:
    I always have trouble with "'Ruby and Max' is coming next on Nick Jr." I know that this is technically correct, but is it screwing up some child's general sense of plural/singular grammar?


  14. J. W. Brewer said,

    May 21, 2014 @ 3:23 pm

    I think Levantine's point suggests, separate and apart from BrEng/AmEng distinctions, an asymmetry. In AmEng morphologically singular band names can take either singular or plural agreement depending on whether in context the band is being conceptualized as a unit or as a bunch of guys. Morphologically plural band names are hard to match with singular verbs if (but only if) the name is conventionally arthrous. Anarthrous plurals (Talking Heads might be one good example – compounds like Hunters and Collectors or otherwise complex NPs like Birdsongs of the Mesozoic would also fall into this category) can, however, take singular verbs.

    There's a separate issue with pronouns, but perhaps it's animacy-driven and using "they/them" rather than "it" for a band you would use a singular noun with is just a particular applied instance of one of the factors underlying "singular they."

  15. eye5600 said,

    May 21, 2014 @ 4:01 pm

    My father had a grad school prof who was a stickler for the use of "data" as a plural. This led to the following exchange at a dept softball game:

    Prof: What's the count?
    Father: The datum is one ball!

  16. Milan said,

    May 21, 2014 @ 4:08 pm

    Nik Berry, Jonathon Owen:
    The word "datum" or "Datum" means date in many European languages, including German. Probably most of the Google results aren't even hits for the actual English word 'datum'.

  17. MattF said,

    May 21, 2014 @ 4:53 pm

    Note that 'datum,' as used in geodesy, is not the singular of the generic term 'data'– in fact, the plural of geodesic 'datum' is 'datums'.

  18. Eric P Smith said,

    May 21, 2014 @ 7:50 pm

    The actual number of Google hits for the phrase "a datum" is 420. So it's not very common.

  19. Dr. Decay said,

    May 22, 2014 @ 2:06 am

    I often have discussions with coauthors about this word. My strategy now is to ask them whether they would say 'We don't have many data.' or 'We don't have much data.' I feel very clever when asking this, but I don't always carry my point.

  20. richardelguru said,

    May 22, 2014 @ 5:57 am

    Remember Hitchcock…
    "The Birds is coming"

  21. Tom Ace said,

    May 22, 2014 @ 11:15 am

    I think "fewer data" sounds weird, and the only times I've said it have been for comic effect. The AMA Manual of Style, on the other hand, uses it as an example of good style:

    reported fewer data (not: reported less data)

  22. Craig S said,

    May 22, 2014 @ 11:52 am

    I realize this probably isn't what really happened, but my own first instinct was to parse this as an incomplete sentence fragment. Instead of "That [particular] data are…", my brain imposed "[The company said] that [the] data are…"

  23. J. W. Brewer said,

    May 22, 2014 @ 3:02 pm

    Craig S.: further to my comment earlier, if you google for instances of "that data are" you will find plenty of perfectly grammatical ones like what you hypothesized (i.e. where "that" is being used to set up a subordinate clause rather than as a demonstrative), and it is not implausible to think that the quite weird-feeling example here could have started as part of such a construction and then been sloppily edited in a way that made a grammatical reading of the string difficult-to-impossible.

  24. Steve said,

    May 22, 2014 @ 3:19 pm

    My initial impulse was to parse "that data are stored…" as a dependent clause. So, I expected "that data are stored on a separate, secure network" to be followed with something like "is a darn good thing." I'm not entirely sure why I had that impulse: I say "That data is" regardless of whether "that" is referencing a particular [sub]set of data or is marking the clause as dependent. (That data is on a secure network & That data is vulnerable to hacking attempts is a sad fact of life.) All I can figure is that "that data are" felt so wrong that my poor monkey brain wasn't sure how to process it, and "that data are[NP]" [is X] was the only thing it could come up with as even a plausible possibility.

    Also, perhaps, unconsciously, I was influenced by the constructions used with nouns like "fish" or "deer": I would say "That fish is small" but "That deer are anything but delicious does little to dissuade hunters." That doesn't really explain my impulse here (since, as noted, I don't say "That data are…" in any context) but perhaps those examples, while not directly on point, were still the closest I could find to an analogous usage.

  25. Steve said,

    May 22, 2014 @ 3:25 pm

    I started my comment before JWB's comment was posted.

    Also, I belatedly realized that, to a "data are" user, something like "That data are [X] [is Y]" would be perfectly cromulent. Perhaps I knew that unconsciously, and that is what drove my misread. (Meaning there is no need to bring fish or deer into it.)

  26. Mike Briggs said,

    May 24, 2014 @ 2:01 pm

    Just wondering what the criteria is for insisting that data is singular.
