Data


Today's PhD Comics:

Interesting that we haven't seen "datums", like "spectrums" and so on.



30 Comments

  1. Matthew Wright said,

    August 10, 2015 @ 6:47 am

    Sometime last century I saw an earnest American economist on Channel 4 News describe something or other as an important indice of something else.

  2. Dick Margulis said,

    August 10, 2015 @ 6:49 am

    Um—or is that ums?—datums is the accepted plural in surveying and related fields:

    https://www.ahdictionary.com/word/search.html?q=datum

  3. Jonathan Badger said,

    August 10, 2015 @ 7:19 am

    My graduate advisor always liked to say "datums", but I suspect it was meant in jest.

  4. Rodger C said,

    August 10, 2015 @ 7:45 am

    Snooka.

  5. Acilius said,

    August 10, 2015 @ 8:50 am

    Reading the strip panel by panel, I wondered what the "deep philosophical question" would be. My guess was that the question would be about the role of etymological information in the process of deciding which of various constructions in current use would fit best in a particular context. How exactly you get from that stylistic process to a "deep philosophical question" about the nature of language in four panels and still have room for a punchline isn't clear to me, but hey, PhD Comics is a big enough deal that I assume Jorge Cham can pull it off.

    Instead we get this claim that "It depends on whether you consider data to be facts (plural) or information (which is singular.)" To which the only appropriate response is: No, it doesn't! English speakers treat the words "scissors" and "trousers" as grammatical plurals, from which it does not at all follow that we "consider" the things they name to be in any sense multiple. It is all too similar to today's xkcd, which you reproduce in today's other post, except that relatively few of the people who like to say "There is no 'I' in team" seriously believe that they are raising a "deep philosophical question."

  6. Jonathon Owen said,

    August 10, 2015 @ 10:07 am

    The best way to determine whether it's singular or plural, of course, is not to reflect on the philosophy of meaning but to look at the data. And it's pretty clear that data doesn't behave like a normal plural—it has no real singular form (notice that the comic uses "data point" rather than "datum"), and it's not actually countable (nobody ever says "as these 317 data show").

    I go over some of the arguments for and against it being plural in more detail here and here.

  7. John Roth said,

    August 10, 2015 @ 10:40 am

    It does depend on how you regard it. "How much sand do we have?" can either be mass (6 tons) or count (586 grains). (*) This is only one example of many common nouns which we measure in some contexts and count in others.

    (*) These are not intended to represent the same quantity.

  8. Linda said,

    August 10, 2015 @ 11:04 am

    And then you have datum not being the singular of data, but a fixed reference point.

  9. J. W. Brewer said,

    August 10, 2015 @ 12:31 pm

    Yeah, saying "data point / datapoint" rather than "datum" is probably by itself something of a "tell" that the speaker/writer tends toward a mass-noun rather than count-noun understanding of "data." Which somewhat spoils the joke.

  10. Ken Miner said,

    August 10, 2015 @ 12:45 pm

    @John Roth And then there are some that it seems we can neither measure nor count, e.g., beans. Both "How much beans do you want?" and "How many beans do you want?" sound wrong, at least to me. (And the classic problem of "troops" is endlessly fascinating.)

    I disagree with the fourth panel: there are no bigger problems than grammar…

  11. MikeM said,

    August 10, 2015 @ 1:54 pm

    A few observations:

    Like garbage, data is a collected noun.
    And Acilius, although there's no I in "team," there is a "me," but it's backwards.

  12. Alex said,

    August 10, 2015 @ 2:46 pm

    @Dick Margulis Yes, "datums" is used in surveying and GIS. But "datums" is used only for fixed reference points and reference models. It has created a useful distinction– "Send me all your data. Which of the datums did you use?"

  13. Guy said,

    August 10, 2015 @ 3:38 pm

    It seems to me the claim that this raises philosophical issues is a clear example of confusing syntax with semantics. Since furniture is a non-count noun, does this have philosophical repercussions regarding the ontological status of my couch vis-à-vis my coffee table? I think not. I demand that my comics be free of conceptual confusion like this. This is serious business.

  14. Neil Dolinger said,

    August 10, 2015 @ 4:52 pm

    Rodger C. kinda beat me to the punch(line), but if Data from Star Trek ever got into a romantic relationship, his friend might start calling him "Datums".

  15. Keith said,

    August 10, 2015 @ 5:03 pm

    Ken, both "How much beans do you want?" and "How many beans do you want?" sound wrong

    If you remember the story of Jack and the beanstalk, it's quite grammatical to talk about a countable number of beans. In general, though, I'd talk about a certain weight of beans, such as "three pounds of beans" or "a kilo and a half of beans".

  16. maidhc said,

    August 10, 2015 @ 5:19 pm

    "Media" has a similar usage as a mass noun.

    However, you never hear people say things like "When I tried to contact my dead uncle, I had to visit ten media before I found one who could get through."

  17. Gene Callahan said,

    August 10, 2015 @ 5:27 pm

    The deep philosophical question is, "Will PhD comics ever have a funny strip?"

    Signs say "No."

  18. Gregory Kusnick said,

    August 10, 2015 @ 5:48 pm

    Acilius: "English speakers treat the words 'scissors' and 'trousers' as grammatical plurals"

    I'll give you "trousers", but in my idiolect, "Hand me that scissors" is a perfectly valid construction.

  19. Chris C. said,

    August 10, 2015 @ 6:10 pm

    @Jonathon Owen — I have to disagree on one point. Data are, as far as I can tell, always countable. When used in the singular we're typically speaking of masses of digital data, after all, which is countable by its nature. The distinction is whether or not one typically bothers to count them. If you don't, the data is; otherwise the data are.

  20. Acilius said,

    August 10, 2015 @ 7:20 pm

    @Gregory Kusnick: I don't have any datums, but I suspect you're in the minority with regard to "scissors."

  21. Guy said,

    August 10, 2015 @ 7:32 pm

    @Chris C.

    The word "data" is not syntactically countable, just like "furniture" and "cattle" are not countable. The fact that the referent of a word can be divided into discrete units that are "countable" in a different sense of the word "countable" is not relevant. Of course, the fact that a word is not countable does not mean that it is necessarily singular. "Clothes" is an example of a plural non-count noun.

  22. J. W. Brewer said,

    August 10, 2015 @ 7:55 pm

    I'm not sure that syntactically-plural data are unproblematically countable as a real-world matter, but the same is true of "facts," and that's unambiguously a count noun. That's not because of some difference between count nouns and mass nouns but because of the nature of the referent – two observers ideally (even if it's a hassle in practice) ought to come to the same answer if you ask them how many potato chips (count noun) or grains of rice (mass noun subdivided into countable sub-instances) are on a given plate, but a question like "exactly how many facts are in that textbook" has no single or stable answer. It's not an objective dividing-nature-at-the-joints kind of analysis, because it depends inter alia on level of generality, and which level of generality is sensible depends on the context and the inquirer's purposes. If you are used to electronic data maintained via database software you may think you know exactly how many datapoints a given dataset has, but if so you are perhaps just confusing the tidy-seeming structure of your software (i.e. it may be designed to let you sort or search along some dimensions but not among others even though you could easily imagine different software that would permit such sorting/searching) with the more elusive reality it is trying to capture. (Simple example for a database of human beings, is "address" one datapoint or a composite of multiple datapoints and if so how many. Same for "date of birth.")

    But moving from metaphysics to actual language usage, I think there may be a valid distinction implied upthread insofar as you sometimes hear people saying things like "we have three or four separate datapoints supporting this inference," but saying "we have three or four separate data (= "datums")" sounds highly unidiomatic, and possibly ungrammatical, to my native-speaker ear (although perhaps other ears have other intuitions).

  23. Viseguy said,

    August 10, 2015 @ 8:33 pm

    And remember, if there's only one item on it, it's an agendum.

    H/t to Yes, Prime Minister!

  24. Jerry Friedman said,

    August 10, 2015 @ 11:38 pm

    maidhc: For that matter, no one ever says "We need three larges, two media, and two smalls."

  25. Gregory Kusnick said,

    August 10, 2015 @ 11:47 pm

    On the other hand, if the dead uncle had accounts on Facebook, Twitter, and Instagram, you might hear "I had to try three medias to get through to him."

  26. Robert Coren said,

    August 11, 2015 @ 9:57 am

    @John Roth: No matter whether you count the sand in grains or in tons, you're not going to treat the word "sand" as a plural.

    Presumably what happened with "data" is that English-speakers in general are not inclined to regard words that don't end in "s" as plural¹; it has largely also happened to "bacteria", and "phenomena" is not far behind.

    ¹With the obvious exception of common — and I emphasize "common" — nouns that have "strong" or invariant plurals: men, children, sheep, etc.

  27. Mike M said,

    August 11, 2015 @ 10:14 am

    It rather irks me when people insist on using data as a plural. It just sounds forced and so unnatural.

  28. Jonathon Owen said,

    August 11, 2015 @ 11:10 am

    @Chris C.:

    Guy already basically said what I would have said. Just because you can count data points doesn't mean that the word data is a plural count noun. I can count grains of rice, too, but this doesn't make rice a plural count noun. The fact remains that data fails some important tests for countability in the grammatical sense.

    And even if we grant that it can be countable in some contexts, the fact is that no one does. Writing "the data show" doesn't show that it's countable. All signs point to such grammatical agreement being an artifact of the prescriptive rule that data should be treated as a plural.

  29. Derwin McGeary said,

    August 11, 2015 @ 7:21 pm

    I think that plural-data is a sort of shibboleth for people who have some formal training in working with them/it/data*. I certainly picked it up at university.

    * Quick test: "Once I've processed the data you can have a look at it/them". (I'm on team 'it' although if I was using the noun I would treat it as plural.)

  30. JohnG said,

    August 13, 2015 @ 2:21 am

    @Derwin McGeary:

    You didn't have to declare what team you were on; it was obvious. If you were on team 'them', your quick test would have begun with "Once I've processed these data…".

    And yes, totally a shibboleth.
