The long tail of religious studies?

Google Books isn't the only outfit that sometimes has trouble with metadata. I happened to notice this morning that Oxford University Press has classified Herbert A. Simon's "On a class of skew distribution functions" (Biometrika 43:425-440, 1955) as "Religious Studies..Death":



I can usually see why an automatic classification algorithm went off the rails, but in this case, I'm somewhat at a loss. Here's Simon's abstract:

It is the purpose of this paper to analyse a class of distribution functions that appears in a wide range of empirical data – particularly data describing sociological, biological and economic phenomena. Its appearance is so frequent, and the phenomena in which it appears so diverse, that one is led to the conjecture that if these phenomena have any property in common it can only be a similarity in the structure of the underlying probability mechanisms. The empirical distributions to which we shall refer specifically are: (A) distributions of words in prose samples by their frequency of occurrence, (B) distributions of scientists by number of papers published, (C) distributions of cities by population, (D) distributions of incomes by size, and (E) distributions of biological genera by number of species.

No one supposes that there is any connexion between horse-kicks suffered by soldiers in the German army and blood cells on a microscope slide other than that the same urn scheme provides a satisfactory abstract model of both phenomena. It is in the same direction that we shall look for an explanation of the observed close similarities among the five classes of distributions listed above.
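
As a rough illustration of the kind of "urn scheme" at issue, here is a minimal Python sketch of the process usually attributed to Simon's 1955 paper: each new token is a brand-new word with some fixed probability, and otherwise repeats an existing word chosen in proportion to how often it has already occurred. This is a sketch under that textbook description, not a reproduction of Simon's derivation; the function names, the parameter name `alpha`, and the values used are illustrative assumptions, not taken from the paper.

```python
import random
from collections import Counter


def simulate_simon_process(n_tokens, alpha, seed=0):
    """Generate a token stream under a Simon-style urn scheme.

    With probability alpha the next token is a brand-new word; otherwise it
    repeats a token drawn uniformly from the text so far, so each existing
    word is chosen with probability proportional to its current frequency.
    (alpha and the single-word starting state are illustrative assumptions.)
    """
    rng = random.Random(seed)
    tokens = [0]      # start with one occurrence of word 0
    next_word = 1     # label to use for the next new word
    while len(tokens) < n_tokens:
        if rng.random() < alpha:
            tokens.append(next_word)
            next_word += 1
        else:
            tokens.append(rng.choice(tokens))
    return tokens


def frequency_spectrum(tokens):
    """Map k -> number of distinct words that occur exactly k times."""
    word_counts = Counter(tokens)
    return Counter(word_counts.values())


if __name__ == "__main__":
    tokens = simulate_simon_process(n_tokens=50_000, alpha=0.1)
    spectrum = frequency_spectrum(tokens)
    for k in (1, 2, 4, 8, 16):
        print(k, spectrum.get(k, 0))
```

If the textbook account is right, the number of words occurring exactly k times should fall off steeply and smoothly as k grows, which is the long-tailed shape that the five classes of distributions above have in common.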

If the topic were assigned by a human, then I guess it might be a humorous reference to the strange mathematico-theological war between Herb Simon and Benoit Mandelbrot that followed:

Herbert A. Simon "On a class of skew distribution functions", Biometrika 43:425-440, 1955 (link)
Benoit Mandelbrot, "A note on a class of skew distribution functions: Analysis and critique of a paper by H.A. Simon", Information and Control 2(1):90-99, 1959.
Herbert A. Simon, "Some further notes on a class of skew distribution functions", Information and Control 3(1):80-88, 1960 (link)
Benoit Mandelbrot, "Final note on a class of skew distribution functions: Analysis and critique of a model due to H. A. Simon", Information and Control 4(2-3):198-216, 1961
Herbert A. Simon, "Reply to 'Final Note' by Benoit Mandelbrot", Information and Control 4(2-3):217-233, 1961 (link)
Benoit Mandelbrot, "Post scriptum to 'final note'", Information and Control 4(2-3):300-304, 1961
Herbert A. Simon, "Reply to Dr. Mandelbrot's post scriptum", Information and Control, 4(2-3):305-308, 1961 (link)

[Update — I've added links to Simon's side of the exchange, courtesy of CMU's Herbert Simon Collection, as pointed out by Matthias in the comments. Mandelbrot's side remains behind the Information and Control paywall.]

This is one of the most intense, bitter and personalized exchanges that I've ever seen in the scientific or technical literature. And it doesn't fit either of the usual explanations for such debates.

Since the subject matter is mathematics, it seems to contradict Benford's Law of Controversy: "Passion is inversely proportional to the amount of real information available". And since neither Simon nor Mandelbrot represented a grouping or sect, whether mathematical or social, there's no evidence here for Freud's "Narzißmus der kleinen Differenzen" (Chapter 5 of Das Unbehagen in der Kultur, translated as Civilization and its Discontents, 1930):

… it is precisely communities with adjoining territories, and related to each other in other ways as well, who are engaged in constant feuds and in ridiculing each other — like the Spaniards and Portuguese, for instance, the North Germans and South Germans, the English and Scotch, and so on. I gave this phenomenon the name of 'the narcissism of minor differences', a name which does not do much to explain it. We can now see that it is a convenient and relatively harmless satisfaction of the inclination to aggression, by means of which cohesion between the members of the community is made easier.

You may have noticed that linguists tend to be passionate about their ideas. As a result, some of us can be downright, um, vivid when dealing with those who Just Don't Get It. But unless you define "real information" circularly as "information that would serve to settle a controversy", it doesn't seem to me that there's much correlation between information content and the intensity of controversies, whether in linguistics or elsewhere.

As for the Narcissism of Small Differences, it's of course true that the people with relevant alternative opinions are generally in the same field. If you're going to argue about, say, lexical diffusion or strong crossover phenomena, you're probably going to have to argue with another linguist, because geophysicists and archeologists won't know what you're talking about. And when an argument involves groups of people on both sides, you can expect more publications over a longer period of time. It's plausible that there's an additional effect of small-group dynamics "by which the cohesion between the members of the community is made easier", but I'm not sure how to determine whether that's really true.



22 Comments

  1. Dan T. said,

    August 5, 2010 @ 10:58 am

    And even half a century after publication, those papers are locked up behind a paywall; academic publishing sucks.

    [(myl) Weird, isn't it? Simon's 1955 paper is available here, and his first response to Mandelbrot is here, but I haven't been able to find accessible copies of the other episodes. Penn doesn't have an online subscription to Information & Control, and the physical copies of the bound journals are in remote storage. 25 years ago I made xerox copies of the exchange in the Bell Labs library, but lord knows which dusty box or folder those are in. I once took the trouble to get the relevant issues fetched for me at Penn, and made xeroxes, but I can't remember where I put those either. I would have liked to be able to find them today, in order to quote some of the extraordinary examples of formal academic invective in the exchange. As I recall, one of them accuses the other of an undergraduate error in calculus, and the other responds with a counter-accusation of a high-school level error in algebra. Or something like that.

    You can get some of the flavor of the exchange from the abstracts of the last three contributions, which are available without paying $31.50 a shot:

    Mandelbrot, "Final Note": We shall restate in detail our 1959 objections to Simon's 1955 model for the Pareto-Yule-Zipf distributions. Our objections are valid quite irrespectively of the sign of ρ — 1, so that most of Simon's (1960) reply was irrelevant. We shall also analyze the other points brought up in that reply.

    Simon, "Reply to 'Final Note'": Dr. Mandelbrot's original objections (1959) to using the Yule process to explain the phenomena of word frequencies were refuted in Simon (1960), and are now mostly abandoned. The present “Reply” refutes the almost entirely new arguments introduced by Dr. Mandelbrot in his “Final Note,” and demonstrates again the adequacy of the models in 1955.

    Mandelbrot, "Post Scriptum": My criticism has not changed since I first had the privilege of commenting upon a draft of Simon 1955.

    Simon, "Reply to Dr. Mandelbrot's Post Scriptum": Dr. Mandelbrot has proposed a new set of objections to my 1955 models of the Yule distribution. Like his earlier objections, these are invalid.

    Luckily, dueling was pretty much obsolete by 1961.]

  2. Ryan Denzer-King said,

    August 5, 2010 @ 12:00 pm

    Luckily? If it weren't there wouldn't be all that wasted paper.

  3. Carl Burke said,

    August 5, 2010 @ 12:14 pm

    If I had to guess at features that suggest 'Religious Studies — Death', I'd have to go with the word 'urn' and the suffix 'xion', almost never seen except on 'crucifixion'. Granted that Biometrika is published by Oxford University Press, and 'connexion' is a perfectly good British word, the classification algorithm might be more familiar with the American form 'connection'.

    [(myl) Looking a bit further into the paper, one finds things like

    it is well known that the negative binomial and the log series distributions can be obtained as the stationary solutions of certain stochastic processes. For example, J.H. Darwin (1953) derives these from birth and death processes, with appropriate assumptions as to the birth- and death-rates and the initial conditions.

    ]

  4. stephen said,

    August 5, 2010 @ 12:24 pm

    Aha! I think I found it!

    "Urn scheme".

    "than that the same urn scheme provides a satisfactory abstract model of both phenomena."

    I'd never heard of an "urn scheme". I googled it and found at least two different meanings for it.

    An urn is used for keeping the ashes of someone who has been cremated.

    And, "phenomena" is often used in reference to metaphysical, religious experiences. So those two words could be the reason you're looking for.

  5. stephen said,

    August 5, 2010 @ 12:26 pm

    Darn, I didn't see Carl Burke's reply before I posted.

  6. Grep Agni said,

    August 5, 2010 @ 12:27 pm

    I know absolutely nothing about this exchange or the people involved, aside from vaguely recognising Mandelbrot as a pioneer in dynamical theory. Nevertheless, I have trouble imagining a feud like this erupting ex nihilo if their only interaction was through correspondence and published papers. Was there some pre-existing animus between the two? It wouldn't have to be something major: did one accidentally spill a drink on the other's lap in a bar during a conference, maybe?

    [(myl) Simon's 1955 paper starts with these acknowledgments: "I have had the benefit of helpful comments from Messrs Benoit Mandelbrot, Robert Solow and C. B. Winsten. I am grateful to the Ford Foundation for a grant-in-aid that made the completion of this work possible." And he cites a 1953 paper of Mandelbrot's in the references. So I think the obvious ill-will developed during the exchange rather than preceding it, though I admit there may be personal circumstances unknown to me.]

  7. Brett said,

    August 5, 2010 @ 12:37 pm

    The relevant meaning of "urn scheme" is this one: http://en.wikipedia.org/wiki/Urn_problem

    As somebody with extensive training in probability theory, I didn't even notice the oddness of "urn scheme," which I agree is the probable origin of the "Death" connection. (In fact, I find the use of "urn" in connection with cremation a little odd, being so used to the mathematical usage.)

  8. Ken Brown said,

    August 5, 2010 @ 1:26 pm

    "If you're going to argue about, say, lexical diffusion or strong crossover phenomena, you're probably going to have to argue with another linguist, because geophysicists and archeologists won't know what you're talking about."

    But this exact topic crosses loads of field boundaries. We talked about power laws and distributions when I was studying botany and ecology, years ago. Much later the same statistics turned up in the analysis of genomes and various bits of bioinformatics. But you also meet them in computer security, cryptography, and in networking. Mandelbrot is a mathematician – Simon won the Nobel Prize for economics, was a professor of Political Science, and is apparently well-known for his contributions to the sociology of organisations as well as decision-analysis and artificial intelligence (if we can trust Wikipedia).

    I've never read their original papers (and maybe I wouldn't understand them if I did) but perhaps one reason they found this topic inflammatory was precisely that they approached it from different fields and using different methods and jargon?

    [(myl) Just to clarify — neither lexical diffusion nor crossover has any obvious connection with these arguments about the genesis of power laws. And in this case, I don't think that the difference between (Simon's kind of) economics and (Mandelbrot's kind of) mathematics was a critical contributor to the intensity of their argument — though it's certainly true that power-law and other long-tailed distributions come up in lots of fields, sometimes appropriately.

    (See Cosma Shalizi's posts "Gauss is not mocked", 10/28/2005; and "So you think you have a power law — well isn't that special?", 6/15/2007; and the other things linked here.)]

  9. MattF said,

    August 5, 2010 @ 1:42 pm

    Just wondering– was there ever any final 'verdict' on who was right? Or were they both wrong?

  10. Dan T. said,

    August 5, 2010 @ 2:50 pm

    Though to those involved in World Wide Web standards, "URN scheme" might bring to mind the schemas for Uniform Resource Names.

  11. Rubrick said,

    August 5, 2010 @ 4:10 pm

    Mandelbrot's arguments always look reasonable at a glance, but the closer you examine them the more they look like themselves.

  12. D.O. said,

    August 5, 2010 @ 5:15 pm

    Looks like usual preferred professorial pastime — deciding who's an idiot.

  13. Robert said,

    August 5, 2010 @ 5:40 pm

    I think there's an extra rule that probability tends to start arguments (with raised voices). The residue of these arguments is visible even in Feller's book, and is especially visible in E. T. Jaynes's book ("Probability: The Logic of Science").

  14. Matthias said,

    August 5, 2010 @ 9:43 pm

    In some sort of posthumous green open access, Simon's half of the exchange is available from CMU's Herbert Simon Collection. They also have a scan of Simon's copy of Mandelbrot's 1959 note (complete with some hand-written annotations).

  15. Dan T. said,

    August 5, 2010 @ 11:55 pm

    I guess one would expect a journal called Information and Control to be pretty tight about giving free access to its content!

  16. ajay said,

    August 6, 2010 @ 5:54 am

    it is precisely communities with adjoining territories, and related to each other in other ways as well, who are engaged in constant feuds and in ridiculing each other

    You don't need any sort of psychological insight at all to work out why the Spanish might go to war with the Portuguese more often than they do with the Malaysians. You just need a map…

    [(myl) Indeed. And adjacent cultures are more likely to be in the sort of contact that creates stereotypes, genres of jokes, and so on. Still, some naive people might think that the intensity of inter-cultural conflict might be proportional to the degree of inter-cultural difference. And in fact, whether because of Narcissism or maps, it's probably truer to say that the proportionality is inverse.]

  17. Ken Brown said,

    August 6, 2010 @ 6:32 am

    A map? Well yes, off-topic & all, but the early modern history of Portugal probably follows that model less well than any other country on earth… they got everywhere. And the war that lost them the best part of their Indian Ocean empire was against Oman! :-)

    Thanks for those links, Mark. They confirm my prejudice that Fisher rules. Even if I don't usually understand him. But I'm a mathematically-challenged biologist – we do it all with broken sticks.

  18. carla said,

    August 6, 2010 @ 1:29 pm

    Mandelbrot's contributions illustrate the risk of trying to end an argument by declaring your word the last word. It leads you into the Hobson's choice of, on the one hand, sitting silent and letting your opponent have the last word, or, on the other, having to give your piece a ridiculous title like "Post Scriptum to Final Note."

    I am reminded of my past life as a litigator, when I filed a memorandum entitled "Response to Surreply."

  19. Jason Eisner said,

    August 7, 2010 @ 3:53 pm

    The classic joke about Freud's "narcissism of minor differences" was written by Emo Philips in the early 1980's. There are several variants.

    Slightly earlier was the squabbling among the Judean People's Front and related organizations in Monty Python's Life of Brian (1979).

  20. Mike Anderson said,

    August 8, 2010 @ 2:01 pm

    Great grist for the mill of my dissertation on weighted counting models…Thanks!

  21. John Cowan said,

    August 9, 2010 @ 10:29 am

    "These things tend to drag on; in future work, Higginbotham will argue that my eyes are too close together, and I will argue that on the contrary, his head is too round." —Geoffrey K. Pullum

  22. peter said,

    August 12, 2010 @ 2:53 pm

    Robert said (August 5, 2010 @ 5:40 pm)

    "I think there's an extra rule that probability tends to start arguments (with raised voices). The residue of these arguments is visible even in Feller's book, and is especially visible in E. T. Jaynes's book ("Probability: The Logic of Science")."

    And also there were Ronald Fisher's vitriolic arguments with just about everybody else.

    I once witnessed the following angry exchange between a speaker and an audience-member, both prominent professors of statistics, at a conference on Bayesian statistics:

    Audience member: Your position is not coherent. I have been reading your papers for 25 years and I still don't understand your argument.

    Speaker: I can't be held responsible if you are too slow to keep up.
