Quantification of swearosity


The latest lesson from Surviving the World:


  1. Chris said,

    November 18, 2011 @ 9:30 am

    Cute, but in fact this may be a good example of pure stats failing to explain the facts. O(f) in the above formula is somehow cognitively motivated (rather than cultural). I can't find the reference right now, but I believe there has been cross-linguistic research establishing a universal hierarchy of swearing severity, something like sexuality — body functions — religion. Anyone recall this?

    Also, is there a word for a serious comment on an unserious post?

  2. Rod Johnson said,

    November 18, 2011 @ 9:44 am

    Who says "swearosity"? Doesn't -osity pertain to adjectives in -ous?

    In my part of the country we say "sweariness."

  3. David L said,

    November 18, 2011 @ 9:49 am

    It seems to me that there ought to be some sort of relationship between K and A_sub_p. If a word is widely accepted in public use, then knowledge of it as a swear word must be confined to a small demographic; conversely, if a word is widely known as a swear word, then public acceptance would be small.

    However, this relationship might also involve the swearosity itself, since a word that has high K but low S_sub_w might nevertheless have a moderately high A_sub_p. So there's some kind of recursiveness in there too.

    In short, I think the originator of this formula needs to do more research to establish what are the truly independent parameters.

  4. David L said,

    November 18, 2011 @ 9:52 am

    Now that I think about it more, isn't public acceptability by definition the inverse of the swearosity?

  5. KevinM said,

    November 18, 2011 @ 10:02 am

    David L – That's why it's in the denominator of the fractional expression – you divide by Ap to get the swearosity index, or swearitude.

  6. Rod Johnson said,

    November 18, 2011 @ 10:18 am

    Also, what are the units?

  7. David L said,

    November 18, 2011 @ 10:23 am

    KevinM: but what I'm saying is, Ap and Sw are not independent quantities.

  8. Jerry Friedman said,

    November 18, 2011 @ 10:28 am

    @David L: I think "damn" and "hell" are very widely accepted in public, but still swearous (or swearose), at least in America.

  9. David L said,

    November 18, 2011 @ 11:02 am

    Well, yes, but to the extent that damn and hell are widely accepted, they're less swearous than [insert bad word here], which is less widely accepted.

    I propose a law: Ap times Sw = T(R), where T is the position-dependent naughtiness tolerance function.

  10. Rod Johnson said,

    November 18, 2011 @ 11:05 am

    Good point about position dependence. They're also deictic (that is, speaker-dependent). I express myself colorfully, you swear, he has a filthy mouth.

  11. Shangwen said,

    November 18, 2011 @ 11:25 am

    Should phonotactics not be included into the equation? I would guess that the infra-syllabic distribution of unvoiced plosives would increase swearosity, particularly in the coda. Also, the ratio of such plosives per total phonemes in the sweareme would be important to know.

    Dr. Liberman, can you run the stats?

  12. GeorgeW said,

    November 18, 2011 @ 11:26 am

    @Chris: "I believe there has been cross-linguistic research establishing a universal hierarchy of swearing severity, something like sexuality — body functions — religion. Anyone recall this?"

    According to Hughes ("Swearing," 1998), swearing itself is not universal. However, these are common referents in swearing which also include nationality and madness.

  13. MikeA said,

    November 18, 2011 @ 11:32 am


    You mean, something like "fucking bat-shit crazy wop papist"?

  14. Jonathan Lundell said,

    November 18, 2011 @ 12:52 pm

    While parts-of-speech flexibility no doubt contributes to the utility of a swearous word, in no way is it an exponential relationship.

  15. Jimmy H. said,

    November 18, 2011 @ 1:03 pm

    But Jonathan, remember that parts-of-speech flexibility has a very low bound. In traditional grammar, it can't be greater than 7 or 8, and rarely exceeds 2.

  16. KevinM said,

    November 18, 2011 @ 1:05 pm

    David L: Oh, sorry, I was being dense. I see your point (though probably none of the variables are strictly independent of the others). Interesting, though, that he phrases the variable, not as "Public Acceptance," but as "Acceptability of the Word In Public." So there may be a hypocrisy/gentility factor in there that causes the variables to diverge for purposes of calculating the sweariciousness.

  17. Jonathan Badger said,

    November 18, 2011 @ 1:24 pm

    Really? Religion as the most severe swearing? Yes, in Quebec French, there is religion-based swearing (tabernac, etc.) and at least to the over-40 crowd it can still offend, but elsewhere? In English "christ" and "goddamn" are pretty weak sauce.

  18. Jim said,

    November 18, 2011 @ 1:33 pm

    @ Jonathan Badger, I think that order is given from hardest to softest swearing.

  19. alex said,

    November 18, 2011 @ 1:57 pm

    Can I just say that the most helpful thing I took away from Mark Liberman's lecture on Wednesday at the University of Maryland was reminding me of the existence of this website, which I hadn't visited in over a year. Posts like this are the reason I love you UPenn people. Thank you!

    Also, the aforementioned claim is untrue. His lecture was great! But the amount of weekly greatness on this website is far superior. AH!

  20. Dante Shepherd said,

    November 18, 2011 @ 3:24 pm

    If a thread like this leads to an actual useful equation sometime far down the road, I hope my initial, somewhat-facetious relationship above is remembered like the plum-pudding model for the atom.

  21. Sili said,

    November 18, 2011 @ 5:18 pm

    Rod Johnson said,

    Also, what are the units?


  22. D.O. said,

    November 18, 2011 @ 8:18 pm

    I especially enjoy taking the square root of a no-number. Unless somebody already Goedelized the semantics.

  23. Tim said,

    November 18, 2011 @ 8:25 pm

    How does the "meaning of the word" get quantified as a number, I wonder?

  24. Tim said,

    November 18, 2011 @ 8:26 pm

    And I think I just got beaten to that idea while I was reading.

  25. kktkkr said,

    November 18, 2011 @ 11:28 pm

    I would replace "Meaning of the word" with "ghits".

    Nerdy observation: The "degree of offense" cannot be on a scale from 0 to 1 for the swearosity to increase in the direction it's supposed to. (Ideally, nice words would have a degree less than 1 (but still positive!) and bad words would have a degree more than 1. What kind of measure would have that effect?)
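The observation above can be checked with a small numeric sketch. It assumes, as the comments suggest but do not spell out, that the degree of offense is raised to the power of the number of parts of speech. The function name `offense_term` and the sample values are hypothetical and not part of the original formula.

```python
# Hypothetical sketch: treat the offense factor as degree ** parts_of_speech.
# Both the function name and the sample values are illustrative assumptions.
def offense_term(degree: float, parts_of_speech: int) -> float:
    """Return the degree of offense raised to the number of parts of speech."""
    return degree ** parts_of_speech


if __name__ == "__main__":
    for degree in (0.5, 1.0, 2.0):
        # A base below 1 shrinks as the exponent grows; a base above 1 grows.
        terms = [offense_term(degree, p) for p in (1, 2, 3)]
        print(degree, terms)
```

Under this assumption, a degree between 0 and 1 makes a more flexible word count as less offensive, which is the problem the comment points out.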

  26. maidhc said,

    November 19, 2011 @ 1:28 am

    Does this explain why "yellowman" is now an obscenity when it used to be a kind of sweet? (It was a sweet in Ulster, anyway.)

    It's the new set of rules for texting in Pakistan: http://www.bbc.co.uk/news/world-asia-15793721

  27. Duncan said,

    November 19, 2011 @ 9:25 pm

    @ David L: Playing with equations like this, and noting the relationship of the various factors to each other is fun, isn't it?

    The first thing I noted (after you drew attention to the AsubP and K factors) is that in an inverse relationship like that, if they are taken to refer to the same thing but in inverse, then one could do away with the AsubP divisor by squaring K.

    The next thing I noticed, after a bit of time recalling rusty high-school science class equations, initially appears to disprove this whole equation due to its absurdity:

    If other factors including Ssubw are held constant, then as AsubP increases, so must K. Translating back to English, if swearosity and the other factors are held constant, the equation states that as public acceptability increases, in order to keep the equation in balance, so must knowledge of the word as a swear word. That seems absurd on the face of it! Intuitively, knowledge of the word as a swear word should decrease public acceptability, other things being held constant. "The" must rank pretty high up there on public acceptability judging by how common it is, but knowledge of it as a swear word?
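The argument in this comment can be written out as a short sketch. It assumes only what the thread implies about the formula: that K appears in the numerator and A_p in the denominator. The symbol C is a hypothetical stand-in for all remaining factors, which the comments do not reproduce.

```latex
S_w = C\,\frac{K}{A_p}
\quad\Longrightarrow\quad
K = \frac{S_w}{C}\,A_p
% With S_w and C held constant, K is proportional to A_p,
% so K must rise whenever A_p rises.
%
% If instead A_p is taken to be the inverse of K, i.e. A_p = 1/K, then
S_w = C\,\frac{K}{1/K} = C\,K^{2}
```

Under these assumptions the first line gives the "absurd" result that knowledge must grow with acceptability, and the last line gives the earlier suggestion that the divisor could be replaced by squaring K.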

    Meanwhile, I'd posit that the factors are different in that while it's not directly stated, they must be measuring different targets. Acceptability of the word in public, by definition of public, would seem to be global (note the use of the word global, not the word universal, in cosmic scope =:^). But I'd posit that the intended K factor, knowledge of the word as a swear word, must be a measure taken of the audience, not of the public in general. Thus, they are different factors.

    Does that let the equation off the absurdity hook just established above?

    Not likely, but there's a corner case argument for it given the geometric effect of the P (parts of speech) factor, if that too is held to be audience specific along with the offensiveness factor, and the three audience specific factors are held to be sufficiently interrelated so that audience knowledge of the word as a swear word changes in relation to the number of parts of speech said audience can find a use for that word as. That might help explain Beavis and Butt-Head level peer group dynamics regarding swear words.

    But that's still a stretch and indicates that there's more work to do isolating and defining the true variables, since there's now obviously a missing variable reflecting how the target audience relates to the global "public" and helping to explain how the audience-defined variables relate to each other and to the global variables. But at least it brings back the equation from /full/ absurdity, toward one simply lacking variables for one or more factors, with other variables and their scope defined somewhat incorrectly but with some insight already present.

