Nth Xest


In the course of writing about the "fourth highest of five levels", I looked around at how the pattern "Nth Xest" is used in general. I found that uses of such expressions overwhelmingly count from the "top" where X names a top-oriented scale (high, big, long, etc.), and count from the "bottom" where X names a bottom-oriented scale (low, small, short, etc.). In other words, unsurprisingly, "Nth Xest" normally counts (up or down) from whatever end of the scale "Xest" names.

Another (less logically necessary but still unsurprising) thing I noticed is that top-oriented counts are always a lot bigger than corresponding bottom-oriented counts, and that counts decrease almost-proportionately as N increases. Thus from Google Books ngrams:

          second   third  fourth   fifth   sixth
highest    34447    9692    3148    1411     784
lowest      6006    1455     491     293     138

The numbers from COCA are pretty much in proportion, though lower:

          second   third  fourth   fifth   sixth
highest      305      95      33      23      12
lowest        55       9       4       3       2
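
As a rough check on the "pretty much in proportion" remark, the ratio of "Nth highest" to "Nth lowest" can be computed for each N in both sources. This is only a sketch in Python: the variable and function names are made up for illustration, and the counts are simply copied from the two tables above.

```python
# Counts copied from the two tables above (N = second ... sixth).
ORDINALS = ["second", "third", "fourth", "fifth", "sixth"]

GOOGLE_BOOKS = {
    "highest": [34447, 9692, 3148, 1411, 784],
    "lowest": [6006, 1455, 491, 293, 138],
}
COCA = {
    "highest": [305, 95, 33, 23, 12],
    "lowest": [55, 9, 4, 3, 2],
}


def top_to_bottom_ratios(counts):
    """Return the highest/lowest ratio for each ordinal, in table order."""
    return [hi / lo for hi, lo in zip(counts["highest"], counts["lowest"])]


if __name__ == "__main__":
    for name, counts in [("Google Books", GOOGLE_BOOKS), ("COCA", COCA)]:
        ratios = top_to_bottom_ratios(counts)
        line = "  ".join(f"{n}: {r:.2f}" for n, r in zip(ORDINALS, ratios))
        print(f"{name:13s} {line}")
```

The Google Books ratios stay in a fairly narrow band (roughly 5 to 7), while the COCA ratios wander more (roughly 5.5 to 10.5), as one would expect when the "lowest" counts are in the single digits.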

Here are the Google Books counts for a larger set of values of X (values of 0 generally reflect cases where the count didn't reach the threshold of 40 required for retention of ngram counts):

            second   third  fourth   fifth   sixth
highest      34447    9692    3148    1411     784
lowest        6006    1455     491     293     138
biggest       6001    1402     608     264     156
largest     124598   50022   20712   10595    6246
greatest      8333    1762     423     209     162
smallest      2703     605     200      92      49
most        114727   28723    8192    4028    2163
least          988     302      57      58       0
best         55695    7009    2337     649     426
worst         2417     501     142      95       0
oldest       14955    3041     661     202     128
youngest      2772     454      92       0       0
longest       3739    1660     713     412     171
strongest     3087     735     151      46      45
richest       1486     683     228     136      91
poorest        598     196      82      82       0

Adding them all up column-wise:

The left-hand figure below plots the summed counts on a log scale. On the right, I've normalized the top-oriented and bottom-oriented counts, dividing each by its count for "second Xest":
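
The two steps involved here, summing column-wise and then dividing by the "second Xest" total, can be sketched in Python as follows. The counts are copied from the Google Books table above. The split of the sixteen words into top-oriented and bottom-oriented groups is an assumption on the reader's side (the post does not list it explicitly), and all names in the code are hypothetical.

```python
# Google Books counts copied from the table above (N = second ... sixth).
COUNTS = {
    "highest": [34447, 9692, 3148, 1411, 784],
    "lowest": [6006, 1455, 491, 293, 138],
    "biggest": [6001, 1402, 608, 264, 156],
    "largest": [124598, 50022, 20712, 10595, 6246],
    "greatest": [8333, 1762, 423, 209, 162],
    "smallest": [2703, 605, 200, 92, 49],
    "most": [114727, 28723, 8192, 4028, 2163],
    "least": [988, 302, 57, 58, 0],
    "best": [55695, 7009, 2337, 649, 426],
    "worst": [2417, 501, 142, 95, 0],
    "oldest": [14955, 3041, 661, 202, 128],
    "youngest": [2772, 454, 92, 0, 0],
    "longest": [3739, 1660, 713, 412, 171],
    "strongest": [3087, 735, 151, 46, 45],
    "richest": [1486, 683, 228, 136, 91],
    "poorest": [598, 196, 82, 82, 0],
}

# ASSUMPTION: this grouping is a guess at the intended split, not the author's list.
TOP_ORIENTED = ["highest", "biggest", "largest", "greatest", "most",
                "best", "oldest", "longest", "strongest", "richest"]
BOTTOM_ORIENTED = ["lowest", "smallest", "least", "worst", "youngest", "poorest"]


def column_sums(words):
    """Sum the counts for the given words, one total per ordinal column."""
    return [sum(column) for column in zip(*(COUNTS[w] for w in words))]


def normalize_by_second(sums):
    """Divide every column total by the total for "second Xest"."""
    return [s / sums[0] for s in sums]


if __name__ == "__main__":
    for label, words in [("top", TOP_ORIENTED), ("bottom", BOTTOM_ORIENTED)]:
        sums = column_sums(words)
        print(label, sums, [round(x, 4) for x in normalize_by_second(sums)])
```

Normalizing by the first column puts both groups at 1.0 for "second", so the two curves can be compared for shape rather than for overall size.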

The same things for COCA counts:

It would be nice if the recently-developed distributional semantics methods could induce patterns of this type — but I don't think that they can do so yet.



1 Comment

  1. D.O. said,

    September 2, 2014 @ 11:51 am

    Raw counts of the ordinal number words (without any co-occurrences) also show an approximately exponential fall with a somewhat diminishing exponent. Data from Google ngrams averaged for the years 2000-2008 (they are really pretty stable over many decades), in words per million:
    first     815.7
    second    264.4
    third     130.5
    fourth     36.3
    fifth      21.9
    sixth      13.3
    seventh    10.7
    eighth      8.8
    ninth       6.5
    tenth       7.8

    I also included "first", which is not in Prof. Liberman's counts for obvious reasons. Counts for "first" through "fourth" fall with an exponent of about 1 (that is, by a factor of e for each subsequent number), quite close to what happens with Nth Xest. So far, excluding the obvious case of the first Xest, there is no evidence that the use of ordinals with rankings is any different from the use of ordinals overall.
