A couple of days ago, Victor Mair wrote about some provocative behavior on the part of "Kŏng Qìngdōng 孔庆东, associate professor in the Chinese Department at Peking University, who also just happens to be the 73rd generation descendant of Confucius (Kǒng Fūzǐ 孔夫子 ; Kǒng Qiū 孔丘), or at least he claims to be a descendant of Confucius."
In the comments, Victor names someone else who he believes to be a true descendant of Confucius, and notes that there is some doubt about Kŏng Qìngdōng's claim to this status.
Well, I'd like to come to Kŏng Qìngdōng's defense, at least on the specific and limited question of whether he is descended from Confucius. My standing to make this argument is based on the fact that I too am descended from Confucius. And I can prove it mathematically.
The basic argument was first made by Joseph Chang in his paper "Recent common ancestors of all present-day individuals", Advances in Applied Probability, 1999:
Previous study of the time to a common ancestor of all present-day individuals has focused on models in which each individual has just one parent in the previous generation. For example, 'mitochondrial Eve' is the most recent common ancestor (MRCA) when ancestry is defined only through maternal lines. In the standard Wright-Fisher model with population size n, the expected number of generations to the MRCA is about 2n, and the standard deviation of this time is also of order n. Here we study a two-parent analog of the Wright-Fisher model that defines ancestry using both parents. In this model, if the population size n is large, the number of generations, Tn, back to a MRCA has a distribution that is concentrated around lg n (where lg denotes base-2 logarithm), in the sense that the ratio Tn / (lg n) converges in probability to 1 as n → ∞. Also, continuing to trace back further into the past, at about 1.77 lg n generations before the present, all partial ancestry of the current population ends, in the following sense: with high probability for large n, in each generation at least 1.77 lg n generations before the present, all individuals who have any descendants among the present-day individuals are actually ancestors of all present-day individuals.
In other words, there is a date in the past such that at that time and before, the human population can be divided into two groups: those who have no current descendants and those who have all currently-living people as descendants. For people living at or before that time, if they are the ancestor of any modern humans, then they are the ancestor of all of us.
How long ago is that threshold time? Well, Chang's 1999 paper shows that in a model with random mating,
in each generation at least 1.77 lg n generations before the present, all individuals who have any descendants among the present-day individuals are actually ancestors of all present-day individuals.
where n is the effective population size.
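For readers who want to see the two thresholds emerge, here is a minimal sketch of a two-parent random-mating model of the kind the abstract describes. It is not Chang's code: the function name `simulate_ancestry`, the constant population size, and the choice to let each child draw its two parents independently and uniformly from the previous generation are all assumptions made for illustration.

```python
import math
import random

def simulate_ancestry(n, rng):
    """Trace ancestry backward in a two-parent random-mating population of size n.

    Returns (mrca, ia): the number of generations back to the first common
    ancestor of everyone alive now, and to the first generation in which every
    individual is an ancestor of either everyone or no one.
    """
    full = (1 << n) - 1
    # desc[i] is a bitmask of the present-day individuals descended from person i.
    desc = [1 << i for i in range(n)]
    t, mrca = 0, None
    while True:
        t += 1
        parents_desc = [0] * n
        for child_mask in desc:
            # Assumption: two parents drawn uniformly, independently, with replacement.
            for _ in range(2):
                parents_desc[rng.randrange(n)] |= child_mask
        desc = parents_desc
        if mrca is None and any(m == full for m in desc):
            mrca = t
        if all(m == 0 or m == full for m in desc):
            return mrca, t

rng = random.Random(0)
n = 500
lg_n = math.log2(n)
for _ in range(5):
    mrca, ia = simulate_ancestry(n, rng)
    print(f"MRCA: {mrca} generations ({mrca / lg_n:.2f} lg n), "
          f"IA: {ia} generations ({ia / lg_n:.2f} lg n)")
```

The printed ratios can be compared with the values 1 and 1.77 from the abstract, keeping in mind that those are limits for large n and that a population of a few hundred is small.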
And Jack Fenner, "Cross-cultural estimation of the human generation interval for use in genetics-based population divergence studies", American Journal of Physical Anthropology 128(2) 2005, tells us that
… absent of other information regarding ancient reproductive behavior, values of 25, 28, and 31 years should be used for the female, overall, and male generation intervals, respectively, for those studies in which a specific generation interval value (rather than a range of years) is appropriate.
Wikipedia tells us that Confucius lived between 551 and 479 B.C., and that his first child was born when he was 20, in 531 B.C. That was 2012+531 = 2543 years ago, and 2543/28 is about 91 generations.
According to Chang's formula, 91 generations would suffice for full mixing in a population of 2^(91/1.77) = 2^51.4 = 3.0*10^15 = 3 quadrillion, which is far larger than the current population of the world, and larger still than its population at any time over the past 2500 years. Specifically, it's about 430,000 times the current world population, which seems like plenty of headroom to allow for geographical and cultural barriers to mating.
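The arithmetic in the last two paragraphs is easy to check. The sketch below is one way to do it, and the helper names are made up for illustration. It ignores the missing year zero, as the text does, and it assumes a world population of roughly 7 billion for 2012.

```python
import math

def generations_since(year_bc, present_year_ad, generation_interval):
    """Generations elapsed between a year B.C. and a year A.D. (no year-zero correction)."""
    return (present_year_ad + year_bc) / generation_interval

def max_fully_mixed_population(generations, factor=1.77):
    """Largest n satisfying factor * lg(n) <= generations, i.e. n = 2 ** (generations / factor)."""
    return 2 ** (generations / factor)

gens = round(generations_since(531, 2012, 28))  # 2543 / 28, rounded to whole generations
n = max_fully_mixed_population(gens)
world_population = 7e9  # assumption: rough 2012 figure

print(f"generations: {gens}")
print(f"exponent: {gens / 1.77:.1f}")
print(f"population bound: {n:.3g}")
print(f"multiple of world population: {n / world_population:,.0f}")
```

Changing the generation interval to Fenner's female (25) or male (31) value shows how sensitive the bound is to that choice.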
But is it? I have to confess that Confucius and I are a slightly marginal case for a more realistic version of this argument. A more elaborate and realistic version is exactly the goal of Douglas Rohde, Steve Olson & Joseph Chang, "Modelling the recent common ancestry of all living humans", Nature 431, 2004. They did a variety of calculations or simulations based on models with a simplified but realistic geographic distribution of population over time, with conservative assumptions about migration, exogamy, and so on; they explicitly modeled overlapping generations and the production of offspring at different times in the life cycle; and in general they tried hard to construct a concrete and realistic model of the statistics of human descent on earth over the past few millennia.
These elaborations don't affect the main point — it remains true that
… there was a threshold, let us say Un generations ago, before which ancestry of the present-day population was an all or nothing affair. That is, each individual living at least Un generations ago was either a common ancestor of all of today's humans or an ancestor of no human alive today. Thus, among all individuals living at least Un generations ago, each present-day human has exactly the same set of ancestors. We refer to this point in time as the identical ancestors (IA) point.
And more quantitatively:
With 5% of individuals migrating out of their home town, 0.05% migrating out of their home country, and 95% of port users born in the country from which the port emanates, the simulations produce a mean MRCA [Most Recent Common Ancestor] date of 1,415 bc and a mean IA date of 5,353 bc. Interestingly, the MRCAs are nearly always found in eastern Asia. This is due to the proximity of this region to both Eurasia and either the remote Pacific islands or the Americas, allowing the MRCA's descendants to reach a few major world regions in a relatively short time.
Arguably, this simulation is far too conservative, especially given its prediction that, even in densely populated Eurasia, only 55.3 people will leave each country per generation in ad 1500. If the migration rate among towns is increased to 20%, the local port users are reduced to 80%, and the migration rates between countries and continents are scaled up by factors of 5 and 10, respectively, the mean MRCA date is as recent as ad 55 and the mean IA date is 2,158 bc.
Even under the second of these scenarios, Confucius lived about 1500 years too recently to be a Universal Ancestor of all contemporary humans. But I would submit that he's rather likely to be among MY ancestors, since some of them lived in Anatolia and in the Crimea, much more accessible to East Asia than (say) the Amazon basin is. And as for Prof. Kŏng Qìngdōng, his case is surely an easy one. Within China, there's been more than enough mixing to ensure that by now, if anyone is a descendant of Confucius, everyone is.
When Chang's result was first published, I explained it to an anthropologist friend, who refused to believe it, raising the obvious objections about assortative mating, geographical barriers, and so on. So I coded up a simulation, much simpler and less realistic than the ones done by Rohde et al., but still having a hierarchy of population groups, with parameters to control the probabilities of mating outside one's native group at each level, the probability of migrations of different sizes and distances, etc. I got my skeptical friend to suggest numbers for these parameters. Like Rohde et al., we found that such additions had no effect on the qualitative conclusion, and remarkably little effect on the quantitative results.
On the basis of that experience, I'd be willing to bet a substantial sum that a realistic simulation of Chinese mating behavior over the past couple of millennia would prove the result that if anyone within that population is now descended from Confucius, then everyone is.
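A toy version of a grouped simulation might look like the following. This is not the code I ran back then, nor the Rohde et al. model. It is a sketch with a single level of grouping, and the names `ia_generations` and `p_outside` are invented for the example. Each child draws each parent from its own group, except that with probability `p_outside` the parent comes from a group chosen at random.

```python
import random

def ia_generations(n_groups, group_size, p_outside, rng):
    """Generations back to the identical-ancestors point in a grouped two-parent model."""
    if n_groups > 1 and p_outside <= 0:
        raise ValueError("isolated groups never share ancestors; p_outside must be > 0")
    n = n_groups * group_size
    full = (1 << n) - 1
    # desc[i] is a bitmask of the present-day individuals descended from person i.
    desc = [1 << i for i in range(n)]
    t = 0
    while True:
        t += 1
        parents_desc = [0] * n
        for child, mask in enumerate(desc):
            home = child // group_size
            for _ in range(2):
                # Assumption: an "outside" parent comes from a uniformly chosen group.
                if rng.random() < p_outside:
                    group = rng.randrange(n_groups)
                else:
                    group = home
                parent = group * group_size + rng.randrange(group_size)
                parents_desc[parent] |= mask
        desc = parents_desc
        if all(m == 0 or m == full for m in desc):
            return t

rng = random.Random(1)
for p in (1.0, 0.2, 0.05, 0.01):
    print(f"p_outside={p}: IA point {ia_generations(10, 50, p, rng)} generations back")
```

Lowering `p_outside` stands in for stronger barriers between groups. A skeptic can plug in their own numbers and see how far back the identical-ancestors point moves.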
For those of you who still find these results hard to believe, it may help to keep in mind two things:
(1) This is about two-parent lines of descent. Single-sex lines of descent (as in the Mitochondrial Eve scenario and so on) work differently.
(2) This is about genealogy, not about genetics. For autosomal genes, each individual gets half at random from each parent, and so you can expect to share G/(2^n) genes with an ancestor n generations ago, where G is the total number of genes. The current best guess for G seems to be about 25,000. Thus someone who is a 73rd-generation descendant of Confucius can expect to inherit 25000/2^73 of the sage's genes.
25000/2^73 = 2.646978e-18
So such a person has about 3 chances in a quintillion of having inherited a single gene from their illustrious ancestor.
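The same back-of-the-envelope calculation can be written out as below. It assumes a single line of descent and treats each gene as passed on independently with probability 1/2 per generation, which is the simplification used above and not a realistic model of inheritance. The variable names are illustrative.

```python
import math

G = 25_000          # assumed total number of genes, as in the text
generations = 73    # length of the claimed line of descent

share = 0.5 ** generations      # chance that one given gene survives 73 halvings
expected_genes = G * share      # expected number of the ancestor's genes inherited

# Chance of inheriting at least one gene: 1 - (1 - share)^G,
# computed with log1p/expm1 to avoid rounding error at this scale.
p_at_least_one = -math.expm1(G * math.log1p(-share))

print(f"expected genes inherited: {expected_genes:.6e}")
print(f"probability of at least one: {p_at_least_one:.6e}")
```

At this scale the expected count and the probability of at least one gene are numerically indistinguishable, which is why the expected value can be read directly as a chance.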
An entertaining summary of similar work for a popular audience can be found in Susanna C. Manrubia, Bernard H. Derrida and Damián H. Zanette, "Genealogy in the Era of Genomics: Models of cultural and family traits reveal human homogeneity and stand conventional beliefs about ancestry on their head", American Scientist 91(2) 2003. Their conclusion:
The next time you hear someone boasting of being descended from royalty, take heart: There is a very good probability that you have noble ancestors too. The rapid mixing of genealogical branches, within only a few tens of generations, almost guarantees it. The real doubt is how much "royal blood" your friend (or you) still carry in your genes. Genealogy does not mean genes.