Ancestry and Mathematics

      People who have traced their genealogy extensively often reach three conclusions that the rest of us find improbable. First, they conclude that they are descended from some famous person of the distant past. For persons of European descent, a very common example is Charlemagne, founder of the Holy Roman Empire in 800 AD. Secondly, they conclude that they are related to some famous modern people (for example, a recent American president or two). Thirdly, they conclude that they are their own cousins, which is to say that some pair of their ancestors (Grandma and Grandpa Jones, for example) were cousins. This sounds improbable to most of us, but a look at the math of ancestry says that such relationships aren't improbable, and that in fact they're almost inevitable.


      Let's suppose you're asked to fill out a chart showing your ancestry. Your first task is to list your ancestors one generation back: your parents, of which there are two. Then you list your ancestors two generations back: your grandparents, of which there are four. Then, perhaps with some help from a knowledgeable family member, you list your ancestors three generations back: your great-grandparents, of which there are eight. If you're exceptionally aware of your ancestry, you can list your ancestors four generations back: your great-great-grandparents, of which there are sixteen.

      If you're following the numbers here, you'll have noticed that the number of ancestors doubles each generation back: two one generation back, four two generations back, eight three generations back, and so on. If we let "n" be the number of generations back, the number of ancestors in that generation is 2^n, or 2 raised to the nth power.

      Now let's go back to someone's claim to be descended from Charlemagne. Charlemagne lived about 1200 years ago, which turns out to be about 40 generations back in most lineages. 2^40 is a little over one trillion. This means that, if our claimant filled out an ancestry chart back the forty generations to about 800 AD, he or she would have to list a trillion names in that fortieth generation back.
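
      The doubling rule and the forty-generation figure are easy to check. The following is only a minimal sketch; the function name ancestors_in_generation is invented here for illustration and is not part of the original essay.

```python
def ancestors_in_generation(n: int) -> int:
    """Return the number of slots in an ancestry chart n generations back.

    Each person has two parents, so the count doubles every generation: 2**n.
    (Hypothetical helper name, used only for this illustration.)
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    return 2 ** n


# Parents, grandparents, great-grandparents, great-great-grandparents,
# and the fortieth generation back (roughly Charlemagne's time).
for n in (1, 2, 3, 4, 40):
    print(f"{n:>2} generations back: {ancestors_in_generation(n):,} slots")
```

      The last line printed is the "little over one trillion" of the text: 1,099,511,627,776 slots.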

      If you've been following the math, you'll have noticed a problem. The present total human population of the world is about six billion. Scholars estimate that world population in 800AD was a lot lower, around 300 million. Even if we're generous by a factor of more than three and assume world population in 800 AD was a billion, that would only be one-thousandth of the people needed to fill in the trillion spaces in the ancestry chart. The name of each person alive in 800AD would have to appear, on average, one thousand times in the fortieth generation of our claimant's chart.

      If we're more realistic about where people's ancestors lived and how much they crossed geographic barriers, populations of potential ancestors get even smaller. If world population in 800AD was 300 million, most people are probably descended from a pool that was at most a third of the world's population, or about 100 million in 800 AD. With that in mind, the number of potential ancestors is only one ten-thousandth of the number of spaces in the chart forty generations back.
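
      The shortfall can be put in numbers by dividing the slots in the chart by the people available to fill them. The sketch below uses only the population figures quoted in the text; the names CHART_SLOTS, pools, and average_appearances are made up for this illustration.

```python
# Slots in the fortieth generation back of one person's ancestry chart.
CHART_SLOTS = 2 ** 40

# Pool sizes quoted in the essay (not independent data).
pools = {
    "generous world figure (1 billion)": 1_000_000_000,
    "scholars' world estimate (300 million)": 300_000_000,
    "regional pool (100 million)": 100_000_000,
}


def average_appearances(slots: int, pool: int) -> float:
    """Average number of times each person in the pool must appear
    if the pool has to fill every slot in the chart."""
    return slots / pool


for label, pool in pools.items():
    times = average_appearances(CHART_SLOTS, pool)
    print(f"{label}: each name appears about {times:,.0f} times")
```

      The first and last results are the "one thousand" and "ten thousand" of the text, give or take rounding.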

      This tremendous shortfall of potential ancestors implies that almost anyone alive and reproducing in our area of geographic interest in 800AD had a high probability of being an ancestor, that persons descended from one geographic area have to have many ancestors in common, and that, among any one person's ancestors, intermarriage of cousins was almost inevitable. The three apparent improbabilities discussed at the beginning of this essay (famous ancestry, extensive cousinhood, and self-cousinhood) now seem like inevitabilities.


      For the last few paragraphs, we've focused on 800 AD, or forty generations back. The plot below is a more general treatment of the issue. The red curves show the number of ancestors back in time for a person born in 2000 AD (you can shift them to the left forty years if you're forty years old - it won't matter much). There are three red curves to show what happens if one assumes that an average generation has lasted 20 years, 25 years, or 30 years. The orange curves show world population and some fractions thereof. Someone named Hans Memboko Xiao may be descended from the entire world population of a few centuries ago, but many of us may be descended from smaller pools that were at most one fourth or one sixth of the total world population.


      The plot lets you pick your likely length of generations (a red line) and the fraction of the world population in which you think your ancestors lived and mated (an orange line). The point at which those two curves cross defines a time on the horizontal axis at the bottom. Earlier than that, your ancestry was inevitably redundant: you had more ancestors than there were people to be ancestors. The plot shows that, even with the most generous assumptions, the time of required redundant ancestry can be no earlier than about 1150 AD. For most of us, it's probably a century or two later. For someone whose ancestral pool has been a small subset of the human population, required redundancy may be as recent as the 1500s.
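
      The crossing point can be roughed out without the plot. The sketch below makes a loud simplifying assumption: it holds the ancestral pool at a fixed size, whereas the orange curves in the plot change with time, so its dates will only approximate those read from the plot. The pool sizes are the 300 million and 100 million figures used earlier in this essay, and the function names are invented for this illustration.

```python
def first_redundant_generation(pool_size: int) -> int:
    """Smallest n for which 2**n exceeds pool_size, i.e. the first
    generation back that has more chart slots than available people.

    ASSUMPTION: pool_size is treated as constant over time.
    """
    n = 0
    while 2 ** n <= pool_size:
        n += 1
    return n


def redundancy_year(pool_size: int, years_per_generation: int,
                    birth_year: int = 2000) -> int:
    """Approximate year before which ancestry must be redundant."""
    generations = first_redundant_generation(pool_size)
    return birth_year - generations * years_per_generation


# Generation lengths from the red curves; pool sizes are assumptions
# borrowed from the figures quoted earlier in the essay.
for pool in (300_000_000, 100_000_000):
    for years in (20, 25, 30):
        print(f"pool {pool:,}, {years}-year generations: "
              f"about {redundancy_year(pool, years)} AD")
```

      Even this crude version lands in the same range as the plot: roughly the 1100s for long generations and a large pool, and the 1400s for short generations and a smaller pool.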

      Persons concerned with details will note minor inaccuracies of this plot. One is that not everyone in past populations reproduced to have descendants today: either they had no children, or all their offspring had no offspring. This means each orange curve could be moved down a little on the plot and relabeled "reproducing human population". A second quibble is that recent redundancy in ancestry - grandparents who were first cousins, for example - lessens the number of possible ancestors any number of generations back. This means that the red lines would be a little lower, or wouldn't rise as quickly to the left. The first factor moves the time of required redundancy to a later date, and the second shifts it to an earlier date. The two factors combined probably move the time of required redundancy at most to a date 40 years earlier or later than that shown.


      So what does all this mean, other than that your braggart neighbor may really be descended from Charlemagne, or Confucius, or Asoka?  It means that you may be, too.  More importantly, it means that we're all considerably more inter-related than we might think.  The staggering number of ancestors each of us has (for example, perhaps 200 million of them in 1300 AD) also means that your ancestry is probably more diverse than you think.  Somewhere among those 200 million people who were your ancestors in 1300, there are probably some folks who came from places you wouldn't think likely, or who were members of ethnic groups that would surprise you.  You'll never know who all those 200 million people were, but they can be abundant fuel for your imagination.

      Finally, the most important point to appreciate is that we're all more closely related than commonly supposed.  Despite our apparent differences across nationalities, cultures, and religions, we have more in common than we usually think.  If ties of kinship matter to us, we're all closer kin than our attitudes and policies would often suggest.




A recent scientific paper on this topic and related implications regarding genetics is Manrubia, S.C., Derrida, B., and Zanette, D.H., 2003, Genealogy in the era of genomics: American Scientist, v. 91, p. 158-165.

This page was generated in November 2000 and most recently revised on 12 March 2003. The author thanks Father Lazarus of the St Nectarios Orthodox Monastery in Richford, Vermont for his insightful comments, which contributed to the revisions.  

This page is a frivolous waste of time on the part of Bruce Railsback.