Remembering Ancestors

By Paul VanRaden

©2025

 

Topics:

         Displaying ancestors in a pedigree

         Ancestry origins from pedigree

         Ancestry regions from DNA

         Finding DNA matches for relatives

         Predicting traits

         Data sources

         Relationships, inbreeding, and heterosis

         Conclusions

         References

 

Displaying ancestors in a pedigree

Beginning in 1993, we wrote programs at USDA to display ancestor information from a database on a screen. Most pedigree displays become very wide with many generations and then leave much empty space at the left for the recent generations. The display style we invented lists standard data fields such as birth date together in one long column. After those fields, each generation indents only a few spaces more than the last, greatly reducing the width and giving a vertical pedigree with one ancestor per row that can be easily emailed or printed on standard paper. I used those ideas for my own pedigree below. This style could be programmed to provide the same report for any pedigree by pulling data from a database or an export file, but I had to type this one by hand:

         VanRaden Pedigree
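As a rough illustration only, the vertical layout described above (fixed fields first, then a name indented a few spaces per generation) could be sketched in Python as follows. The Person class, its field names, and the sample people are hypothetical and are not taken from the USDA programs or from the pedigree document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    """One pedigree entry; all field names here are hypothetical."""
    name: str
    born: str = ""
    father: Optional["Person"] = None
    mother: Optional["Person"] = None


def pedigree_rows(person, max_gen=4, gen=0, indent=2):
    """Return one text row per ancestor.

    Fixed-width fields (generation, birth year) come first, then the
    name, indented a few spaces more for each generation back.
    """
    rows = [f"{gen:>3} {person.born:<10} {' ' * (indent * gen)}{person.name}"]
    if gen < max_gen:
        # Father's line is listed before mother's line at every level.
        for parent in (person.father, person.mother):
            if parent is not None:
                rows.extend(pedigree_rows(parent, max_gen, gen + 1, indent))
    return rows


# Made-up family used only to show the layout.
child = Person(
    "Child", "1990",
    father=Person("Father", "1960", father=Person("Grandpa", "1930")),
    mother=Person("Mother", "1962"),
)
rows = pedigree_rows(child)
print("\n".join(rows))
```

Because every row has the same leading fields, the report stays narrow no matter how many generations are listed, and the indentation alone shows who descends from whom.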

 

Ancestry origins from pedigree

People have wanted to know where their ancestors lived, how long they lived, and stories about them since Biblical times. Often, historians recorded only the all-male path from father to grandfather to great grandfather, etc., or from father to son to grandson, etc. because family names, property, and power were often inherited that way, such as with kings. Many people even worship a father and a son but not a mother or a daughter. But in this last century, females may own property, hold power, have equal rights such as to vote, and deserve their history (herstory) told also. Hallelujah!

My nieces Hillary King and Andrea Sabaka found or stored about 230 of my ancestors in a commercial database (ancestry.com) and added many pictures and stories about their lives. We Americans had forgotten much of this history, but our German cousins wrote a 100-page book about our VanRaden ancestors and those families. Then I transferred the key fields into the pedigree document above and used that to trace where our family’s DNA is from. My sister Judy Winship thought of giving each generation a different color to help see the connections. Humans will never get pedigrees as deep and accurate as cattle because farmers gave cattle a 100-year head start and because cattle generations take only 2-5 instead of 20-40 years.

Pedigree depth averages about 7 generations for me and my sibs, 8 generations for our kids, and 9 for their kids. For 3 of my 4 grandparents, all 14 paths trace back to my 2nd or 3rd great grandparents who immigrated to northwest IL from Germany around 1850. My father’s ancestors were all from northwest Germany close to the Netherlands, and my maternal grandmother’s were from southwest Germany near Stuttgart, with one from Frankfurt. Of my maternal grandfather’s ancestors, 94% trace to northeast U.S. states (CT, MA, NH, RI) and only 6% go back to my 7th to 9th great grandparent immigrants from England. In the 1700s, most New England colonists were from England and my ancestors’ last names sound British, so my pedigree is 75% German, 2% British, and 23% probably British.
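The percentages above follow from giving each grandparent one quarter of the pedigree. A minimal sketch of that arithmetic is below; the function name and the region labels are made up for illustration, and the 94% and 6% inputs are the rounded values from the text.

```python
def pedigree_origins(grandparents):
    """Sum origin fractions over 4 grandparents, each weighted 1/4."""
    totals = {}
    for origins in grandparents:
        for region, fraction in origins.items():
            totals[region] = totals.get(region, 0.0) + 0.25 * fraction
    return totals


grandparents = [
    {"German": 1.0},                               # paternal grandfather
    {"German": 1.0},                               # paternal grandmother
    {"German": 1.0},                               # maternal grandmother
    {"British": 0.06, "probably British": 0.94},   # maternal grandfather
]

shares = pedigree_origins(grandparents)
for region, share in shares.items():
    print(f"{region}: {100 * share:.1f}%")
```

With these inputs the sketch gives 75% German, 1.5% British, and 23.5% probably British. The text reports the last two as 2% and 23%, a difference that probably comes from the 94% and 6% already being rounded.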

 

Ancestry regions from DNA

My genomic origins were 63% western Europe (45% northwest Germany, 11% Netherlands, 7% southern Germany), 34% southeast England or northwest Europe, 2% north Wales or northwest England, and 1% Czechia to the east of Germany. My 11% from the Netherlands makes sense since my paternal ancestors lived just across the border and spoke the Low German (Plattdeutsch) language common across northern Germany and the northeast Netherlands.

The only surprise was 1% Czech DNA which is why we check our DNA. Generally, a pedigree listing the former hometown of each immigrant ancestor is more precise than a DNA test unless that immigrant had some earlier ancestors from far away. But the DNA test more precisely estimated origins for the 23% of my pedigree that ended in New England and did not trace back to immigrants. My previous job at USDA included estimating genomic origins for millions of cows. For cows or for dogs, scientists use DNA to check breeds of origin instead of places of origin.

 

Finding DNA matches for relatives

DNA testing companies can find close relatives also tested by that same company, and 25% of my DNA matches my niece Hillary’s DNA, as expected. My highest other matches were 5-7% with 2 sons of my paternal grandmother’s sisters and with 1 granddaughter of my maternal grandfather’s brother. She is my second cousin, but Ancestry suggested her as a 1st cousin once removed or half first cousin. Some relatives share more or fewer DNA segments than average, like not always getting 11 heads and 11 tails when flipping a coin 22 times. Each parent has 2 sets of chromosomes, and at each location you get their DNA from their mother or their father, but not both. Ancestry correctly guessed which relatives were on my Dad’s side or my Mom’s side because my great grandparents each had many descendants to help sort the DNA into individual chromosomes, a mathematical process called phasing.
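The expected shares mentioned above can be worked out by halving once for each generation on the path through each shared ancestor. The sketch below is a simplified textbook calculation, not Ancestry's matching method; the function name is hypothetical. It also computes how often 22 coin flips give exactly 11 heads.

```python
from math import comb


def expected_shared(up, down, common_ancestors=2):
    """Expected fraction of DNA shared through the common ancestors.

    up / down: generations from each of the two relatives back to the
    shared ancestors (usually a couple, so 2 common ancestors).
    """
    return common_ancestors * 0.5 ** (up + down)


niece = expected_shared(1, 2)                  # 0.25
cousin_once_removed = expected_shared(3, 2)    # 0.0625
second_cousin = expected_shared(3, 3)          # 0.03125
half_first_cousin = expected_shared(2, 2, common_ancestors=1)  # 0.0625

# Chance of exactly 11 heads in 22 fair coin flips, about 0.17.
p_exact_half = comb(22, 11) / 2 ** 22

print(niece, cousin_once_removed, second_cousin, half_first_cousin)
print(round(p_exact_half, 3))
```

A second cousin is expected to share about 3% on average, so a match in the 5-7% range looks more like the 6.25% expected for a first cousin once removed or a half first cousin, which is consistent with the suggestion Ancestry made.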

Ancestor discovery for cows is much more accurate because we have DNA from 10 generations of ancestor bulls and very complete and accurate pedigrees for all bulls across the whole world. Future generations of humans will have more of their ancestors’ DNA already analyzed. That will simplify the process of connecting relatives because each next generation can just confirm their direct relationship to their ancestors’ DNA instead of indirect relationships to cousins’ DNA. Many dairy farmers across the world now trust our programs to create pedigrees for their cows and calves using the DNA matching system that my coworkers and I developed at USDA. Parent matching is very easy, and more than 2 million unknown grandparents and great grandparents have already been discovered by DNA and automatically added to the global cow pedigree file.
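One common way to see why parent matching is easy is to count markers where a candidate parent and an offspring are opposite homozygotes, which cannot happen for a true parent apart from genotyping errors. The sketch below assumes genotypes coded 0, 1, or 2 copies of one allele. The function name and toy genotypes are hypothetical, and this is a simplified illustration rather than the USDA system itself.

```python
def conflict_rate(candidate, offspring):
    """Fraction of opposite-homozygote conflicts (0 vs 2).

    Only markers where both animals are homozygous (0 or 2) are counted;
    missing genotypes are given as None and skipped.
    """
    usable = conflicts = 0
    for p, o in zip(candidate, offspring):
        if p is None or o is None:
            continue
        if p != 1 and o != 1:          # both homozygous
            usable += 1
            if p != o:                 # 0 vs 2: impossible for a true parent
                conflicts += 1
    return conflicts / usable if usable else None


# Toy genotypes for 8 markers, invented for illustration.
offspring = [0, 1, 2, 2, 0, 1, 2, 0]
true_parent = [0, 0, 2, 1, 1, 2, 2, 0]
unrelated = [2, 1, 0, 2, 2, 0, 0, 0]

print(conflict_rate(true_parent, offspring))
print(conflict_rate(unrelated, offspring))
```

With many thousands of markers, a true parent shows almost no conflicts while an unrelated animal shows many, so even a simple cutoff separates them. Grandparents are harder because they can legitimately conflict with a grandchild at some markers.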

 

Predicting traits

Ancestry.com also predicted many traits as yes/no or high/low and a few traits had high/average/low. For predictions of my traits, 28 seemed correct but 19 seemed wrong, a success rate of about 60%. That rate may not be very impressive if one of the two categories is much more frequent than the other. For example, if 15% of people have freckles you could get 85% correct predictions by always guessing no freckles with no data. Some Ancestry traits use markers known from literature to have large effects but most traits use survey replies from their DNA customers. Strangely, they did not predict height, which is highly heritable and was the first trait that human geneticists studied in detail in 2010. That study used almost the same genomic methods we developed at USDA and used for predicting cow stature since 2008.
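The freckles example can be checked with a few lines of arithmetic. This sketch compares the observed success rate from the counts above with the accuracy of always guessing the more common category; the function name is hypothetical.

```python
def majority_baseline(prevalence):
    """Accuracy from always guessing the more common of two categories."""
    return max(prevalence, 1.0 - prevalence)


observed = 28 / (28 + 19)             # my traits: 28 correct, 19 wrong
freckles = majority_baseline(0.15)    # always guess "no freckles"

print(f"observed accuracy: {observed:.3f}")
print(f"no-data baseline for a 15% trait: {freckles:.2f}")
```

A prediction is only informative for a trait if it beats that trait's own no-data baseline, so a single overall success rate across many traits is hard to interpret without knowing how common each category is.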

Predictions for 50 cattle traits each come with a percent reliability, a squared correlation that estimates how accurate the prediction is. The trait predictions are expressed using 2 or 3 decimal digits instead of binary yes/no. Genetic effects can be estimated very precisely even if the observed traits are predicted less well due to low heritability. Farmers use the predicted traits to directly manage the current generation or to select the next generation, and most cattle traits have known economic values. Thus, farmers can easily profit from cattle DNA predictions available since 2008 and pedigree predictions since 1926. Human DNA tests are mostly for entertainment. Trait predictions began much earlier at 23andMe (off and on since 2007 due to changes in regulations) than at Ancestry.com (since 2018).

 

Data sources

         The technology to read DNA is almost the same across many different species. For humans, several different companies collect DNA samples and store pedigree data. Ancestry.com had sold over 28 million DNA tests as of 2025 and 23andMe had sold 14 million DNA tests as of 2024. The largest cattle DNA database, begun by USDA and managed by the Council on Dairy Cattle Breeding, has over 11 million DNA tests as of 2025.

For humans, competing companies likely do not yet share pedigree data. Even within Ancestry.com, their German database had pedigrees for several older ancestors that were not in the American database. For dairy cattle, Interbull in Sweden has received full pedigrees from about 25 countries for all commercially available dairy bulls since 1995. Interbull then sends to each country a complete file of combined and edited pedigrees from all the world’s bulls for several different breeds. Dairy cattle breeders want to see the full global pedigrees because most Holstein bulls have foreign fathers or grandfathers. Cattle are not forced to stay within country borders. Cows are one big, happy family across the whole world. Hallelujah!

Knowing names and locations of ancestors is not as interesting as learning stories or seeing pictures from their lives. For example, our German cousins documented the lives of our VanRaden ancestors that we American descendants had forgotten:

         VanRaden Genealogy

Relationships, inbreeding, and heterosis

Inbreeding coefficients are very important in livestock breeding because progeny may be less healthy if parents are too related. Most humans are much less inbred than most livestock. I did not get either DNA or pedigree estimates of inbreeding from Ancestry, but my pedigree inbreeding may be 0 because I did not notice the same names among my maternal and paternal ancestors. We could imagine finding 5 or 10 more generations of ancestry and then my inbreeding coefficient might be a tiny bit above 0 but nothing to worry about.

The pedigree methods that quantify how inbred or how related any 2 individuals are to each other were published in 1922 by geneticist Sewall Wright from the same USDA research center where I worked. His parents were cousins, which likely got him interested in the topic. The example in his paper was a cow pedigree, but his methods are now used for all species. Then in 2008 I published DNA-based methods to estimate how inbred and how related cows are to each other. Those methods are now used for DNA predictions of traits for countless other species, but the DNA relationships from most companies now use haplotypes instead of genotypes after statistically separating the 2 sets of chromosomes into maternal vs. paternal DNA.
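As a sketch of how pedigree relationships and inbreeding can be computed, the code below uses the recursive "tabular" approach, a widely used way to obtain the coefficients Wright defined (it is not necessarily the path-counting form used in his paper). The function name and the 9-person pedigree are hypothetical. The example assumes the parents are first cousins, which the text above does not specify.

```python
def relationship_matrix(pedigree):
    """Additive relationships from a list of (id, parent1, parent2).

    Parents must be listed before their offspring; use None for an
    unknown parent. Inbreeding of individual i is A[i][i] - 1.
    """
    ids = [row[0] for row in pedigree]
    index = {name: i for i, name in enumerate(ids)}
    n = len(ids)
    A = [[0.0] * n for _ in range(n)]
    for i, (_, p1, p2) in enumerate(pedigree):
        s, d = index.get(p1), index.get(p2)
        for j in range(i):
            # Relationship to an earlier individual is the average of
            # that individual's relationships to i's two parents.
            a_js = A[j][s] if s is not None else 0.0
            a_jd = A[j][d] if d is not None else 0.0
            A[i][j] = A[j][i] = 0.5 * (a_js + a_jd)
        a_sd = A[s][d] if s is not None and d is not None else 0.0
        A[i][i] = 1.0 + 0.5 * a_sd
    return index, A


# Hypothetical pedigree: child K whose parents C1 and C2 are first cousins.
pedigree = [
    ("GP1", None, None), ("GP2", None, None),   # shared grandparents
    ("X1", None, None), ("X2", None, None),     # unrelated spouses
    ("S1", "GP1", "GP2"), ("S2", "GP1", "GP2"), # full siblings
    ("C1", "S1", "X1"), ("C2", "S2", "X2"),     # first cousins
    ("K", "C1", "C2"),
]
index, A = relationship_matrix(pedigree)
inbreeding_K = A[index["K"]][index["K"]] - 1.0
print(A[index["C1"]][index["C2"]], inbreeding_K)
```

In this example the first cousins have a relationship of 0.125 and their child has an inbreeding coefficient of 0.0625, half the relationship between its parents.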

People in the Americas might be a little less inbred than people within Europe, Asia, or Africa whose ancestors all lived in the same nation or small region of it. DNA can quantify those differences. For example, my DNA may be a little more diverse than my father’s because 100% of his DNA came from northwest Germany whereas I have 75% German and 25% British ancestors. My mother may have lower DNA inbreeding than both of us because her father had British DNA and her mother German DNA with even less chance of any common ancestors. My daughter has no inbreeding and some positive heterosis with a mixture of European and African DNA. Latin Americans often have mixes of Native American, European, and African DNA. The term heterosis describes effects that cannot be traced by pedigrees when the 2 parents have DNA from different populations that were separate for many generations. Inbreeding and heterosis are similar topics.

 

Conclusions

         Genealogy was a hobby for millions of people but was a big business for livestock breeders for more than a hundred years after breeds were defined and trait data collection began. Large databases allowed farmers and breeding companies to predict and to select for traits they wanted livestock to have. Knowing about your pedigree and your DNA is still mostly a hobby and a curiosity for humans but is beginning to be useful for separating genetic from environmental causes in medical treatment. Future generations may find services offered by genealogy companies more useful or even get better DNA than we got. Our genes have big effects on many of our traits, and better understanding our DNA could improve everybody’s lives in the future.

 

References

Sewall Wright, 1922. Coefficients of inbreeding and relationship.

Discovering ancestors and connecting relatives in large genomic databases - ScienceDirect

Confirmation and discovery of maternal grandsires and great-grandsires in dairy cattle - Journal of Dairy Science

Fast two-stage phasing of large-scale sequence data: The American Journal of Human Genetics

Accounting for Inbreeding and Crossbreeding in Genetic Evaluation of Large Populations - ScienceDirect

Efficient methods to compute genomic predictions - PubMed

Invited Review: Reliability of genomic predictions for North American Holstein bulls - ScienceDirect

How Common Are Freckles? Global Statistics

Common SNPs explain a large proportion of the heritability for human height | Nature Genetics

Timeline for 23andMe DNA testing

Countries Where AncestryDNA® is Available

Ancestry Launches a New Take on Genetic Traits

How We Develop Traits - Ancestry DNA

TraitsPredictionWhitePaper_042024

Impact of Genetic Testing on Human Health: The Current Landscape and Future for Personalized Medicine - PMC

Genes for the Next Generation

 
