
How to use shared matches in DNA family history

DNA Advisor Karen Evans tells us how shared matches can help us to solve genealogical puzzles and knock down stubborn family history brick walls!


DNA testing can feel overwhelming. It has a ‘sciency’ and ‘mathsy’ look to it that conjures school homework rather than a helpful tool for your family history. That can be rather off-putting. And then there is all the vocabulary!
To help us, let’s focus on just one very exciting aspect of DNA testing: your shared matches.

What is a DNA match?

Whichever company you test with, you are going to have a list of DNA matches. These are people who have tested with the same company (or have uploaded their results from another company, if your test company allows this) and share DNA with you.
Your list will, by default, be ordered from matches who share the most DNA with you at the top, descending to the company’s cut-off point. 

Tell me more about cut-off points

Ancestry and MyHeritage, for example, set a cut-off point at 8 centiMorgans (cM). This means that they show you only matches who share 8cMs or more with you. (The centiMorgans could be spread across several segments.)
23andMe meanwhile usually requires at least one continuous segment of matching DNA that is longer than 7cMs.
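
For readers who keep their match data in a spreadsheet or script, the two kinds of cut-off described above can be sketched in a few lines of Python. This is a simplified illustration only: the `Match` class, the function names and the sample figures are all hypothetical, and the companies' real matching rules involve more than a single threshold.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """Hypothetical record for one DNA match."""
    name: str
    segments_cm: list[float]  # length of each shared segment, in cM

    @property
    def total_cm(self) -> float:
        return sum(self.segments_cm)

    @property
    def longest_cm(self) -> float:
        return max(self.segments_cm, default=0.0)


def passes_total_cutoff(match: Match, min_total_cm: float = 8.0) -> bool:
    """Total-based rule: 8 cM or more overall, possibly across several segments."""
    return match.total_cm >= min_total_cm


def passes_segment_cutoff(match: Match, min_segment_cm: float = 7.0) -> bool:
    """Segment-based rule: at least one continuous segment longer than 7 cM."""
    return match.longest_cm > min_segment_cm


# Invented sample data, purely for illustration
matches = [
    Match("A", [5.0, 4.0]),  # 9 cM in total, but the longest segment is 5 cM
    Match("B", [7.5]),       # 7.5 cM in total, in a single segment
    Match("C", [3.0, 2.0]),  # 5 cM in total
]

shown_by_total = [m.name for m in matches if passes_total_cutoff(m)]
shown_by_segment = [m.name for m in matches if passes_segment_cutoff(m)]
```

The point of the sketch is that the same person can appear on one company's list and not on another's, depending on whether the rule looks at the total or at a single segment.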

What is the significance of the amount of DNA we share?


In theory, matches who share more DNA with you are people who you will be able to identify in your tree, as they are more closely related than those who share less DNA. As a general rule, we say a match who shares 30cMs or more with you shares a common ancestor most likely within the last 5 or 6 generations and therefore could be identified using traditional research. Once we get to matches who share just 20cMs of DNA with us, they may be your 4th cousins, 7th cousins, 10th cousins or even false matches. 
The less DNA we share with a match, the more remote our shared ancestor and the greater the chance of a false match.
 

How Ancestry’s Timber algorithm works to avoid false matches

Ancestry developed the Timber algorithm to downweight shared segments that are unlikely to be from a common ancestor. You can see this in action on matches who share 90cMs or fewer with you on your match list. If you want to know more, check out this helpful web page: www.ancestry.co.uk/cs/dna-help/matches/whitepaper

1. In this example, we see that Pat and Brendan are shown as sharing 59cM in the match list.
2. Clicking on the hyperlinked shared DNA information reveals three bullet points. These report the number of segments of DNA over which the cM are shared; the number of cM shared prior to the application of the Timber algorithm (in this case 69cM); and the longest segment of DNA shared (in this case 38cM).

What is a shared match?

Shared matches are people who have tested with the same DNA testing company, and who are on both your match list and your match’s match list. This is a game changer in DNA terms! Put simply, shared matches help us create clusters of matches.
How does it work in practice? We have a list of matches, and when we look at a specific match we can see that each company gives us a ‘shared match’ option. 23andMe call shared matches ‘Relatives in Common’ and FamilyTreeDNA refer to them as ‘In Common With’.
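
In data terms, a shared match list is simply the overlap between two match lists. The minimal Python sketch below shows the idea; the names and the `shared_matches` function are invented for illustration, and each company may apply its own thresholds before showing a shared match.

```python
def shared_matches(my_matches: set[str], their_matches: set[str]) -> set[str]:
    """People who appear on both my match list and my match's match list."""
    return my_matches & their_matches


# Invented example lists
my_list = {"Bob", "Alex", "Chris", "Dana"}
bobs_list = {"Me", "Alex", "Chris", "Erin"}

shared_with_bob = shared_matches(my_list, bobs_list)
```

Here Alex and Chris are on both lists, so they are the shared matches with Bob. Dana matches only me, and Erin matches only Bob.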


Shared matches help us with two major pieces of information:

  1. Each of these clusters should point to a branch of our tree, because every person within that cluster most likely shares a common ancestor or ancestral couple. In genetic genealogy we are trying to work out which branch that cluster belongs to. Sometimes it’s easy – attached trees or hints from a tool identify links and we can then do our own research to validate the finding.
  2. Once we know how someone in a cluster ‘fits’ into our tree, we can use that information to find out how the shared matches within that cluster fit into our tree.
     

Enhanced shared matches

Three testing companies give us another layer of shared matching. MyHeritage, 23andMe, and Ancestry (depending on the level of subscription) allow us to see how much DNA our shared matches share with each other. 
Why is this helpful? Well, let’s say you have a match, ‘Bob’, who shares a decent amount of DNA with you, but has no tree and offers no clear way of finding out who he might be. You look at your shared matches with Bob and see that ‘Alex’ shares enough DNA with Bob to be his sibling. There is also another match, ‘Chris’, who shares 500cMs with Bob. Now you have three matches who are closely related; perhaps one of them has a tree or information on their profile that will give you the clues you need to start a Quick and Dirty tree. You’re actually working on three matches at once rather than on Bob, Alex and Chris as individuals.
 

Solving DNA family history puzzles with shared matches

As I explained above, shared matches enable us to create clusters that point to different branches of our tree.
If you have a well-researched tree, with good documentary evidence, and a DNA cluster supports that evidence with genetic matches, this gives you another layer of validation.

Why eliminate clusters?

One of the main reasons for using a DNA test for family research is to try to identify the father of an illegitimate ancestor. Autosomal DNA tests can be a fantastic tool in helping us find a missing parent, grandparent, great-grandparent or even great-great-grandparent in our tree. We can do this by a process of elimination – removing matches (and their shared match cluster) that we have identified as belonging to a known part of our tree, and then working on the unknown clusters that ‘could’ help us to identify a missing ancestor. 

How to eliminate DNA clusters

1

Cluster your matches

Here is a step-by-step guide to doing this with a missing paternal grandfather, but the same technique will work for a missing closer or more distant ancestor. I call it ‘cluster, research, remove, repeat’. I’m using Ancestry in my example as it has the largest database and clear clustering tools.

We are going to begin with top-level clustering – simply aiming to divide our DNA matches into two large clusters – one of which matches our father’s side of the tree; the other our mother’s. As long as you can identify one side, you can, by inference, identify the other.
On Ancestry you can see matches as ‘Parent 1’ and ‘Parent 2’. This tool is known as SideView, and basically clusters your matches into two large groups. 

2

Research your clusters

If I’m looking for my missing paternal grandfather, I want to find out which of these two groups represents my mother’s lines.
If I’ve attached a tree to my results that can help. Companies such as Ancestry and MyHeritage will offer hints that may point me to matches that include a surname in common or a shared ancestor. I may have to do more work to identify which is the maternal and paternal side (and you should always verify the hints yourself anyway).

3

Remove clusters that do not relate to the focus of your search

Once I know which matches belong to the maternal side of my tree, I can label them and ‘remove’ them from my missing paternal grandfather research. I am not physically removing them from the database (I’m looking forward to working on those matches later!). I’m removing them from any further search for my paternal grandfather.
 

4

Repeat the process

I’m now ‘left’ with my paternal matches. These are made up of matches from my paternal grandmother’s lines and my unknown paternal grandfather’s lines.
I can now return to Step 1 and the remaining clusters.

If I haven’t already done so, I will now create the clusters and this is where the shared matches come into play. As above I may need to research to verify my deductions. Once I’ve clustered the remaining matches, I can remove those clusters that relate to my known paternal grandmother’s line and am now (in theory) left with the unknown paternal grandfather’s lines. These are the groups I will focus my research on.
If I were looking another generation back I would then cluster again to remove more known lines before focusing on the research element. 
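
The ‘cluster, research, remove, repeat’ cycle can be sketched as a simple filtering loop. The Python below is only an illustration under stated assumptions: the cluster labels, match names and the `remove_known` helper are hypothetical, and the research step (working out which cluster belongs to which line) is done by hand, not by the code.

```python
def remove_known(clusters: dict[str, set[str]], known: set[str]) -> dict[str, set[str]]:
    """Keep only the clusters whose labels have not been identified as known lines."""
    return {label: members for label, members in clusters.items() if label not in known}


# Round 1: the top-level split into two parental groups
top_level = {
    "Parent 1": {"Ann", "Ben"},
    "Parent 2": {"Cal", "Dee", "Eve", "Fay"},
}
# Research (done by hand) shows that "Parent 1" is the maternal side
paternal = remove_known(top_level, known={"Parent 1"})

# Round 2: the remaining paternal matches, re-clustered using shared matches
paternal_clusters = {
    "Cluster A": {"Cal", "Dee"},
    "Cluster B": {"Eve", "Fay"},
}
# Research shows that "Cluster A" belongs to the known paternal grandmother
focus = remove_known(paternal_clusters, known={"Cluster A"})
```

Nothing is deleted from the original data. Each round just narrows the working set, leaving `focus` as the clusters that might point to the unknown grandfather.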

Interrogating your DNA matches

Now that I have potentially identified matches which point to my missing grandfather, I can look for clues.
This is my favourite bit! I’m going to take each cluster I’ve kept and see if there are any links between the matches. If I’m lucky, there will be several well-built trees that show a common ancestor among the matches – but that is very unusual. Often there might be a few trees which may show a geographical area in common.
The aim is to find the Most Recent Common Ancestor (MRCA) in a cluster by building out trees. (Note that usually you will be related to an MRCA couple; sometimes you will just be related to one person in a couple, indicating a half relationship).
To help me identify the MRCA, I spend a great deal of time creating Quick and Dirty trees to find how the matches link.

I have the MRCA (Most Recent Common Ancestor), now what?


I have an MRCA couple that I’m related to through my unknown paternal grandfather, but I still need to find that link. I’m going to build the MRCA’s tree both back and out. The couple could be, for example, the parents of my missing grandfather, but they may be his grandparents, his sister’s line or his uncle’s family.
I need to look at how a man who fathered my dad in 1940 might link. If the MRCA couple were born in 1800 it would be unlikely that a son would be my grandfather, but their grandson or even their great grandson could be. All this research will draw on the skills you have been honing for many years on your family history journey.
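
The date reasoning above can be turned into a rough plausibility check. The Python sketch below is illustrative only: the age ranges (parents aged 18 to 45 when a child is born, a father aged 16 to 70 at conception) are assumptions chosen for the example, not genealogical rules, and the function name is hypothetical.

```python
def plausible_generations(
    mrca_birth_year: int,
    conception_year: int,
    parent_age: tuple[int, int] = (18, 45),  # assumed age range of a parent at a child's birth
    father_age: tuple[int, int] = (16, 70),  # assumed age range of the unknown father
    max_generations: int = 3,
) -> list[int]:
    """Return the generations below the MRCA (1 = child, 2 = grandchild, ...)
    whose possible birth years allow a man of fathering age in conception_year."""
    plausible = []
    for generation in range(1, max_generations + 1):
        earliest_birth = mrca_birth_year + generation * parent_age[0]
        latest_birth = mrca_birth_year + generation * parent_age[1]
        youngest = conception_year - latest_birth   # age if born as late as possible
        oldest = conception_year - earliest_birth   # age if born as early as possible
        # Plausible when the possible age range overlaps the fathering age range
        if oldest >= father_age[0] and youngest <= father_age[1]:
            plausible.append(generation)
    return plausible


generations = plausible_generations(mrca_birth_year=1800, conception_year=1940)
```

With these assumed ranges, a son of a couple born in 1800 would be at least 95 in 1940 and is ruled out, while a grandson or great grandson remains possible. Changing the assumed ages will change the answer, so treat the result as a prompt for research and not as proof.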
As I build the MRCA’s tree out I am hoping to find out how the clusters relating to my missing grandfather link. For example, I might find the grandson of an MRCA in one cluster married the great granddaughter of the MRCA in another cluster. If that happens, I’m a very happy person!
As I get more laser-focused on a potential grandfather, I can look for links in documentation between him and my father’s family. Did he live in the same town, road or house as my father’s mother around the time my father was conceived?
 

I know nothing about my family, what now?

Even if you have very little information about your family history, shared matches are still the key. Usually when I work on a new match list I don’t look to see how it relates to any tree. I want to cluster the matches and then see if the clusters and the tree fit together – that way I can avoid confirmation bias. I will work down the match list creating clusters that I label either numerically or alphabetically. 

When you start clustering your matches you need to be aware that very high matches are not a good starting point. Why is this? It’s because you are trying to cluster your matches into different branches of your tree; choosing a sibling as the start of a cluster would pull in every match from both of your parents’ sides. Similarly, choosing a first cousin would include all shared matches who share a set of grandparents. Ideally, I start at 2nd cousin level to ensure I create clusters that belong to a branch for each grandparent as the minimum. But as I cluster I often notice more than four groups, which indicates I am creating clusters that highlight a great grandparent.
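
Working down a match list and building clusters from shared matches can be sketched as follows. This is a minimal illustration, not a description of any company’s tool: the function name, the sample data and especially the 400cM ceiling used to skip very close matches are assumptions made for the example.

```python
def cluster_matches(
    match_cm: dict[str, float],
    shared: dict[str, set[str]],
    max_seed_cm: float = 400.0,  # hypothetical ceiling for skipping very close relatives
) -> dict[int, set[str]]:
    """Work down the match list from highest to lowest cM. Each match that is
    not too close and not yet clustered starts a new, numbered cluster made up
    of itself and its shared matches."""
    clusters: dict[int, set[str]] = {}
    assigned: set[str] = set()
    for name in sorted(match_cm, key=match_cm.get, reverse=True):
        if match_cm[name] > max_seed_cm or name in assigned:
            continue
        # Leave very close relatives out of the cluster too, as they
        # would otherwise appear in almost every group
        members = {name} | {
            other for other in shared.get(name, set())
            if match_cm.get(other, 0.0) <= max_seed_cm
        }
        clusters[len(clusters) + 1] = members
        assigned |= members
    return clusters


# Invented sample data: shared cM with me, and each match's shared matches
match_cm = {
    "Sibling": 2600, "First cousin": 850,
    "Pat": 250, "Quinn": 210, "Ravi": 180, "Sam": 120,
}
shared = {
    "Pat": {"Sibling", "First cousin", "Ravi"},
    "Quinn": {"Sibling", "Sam"},
    "Ravi": {"Sibling", "First cousin", "Pat"},
    "Sam": {"Sibling", "Quinn"},
}

clusters = cluster_matches(match_cm, shared)
```

In this invented example the sibling and first cousin are skipped, Pat and Ravi form cluster 1, and Quinn and Sam form cluster 2. The clusters are built without reference to any tree, which is what helps to avoid confirmation bias.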
 

Key take-aways for working with shared matches for DNA family history research


Here are the main points to consider when solving puzzles using shared matches:

  • Clustering your matches using shared matches is the key to discovering your unknown ancestor.
  • No clear matches or clusters? It doesn’t matter how skilful you are or how hard you work, if the matches aren’t there then there are no clues to follow.
  • Test on as many platforms as possible to find matches, as this will give you more matches, and therefore more shared match clusters.
  • Distance from the ‘missing’ ancestor. This is a big one! The further you are from your unknown brick wall, the harder it is to identify the possible clusters and the less DNA the matches will share. Even if you can find an MRCA, it may be impossible to work out how you fit into their tree. Autosomal tests such as AncestryDNA, 23andMe, MyHeritageDNA, and FamilyTreeDNA’s Family Finder could help for up to five generations. If you can, try testing the closest descendant to the unknown ancestor, such as your mother or maternal uncle – rather than yourself – if you are looking for your unknown maternal great grandfather.
  • Multiple puzzles can ‘muddy the waters’: I have a great grandmother who was illegitimate and whose mother was also illegitimate! Finding unknown clusters isn’t necessarily the hardest thing – the hardest thing is working out which cluster goes with which missing father.
  • Check yourself for confirmation bias. Are you cherry-picking information that fits into your theories? For instance, a family story tells you that a Mr Southam is the missing great grandfather, so you will look only for matches with this surname.
  • Watch out for clusters showing pedigree collapse or endogamy. This can make it very difficult to unravel the clusters and then to find the correct branch or branches you are interested in.
  • Using ‘quick and dirty’ trees will unlock your clusters and help find the common ancestors.
  • Combining shared matches and traditional research could also help solve your puzzle.
  • Focus on the best type of test to get the results you’re looking for. It could be autosomal, but a Y-DNA test may help for a direct patrilineal unknown father who is more distant.

Solving a genealogical puzzle is usually a very long process!