Forensic DNA phenotyping predicts people’s appearance and reveals their ancestry, finds Andy Extance, but has some significant challenges to overcome

DNA evidence was initially no help in finding who raped and murdered 19-year-old Milica van Doorn on Saturday, June 7, 1992. She disappeared going home from a party in Zaandam, in the Netherlands, with a pastor finding her body floating in a nearby pond the following Monday. No water was in her lungs, meaning she hadn’t drowned. Investigators found a man’s pubic hair and semen inside her body.

While forensic scientists first used DNA from semen to identify a rapist in 1987, in that case police already suspected him. They could compare the suspect’s DNA with their forensic sample. In van Doorn’s case, the police didn’t have a suspect. Comparing the crime scene DNA with their forensic DNA database also produced no match. So the case remained unsolved until, 25 years later, forensic scientists could exploit how DNA shapes who we are.

Face fragment

Source: © Pierluigi Longo @ Heart Agency

In the 1990s, forensic scientists mainly focused on characteristic short tandem repeat (STR) sections of human DNA that vary a lot between people. This technique still dominates forensic DNA analysis. STR-based DNA profiling can confidently identify an individual based on a match between a sample known to be theirs and one from a crime scene. However, STR sequences usually don’t tell investigators anything else about the suspect, so DNA profiling is no use if investigators have no known person’s sample to compare with crime scene samples. Therefore, scientists have turned to forensic DNA phenotyping (FDP).

FDP can predict some external appearance traits. Eye, skin, and hair colour are easiest to identify, explains Susan Walsh from Indiana University–Purdue University Indianapolis, US. ‘They were the nicest traits to begin with,’ Walsh says. They’re associated with genes that control how we make pigment molecules known as melanins across many types of tissue, she adds.

Researchers and police can use FDP to generate pictures of potential suspects, and also to narrow down their ancestry. Forensic scientists have used it successfully to solve various cases, including Milica van Doorn’s. Giving investigators DNA evidence akin to a highly reliable description is clearly a potentially powerful tool. But that power could be alarming, especially when it leads to discrimination based on skin colour or ancestry.

However, forensic scientists are yet to comprehensively exploit its power for several reasons. Because it reaches across more of our genome, FDP often needs DNA technology that is currently much more expensive to use than STR profiling, limiting how widely it is used. Also, FDP’s main current applications do not yet achieve the full potential for predicting appearance. That relates to the final reason why FDP is not as powerful as it might seem: unlike DNA profiling, it cannot identify an individual with enough certainty to present in court yet. Researchers are therefore seeking to deal with these shortcomings.

Massive improvement

Modern next-generation sequencing technology, also known as massively parallel sequencing (MPS), enables large-scale FDP. Tunde Huszar from the University of Strathclyde, UK, calls it ‘a zoomed-in approach to genetic variation’. Historic STR-based approaches rely on commercial kits containing primer sequences that recognise STR regions and ready them to be copied by the polymerase chain reaction (PCR). PCR scales up DNA to amounts where researchers can measure them by capillary electrophoresis (CE), which Huszar explains separates STRs by length.
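The idea that STR variants differ in length can be illustrated with a minimal Python sketch. It assumes a hypothetical four-base repeat motif and two made-up sequences, and it counts repeats by reading the sequence directly, whereas CE infers them from fragment length. The function name and the example alleles are illustrative only.

```python
def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    step = len(motif)
    for start in range(len(sequence)):
        count = 0
        pos = start
        # Walk forward one motif at a time while the repeat continues
        while sequence.startswith(motif, pos):
            count += 1
            pos += step
        best = max(best, count)
    return best


# Hypothetical alleles: identical flanking sequence, different repeat counts
allele_a = "TTCC" + "GATA" * 7 + "CAGG"
allele_b = "TTCC" + "GATA" * 11 + "CAGG"

repeats_a = longest_repeat_run(allele_a, "GATA")
repeats_b = longest_repeat_run(allele_b, "GATA")
# The extra repeats make allele_b a longer fragment than allele_a
length_difference = len(allele_b) - len(allele_a)
```

In this toy case, the two alleles differ only in how many times the motif repeats, which is why separating fragments by length is enough to tell them apart.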

By contrast, MPS technology can directly ‘read the building blocks of the DNA’, Huszar explains. Many forensic MPS sequencers come from US biotech Illumina. They use sequencing by synthesis (SBS), denaturing DNA into many small, single-stranded pieces and trapping them in a transparent glass flow cell, Huszar adds. SBS copies DNA pieces one nucleotide base building block at a time, but the copies have fluorescently tagged, colour-coded nucleotides. Each of the four different possible bases shows up as a different colour in the flow cell, which the sequencers read and compile.

FDP still involves kits containing primers for PCR amplification. CE can only measure genetic markers, DNA sequences with known locations on chromosomes, from around 40 regions at once, Huszar says. MPS can measure over 10,000, although forensic applications only target hundreds of markers. That allows scientists to get much more genetic information from a small DNA sample, she adds. ‘This is important because forensic samples are precious.’

MPS can also power genome-wide association studies (GWAS) identifying genetic variations linked to identifiable traits, explains Kelly Elkins from Towson University in Maryland, US. GWAS often look at thousands of people’s genomes, comparing those with a particular trait, such as blonde hair, with those without it. Many differences found are single-nucleotide polymorphisms (SNPs, pronounced ‘snips’), where just one base swaps for another. Usually one SNP ‘does very little by itself, but several SNPs together can determine traits,’ Elkins says.
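The comparison at the heart of a GWAS can be sketched for a single SNP as follows. This is a simplified illustration, not the method any of the researchers quoted here use: it assumes invented allele counts and applies a basic chi-square test to a two-by-two table. Real studies repeat such a test across very many SNPs and must correct for that.

```python
import math


def allele_chi_square(trait_alt, trait_ref, no_trait_alt, no_trait_ref):
    """Chi-square statistic (1 degree of freedom) for a 2x2 allele count table."""
    total = trait_alt + trait_ref + no_trait_alt + no_trait_ref
    row_trait = trait_alt + trait_ref
    row_no_trait = no_trait_alt + no_trait_ref
    col_alt = trait_alt + no_trait_alt
    col_ref = trait_ref + no_trait_ref
    denominator = row_trait * row_no_trait * col_alt * col_ref
    if denominator == 0:
        raise ValueError("every row and column needs at least one allele")
    difference = trait_alt * no_trait_ref - trait_ref * no_trait_alt
    return total * difference ** 2 / denominator


def p_value_one_df(chi_square):
    """Upper-tail probability of a chi-square value with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))


# Invented counts: the alternative allele is more common in people with the trait
chi_square = allele_chi_square(trait_alt=60, trait_ref=40,
                               no_trait_alt=40, no_trait_ref=60)
p_value = p_value_one_df(chi_square)
```

A small p-value suggests the allele is more common in one group than chance would explain, but as Elkins notes, a single SNP usually says little on its own.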

Hair, eye and skin colour are the main FDP traits currently determined from DNA on the 22 numbered human chromosomes, also known as autosomes. Biogeographical ancestry is another common FDP trait. Beyond autosomal DNA, Elkins adds, DNA from our mitochondria, organelles in our cells that make chemical energy, provides specific details of a person’s maternal ancestry. Likewise, Y-chromosomal DNA, which males carry, provides paternal ancestry information. Usually it’s best to combine autosomal, Y-chromosomal and mitochondrial ancestry testing.

Kitting up

Manufacturing kits is relatively straightforward for companies that supply them, adds Walsh. ‘Where the work comes is what they put in there,’ she notes. Teams like hers produce statistical models that translate DNA sequence information into probabilities of specific traits, identifying key genes of interest kits need to amplify. Walsh and Kayser’s team developed pigmentation prediction models for HIrisPlex kits phenotyping eye, hair and skin colour and made them available online, for example. ‘If they don’t have those variants included, then what’s the point in making commercial forensic kits?’ she says. ‘They can’t do anything with them.’

In the Milica van Doorn case, Dutch police called on biogeographical ancestry analysis when they decided to use FDP in 2017. Using the semen they had collected, they performed only Y-chromosome STR analysis. According to Manfred Kayser from the Erasmus University Medical Centre in Rotterdam, the Netherlands, most Y-chromosomes don’t have a geographical signature for a specific group of countries. Yet Kayser, who advised the police, reveals that they were lucky, because in this case they could show where the suspect’s ancestors came from. In Zaandam ‘the largest group from these countries are the Turks’, he notes. In November 2017 the police asked 133 Turkish-background men in Zaandam to provide DNA for Y-chromosome testing, of whom 126 agreed. The police identified a distant relative of the perpetrator, and tracked him down from there. Yet despite this impressive success, FDP’s use in such cases remains unusual – something Kayser and colleagues would like to change.

In advancing FDP, GWAS made scientists realise that appearance traits are far more genetically complex than previously assumed, Kayser says. Because some genes have larger effects, relatively simple DNA tests looking at a few SNPs can predict pigmentation traits like eye, hair and skin colour. Such tests can predict blue and brown eye colour from just six SNPs. While a single gene determines red hair, tests need dozens of SNPs to predict it reliably. Other appearance traits involve hundreds of genes with only small effects, Kayser notes.
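How a handful of SNPs might be turned into trait probabilities can be sketched with a small logistic-style model. This is an assumption-laden illustration, not the published prediction model: the six SNPs are unnamed placeholders, and every intercept and weight below is invented. It shows only the general shape of such a calculation.

```python
import math

# Invented parameters for illustration only, NOT published model values.
# Each weight multiplies the count (0, 1 or 2) of a chosen allele at one SNP.
HYPOTHETICAL_MODEL = {
    "blue": {"intercept": -2.0, "weights": [2.5, 0.4, 0.3, 0.2, 0.2, 0.1]},
    "brown": {"intercept": 0.0, "weights": [0.0] * 6},  # reference category
}


def predict_probabilities(allele_counts, model=HYPOTHETICAL_MODEL):
    """Convert per-SNP allele counts into a probability for each category."""
    if any(count not in (0, 1, 2) for count in allele_counts):
        raise ValueError("each allele count must be 0, 1 or 2")
    scores = {}
    for category, params in model.items():
        if len(params["weights"]) != len(allele_counts):
            raise ValueError("one weight is needed per SNP")
        weighted = sum(w * c for w, c in zip(params["weights"], allele_counts))
        scores[category] = params["intercept"] + weighted
    # Softmax: subtract the largest score first for numerical stability
    top = max(scores.values())
    exponentials = {cat: math.exp(s - top) for cat, s in scores.items()}
    total = sum(exponentials.values())
    return {cat: value / total for cat, value in exponentials.items()}


# Two hypothetical people: one with two copies of every chosen allele, one with none
many_copies = predict_probabilities([2, 2, 2, 2, 2, 2])
no_copies = predict_probabilities([0, 0, 0, 0, 0, 0])
```

The output is a set of probabilities rather than a yes-or-no answer. A large weight on one SNP in this sketch mirrors the point that some genes have larger effects than others.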

Within the EU-funded Visage project Kayser and colleagues found new predictors for appearance and ancestry. These include eye, hair, skin and eyebrow colour, freckles, hair shape and male hair loss. They could also use MPS to detect epigenetic changes that add methyl groups to eight marker regions to estimate age. However, these tools have not yet been commercialised. ‘The reason may be the market size,’ says Kayser.

Detail – at a cost?

Market size is limited partly because MPS can be more expensive than CE-based STR profiling. Huszar highlights that the MiSeq FGx, made by Verogen, an Illumina spinout that, like its parent, is based in San Diego, is a widely used MPS forensic genomics system. It offers DNA sequencing detail at a cost in terms of time and money. Forensic scientists must create a library of DNA fragments that will bind to the flow cell. After this step, the sequencing process can take from one to three days, Huszar says, with data analysis also taking a long time. However, that disadvantage can be offset by combining different samples in one analysis run. ‘It’s allowing you to get more data out of less material,’ Huszar says.

Investigators usually still opt for classic CE forensic analysis, Huszar admits. However, if they invest in MPS-based analysis, they can combine STR identification with phenotyping. Sometimes, the only evidence is a limited amount of DNA, and police are not confident that CE analysis will match any suspects. Then, ‘they can make a decision to go for all the possible extra information,’ Huszar says.

Generated faces

Source: © 2023 American Academy of Forensic Sciences

Facial likenesses can be generated from DNA evidence, but their use is still limited

In early 2023, Elkins and her team published findings that could circumvent limited interest in commercialising new kits. They extended what existing forensic DNA profiling kits can do, finding several more externally visible characteristics that the kits could predict. Exploring published GWAS, they found 15 SNP loci in Verogen kits that could predict traits including hair greying, shape and thickness, and pattern balding. They also predicted freckles, height, eyebrow thickness, obesity, vitiligo and propensity to tan. The Towson team used the predictions to generate images with MetaHuman software often used in computer games. However, Elkins says that the ‘associations are pretty weak so using multiple loci for predictions and using care in interpretation is going to be important’.

Traits like hair greying could help address one shortcoming of existing FDP appearance predictions. ‘[An] outward characteristic may look very different on somebody who’s 30, as opposed to somebody who’s 50,’ Elkins explains. Similarly, body mass index-related SNPs can offer predictive information about potential weight ranges, but lifestyle factors like diet and exercise can mitigate their effects. SNP-based predictions could be valuable for creating sketches but must be used cautiously, Elkins warns.

FDP today can only be used to help investigations, rather than as evidence that definitively identifies an individual who has committed a crime. The only possible way to use FDP to identify an individual is by predicting their detailed facial appearance from DNA with high accuracy and reliability, Kayser explains. ‘That would be the game changer,’ he says. ‘Can we do this? No. Are we close to this? No. Why not? Because the face is a very complex trait.’ Kayser and others have found that hundreds of genes are involved in facial appearance, but their effect is so small that many thousands are likely to be involved.

Even Walsh, whose team works on predicting face shape from genetics, admits ‘we really don’t understand it yet’. She notes that for pigmentation, several genes work together to produce the effect. With facial structures, many different genes have small effects at different times in our lives, mostly before we’re born. ‘It will be much more difficult to do prediction,’ Walsh says. She explains that while they can identify genes associated with facial shape from GWAS, understanding their functional role requires lab work examining cells and tissues, which is time-consuming. Only a few genes are well understood, and many still lack a clear function. Yet making FDP more specific would be welcome, as when its findings are too general, they could lead to discrimination.

Race at heart

Filipa Queirós from the University of Coimbra, Portugal, feels that there are many problems with FDP. One typical example is a sexual assault case in Edmonton, Canada, from 2019, where police turned to FDP as a last resort. They used it to produce a composite FDP sketch of a black man that included very few specifics about his appearance. Queirós argues that it provided little more detail than the victim’s statement that her assailant was a black man. Critics in Edmonton said that it implicated too broad a proportion of the community. The case illustrates that it’s important that police not be seduced by the idea of DNA ‘as a truth machine’, Queirós says. She also suggests that some forensic geneticists deny that race is a scientific problem in FDP, while the technology silently perpetuates the concept of race.

‘Genetic dragnets’, where police ask a certain population for genetic samples, like Turkish men in Zaandam in the Milica van Doorn case, can also be problematic. A contentious example that Queirós highlights came from the UK before FDP was well established. Delroy Grant is suspected to have committed over 100 offences between 1990 and 2009, including rapes of elderly women. To try to track him down, in 2004 the Metropolitan Police collected DNA samples from many people with known ancestral links to the Caribbean, including its own officers. The police put pressure on those who refused to cooperate, but ultimately they prosecuted and convicted Grant without using DNA evidence. Queirós underlines that this problem arises when FDP technology is used speculatively in an attempt ‘to generate new leads and to maximise existing resources’.

Ultimately, Queirós accepts FDP’s technical potential. However, she stresses that it must be carefully regulated, and that those who use it must realise its challenges. ‘The race issue, it’s not avoidable,’ she says. Yet it can be mitigated, she adds. ‘We can discuss it more to have less impact on those communities that already are subjected to surveillance and police attention.’

Elkins is among those trying to improve education about ethics in FDP. In her work, she emphasises the need for policies and procedures to prevent infractions and misconduct in crime labs. She explains that training, supervision and adequate resources are crucial to ensure valid and reliable results. Guidelines can help prevent past mistakes and avoid biased outcomes. ‘It is essential to recognise that skin colour does not determine criminality,’ Elkins stresses.

Andy Extance is a science writer based in Exeter, UK