Model predicts effect of mutations on sequences up to 1 million base pairs in length and is adept at tackling complex non-coding regions
Google’s new deep learning model can predict the effect of small changes to DNA sequences up to one million base pairs in length and is particularly good with non-coding DNA, which has proven especially difficult to understand. The artificial intelligence (AI) tool – called AlphaGenome – offers researchers a way to better understand the human genome and may help scientists develop treatments for disease.
Small variations in the human genome can have a big impact on a person’s health, causing genetic disorders like cystic fibrosis or certain cancers. Most changes occur in the genome’s non-coding regions, which make up 98% of the total DNA. These regions influence the expression of genes rather than coding for proteins, and alterations to them can have a range of biological effects, making their impact hard to predict.