Algorithmic fix allows ultra-accurate sequencing of rare cancers and genetic conditions

A virtually error-free new method of DNA sequencing could one day be used to diagnose extremely rare cancers and hereditary genetic conditions.

The approach – called error-correction code (ECC) sequencing – is based on an existing technology called fluorogenic sequencing that works by splitting DNA into fragments and copying these using a process similar to the polymerase chain reaction. Fluorescently labelled nucleotide bases are incorporated into the new DNA strands, and the signals from these are then used to determine the sequence.

‘In our previous work, we added one kind of fluorophore-labelled nucleotide in each reaction cycle, and got the sequence of DNA according to the fluorescence intensity,’ says Yanyi Huang of Peking University in China. ‘In this work, we added two nucleotides alternately in each cycle. There are three different ways to mix the four bases: AC/GT, AG/CT and AT/CG. So we sequenced the same DNA molecule using the three different mixing ways to provide extra information.’
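Huang's count of three mixing schemes follows from simple combinatorics: once A is paired with one of the other three bases, the remaining two bases form the second mix. A minimal Python sketch of that enumeration is given here; the function name `two_by_two_splits` is illustrative and not taken from the group's work.

```python
BASES = "ACGT"

def two_by_two_splits(bases=BASES):
    """Return every way to split four bases into two unordered pairs."""
    first = bases[0]
    splits = []
    # Fixing the first base avoids counting AC/GT and GT/AC as different splits.
    for partner in bases[1:]:
        pair = first + partner
        rest = "".join(b for b in bases if b not in pair)
        splits.append((pair, rest))
    return splits

print(two_by_two_splits())
```

Run as written, this lists the same three schemes Huang names: AC/GT, AG/CT and AT/CG.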

This method essentially produces three sets of results for each DNA fragment sequenced, which a specially designed algorithm can combine to identify and fix any errors, and deduce the unambiguous sequence. The group were able to adapt a commercially available fluorogenic sequencing machine – which is normally about 98% accurate – to incorporate their method, and found it could generate completely error-free sequences up to 200 base pairs long.
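To see why three runs carry redundant information, consider a deliberately simplified picture in which each run tells you, for every base, only which of the two mixes it belongs to. Any two of those answers pin down the base, so the third acts as a consistency check. The Python sketch below illustrates that idea only; it is not the group's algorithm, the names `encode`, `check` and `decode_read` are hypothetical, and a real instrument records fluorescence intensities per cycle rather than clean bits per base.

```python
# The three mixing schemes named in the article.
SCHEMES = (("AC", "GT"), ("AG", "CT"), ("AT", "CG"))

def encode(base):
    """One bit per scheme: 0 if the base is in the first mix, 1 otherwise."""
    return tuple(0 if base in first else 1 for first, _second in SCHEMES)

# Only four of the eight possible bit triples correspond to a real base.
CODEBOOK = {encode(b): b for b in "ACGT"}

def check(bits):
    """Return the base if the three bits agree with each other, else None."""
    return CODEBOOK.get(tuple(bits))

def decode_read(triples):
    """Decode a read, marking positions where the three runs disagree."""
    return "".join(check(t) or "?" for t in triples)

clean = [encode(b) for b in "GATTACA"]
noisy = list(clean)
noisy[2] = (0, 1, 0)  # simulate one wrong call in the AC/GT run

print(decode_read(clean))  # GATTACA
print(decode_read(noisy))  # GA?TACA
```

In this toy encoding every valid triple contains an even number of 1s, so a single wrong call in any one run produces an invalid triple and is flagged. The sketch only detects such errors; how the published algorithm goes on to fix them is not described in this article.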

‘The instrument was just a lab prototype and not very user-friendly,’ Huang says. ‘We are now optimising both the chemistry and the instrumentation to make it more practical.’ He adds that such accurate sequencing would have an advantage over existing approaches for some applications, for example identifying rare mutations in the DNA of cancer cells, particularly where different parts of a single tumour contain subtle genetic differences. ‘We are working on further developing the ECC technology into a high-throughput sequencer that produces massive, highly accurate DNA sequences,’ he says.

Keith Robison, a computational biologist at US drug discovery company Warp Drive Bio who blogs about DNA sequencing, says the work is an ‘interesting twist on existing systems’. ‘The big plus is getting the error rate down to nearly 0 for the first 200 bases […] for diagnostic uses that could be valuable,’ he tells Chemistry World. He also says it looks like the system could be run on existing hardware with only ‘modest’ modifications.

But he also points out that the approach may not always give completely error-free results. Accuracy could be compromised in some circumstances, for example in regions of certain genes known to contain the same two nucleotides repeated over and over, which could make the signals harder to interpret.
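One plausible reading of this concern, which the article itself does not spell out, is that a two-base repeat can be extended in a single cycle when both of its bases are in the same mix, producing one large signal instead of many small ones. The Python sketch below assumes that each cycle extends the strand through every consecutive base in the current mix and that the signal scales with the number of bases added; the function `run_lengths` is hypothetical.

```python
def run_lengths(seq, mix_a, mix_b):
    """Bases added per cycle when mix_a and mix_b are supplied alternately.

    Assumption: a cycle keeps extending until it meets a base that is
    not in the current mix.
    """
    if any(base not in mix_a + mix_b for base in seq):
        raise ValueError("sequence contains a base outside both mixes")
    runs = []
    current, other = mix_a, mix_b
    i = 0
    while i < len(seq):
        n = 0
        while i < len(seq) and seq[i] in current:
            n += 1
            i += 1
        runs.append(n)
        current, other = other, current  # switch to the other mix
    return runs

repeat = "ACACACAC"
print(run_lengths(repeat, "AC", "GT"))  # [8]
print(run_lengths(repeat, "AG", "CT"))  # [1, 1, 1, 1, 1, 1, 1, 1]
```

Under these assumptions an AC repeat shows up as a single eight-base signal in the AC/GT run, and telling eight bases from seven or nine by intensity alone would be harder than telling one from two.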

Whether the method will be a commercial success depends on how it compares to other, potentially faster, approaches for real-world applications, Robison says. ‘It will be interesting first to see what an overall process time would look like for this. Can they compete in the heating up rapid sequencing market? When is this better error rate really going to be a good marketing edge?’

Like Huang, Robison thinks cancer diagnosis may be a promising application. ‘What they describe is quite adequate for [preserved cancer biopsies] where getting the DNA back out inherently shears it into small fragments. [This] is where high accuracy is perhaps most important,’ he says.