A new computer program can tell how likely a chemical structure is to be right or pick the right isomer from a range of possibilities

UK chemists have developed a computer program that can work out how likely a chemical structure is to be correct, or identify the right structure from a range of possibilities. The method could help unravel the structures of complex molecules more accurately and save wasted effort in synthesis.

Assigning accurate structures to complex organic molecules, especially those isolated from natural sources, can be difficult. To improve structure determination, chemists have developed ever more powerful methods of predicting nuclear magnetic resonance (NMR) spectra to match against experimental data.

However, as Jonathan Goodman from the University of Cambridge explains, this isn’t always simple. ‘When you do a calculation you don’t expect to get exactly the same answer as the experiment, because of approximations in the calculation,’ he says. ‘So if, for example, you had a series of calculated spectra for different isomers of a structure, all of them are likely to more or less match the experimental spectrum, but they will all be a bit different to it, too.’ This makes it hard to determine the right structure with any confidence.
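To make the difficulty concrete, the sketch below compares one set of experimental chemical shifts with calculated shifts for three candidate isomers, using the mean absolute error. All of the numbers and names (`experimental`, `candidates`, `mean_abs_error`) are invented for illustration and are not taken from Goodman's work. The point is only that every candidate can land within a couple of ppm of the experiment, so the raw error on its own says little about how far to trust the best match.

```python
# Hypothetical 13C chemical shifts in ppm; all values are made up for illustration.
experimental = [170.2, 128.5, 77.1, 34.9, 21.3]

candidates = {
    "isomer_A": [171.0, 127.9, 76.0, 35.8, 22.0],
    "isomer_B": [169.1, 129.9, 78.6, 33.5, 20.1],
    "isomer_C": [172.4, 126.6, 75.2, 36.9, 23.0],
}


def mean_abs_error(calculated, observed):
    """Average absolute difference between calculated and observed shifts."""
    return sum(abs(c - o) for c, o in zip(calculated, observed)) / len(observed)


# Score every candidate against the single experimental dataset.
errors = {name: mean_abs_error(shifts, experimental)
          for name, shifts in candidates.items()}
best = min(errors, key=errors.get)

for name, err in sorted(errors.items(), key=lambda item: item[1]):
    print(f"{name}: mean absolute error {err:.2f} ppm")
print(f"closest match: {best}")
```

A ranking like this picks a winner but attaches no measure of how likely that winner is to be correct.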

Goodman and his student Steven Smith had previously developed an algorithm that could take two sets of experimental data and say which of two proposed structures matched each spectrum. ‘Here we’re working with a lot less information,’ says Goodman, adding that often only one experimental dataset is available, with tens or hundreds of possible structures.

As chemical structures get more complex, matching calculations with experiments gets more and more uncertain

The new program can solve that kind of problem, and also gives its answers a confidence rating, which previous programs have lacked. To tune the algorithm, the pair tested it on a wide range of molecules for which experimental data for several isomers are known. The results show that the confidence ratings are accurate, so a rating of 90 per cent means that, for that family of molecules, the assignment will be right nine times out of ten. They have also created an online applet that lets people feed their own calculations into the algorithm.
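One generic way to turn calculated-versus-experimental errors into a confidence rating is Bayes' theorem with a simple error model. The sketch below is an illustration under stated assumptions, not the published algorithm: the article does not describe the statistical model used, so this example assumes independent, zero-mean Gaussian errors with an assumed spread `sigma` and equal prior probability for every candidate. The function name `candidate_probabilities` and all shift values are hypothetical.

```python
import math


def candidate_probabilities(experimental, candidates, sigma=2.0):
    """Probability of each candidate given the experimental shifts.

    Assumes equal priors and independent, zero-mean Gaussian errors with
    standard deviation `sigma` (ppm). Both are illustrative assumptions.
    """
    log_likelihoods = {}
    for name, calculated in candidates.items():
        # The Gaussian normalising constant is the same for every candidate
        # and cancels on normalisation, so only the exponent is kept.
        log_likelihoods[name] = sum(
            -0.5 * ((c - o) / sigma) ** 2
            for c, o in zip(calculated, experimental)
        )
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_likelihoods.values())
    weights = {name: math.exp(ll - top) for name, ll in log_likelihoods.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical 13C shifts in ppm; values are made up for illustration.
experimental = [170.2, 128.5, 77.1, 34.9, 21.3]
candidates = {
    "isomer_A": [171.0, 127.9, 76.0, 35.8, 22.0],
    "isomer_B": [169.1, 129.9, 78.6, 33.5, 20.1],
    "isomer_C": [172.4, 126.6, 75.2, 36.9, 23.0],
}

probabilities = candidate_probabilities(experimental, candidates)
for name, p in sorted(probabilities.items(), key=lambda item: -item[1]):
    print(f"{name}: {p:.0%}")
```

Whether such percentages deserve the name "confidence" depends on calibration: they are only trustworthy if, across many molecules with known structures, assignments rated at around 90 per cent turn out to be right about nine times in ten, which is the kind of check the article describes.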

‘It’s an exciting development,’ says Scott Snyder from Columbia University in New York, US, who specialises in synthesising complex organic molecules. ‘Every tool you can have that can help to tell you correct structures is for the better.’ Snyder points particularly to cases where groups have spent several years and synthesised multiple isomers of molecules before finally hitting on the one that matches their target. ‘They’re a little beyond the level of what’s presented here,’ he says, but in the long term predicting such complex structures would be an invaluable tool.

Goodman adds that they are working to further improve the accuracy of the algorithm, and sees a wide variety of applications. As computing power increases, the calculation times should drop significantly from the couple of days currently needed, which would open the door for automatic data-checking systems. ‘One could imagine linking it to an electronic lab notebook, or trawling through structural databases like ChemSpider.’

Phillip Broadwith