An artificial intelligence performing experiments using a synthesis robot has fished out what may be the most generic conditions for cross-coupling reactions from a pool of thousands of possible combinations. The AI-derived conditions more than doubled the average yield in 20 tricky cross couplings compared with benchmark conditions.
Reaction conditions that work for compounds with different shapes, sizes and functional groups are ‘critical for automating small molecule synthesis, which in turn is critical for democratising molecular innovation’, says study leader Martin Burke from the University of Illinois Urbana–Champaign, US. In 2009, Burke’s team created a version of the Suzuki–Miyaura cross coupling that combines chloroarenes and N-methyliminodiacetic acid (Mida) boronates.
The team then turned to AI to adapt the reaction so it would work for the broadest possible substrate range. They asked it to scour the literature and find the most generic reaction conditions out of an almost infinite number of catalyst, ligand, temperature and base combinations. Yet rather than finding new, general conditions, all the algorithm would ‘discover’ were the conditions that are already the most popular. One of the reasons, Burke explains, is that ‘the literature is remarkably deficient in negative data. No one publishes what doesn’t work.’ But algorithms need both good and bad examples to learn from.
Now, the teams of Bartosz Grzybowski at the Polish Academy of Sciences and Alán Aspuru-Guzik at the University of Toronto, Canada, along with Burke and colleagues, have combined AI and robotic experimentation to tackle the generality problem without biased literature data. First, clustering analysis selected 22 coupling partners that would represent the chemical space of 5400 commercial aryl halides and 54 Mida boronates. Then, the team let the AI loose on the data. ‘We made zero decisions once we had it set up,’ says Burke.
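The idea of picking a handful of substrates to stand in for a much larger chemical space can be illustrated with a toy example. The sketch below is not the study's method: it assumes each molecule has already been reduced to a numeric descriptor vector (the two-dimensional points are made up), uses plain k-means as a stand-in for whatever clustering analysis the team ran, and the function name `pick_representatives` is hypothetical.

```python
import math


def pick_representatives(points, k, iterations=20):
    """Cluster descriptor vectors with plain k-means, then return the
    indices of the real points closest to each cluster centre."""
    # Farthest-point initialisation keeps this sketch deterministic.
    centres = [points[0]]
    while len(centres) < k:
        centres.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centres))
        )

    for _ in range(iterations):
        # Assign every point to its nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centres[i]))
            clusters[nearest].append(p)
        # Move each centre to the mean of its cluster (keep it if empty).
        centres = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centres[i]
            for i, cluster in enumerate(clusters)
        ]

    # A centre is an average, not a purchasable compound, so report the
    # nearest real point instead.
    return sorted(
        {min(range(len(points)), key=lambda j: math.dist(points[j], c))
         for c in centres}
    )


# Made-up 2D descriptors forming three loose groups of three "molecules".
toy_descriptors = [
    (0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
    (5.0, 5.0), (5.2, 4.9), (4.9, 5.3),
    (10.0, 0.0), (10.1, 0.2), (9.8, 0.1),
]
representatives = pick_representatives(toy_descriptors, k=3)
print(representatives)
```

In this toy setting, each returned index points to one real molecule per group, which is the role the 22 selected coupling partners play for the full commercial set.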
Over several months, the AI went through five rounds of designing experiments, running them on the synthesis robot and evaluating the outcomes. ‘The algorithm was set up not necessarily to try to just find the best conditions, it was trying to minimise uncertainty,’ Burke explains. For this, the AI ran a surprisingly large number of low-yielding reactions – around 100.
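To see why an algorithm that minimises uncertainty ends up running many unpromising reactions, consider the minimal sketch below. It is an assumption-laden illustration, not the algorithm used in the study: uncertainty is approximated by the standard error of the measured yields, the three conditions and their yields are invented, and `simulated_robot` is a hypothetical stand-in for the synthesis robot.

```python
import random
import statistics


def uncertainty(yields):
    """Standard error of the mean yield. Conditions with fewer than two
    measurements count as maximally uncertain."""
    if len(yields) < 2:
        return float("inf")
    return statistics.stdev(yields) / len(yields) ** 0.5


def closed_loop(conditions, run_experiment, budget):
    """Repeatedly run the condition whose mean yield is least well known,
    regardless of whether that yield looks high or low."""
    results = {c: [] for c in conditions}
    for _ in range(budget):
        target = max(conditions, key=lambda c: uncertainty(results[c]))
        results[target].append(run_experiment(target))
    return results


# Invented (true mean yield %, noise) pairs for three hypothetical conditions.
HYPOTHETICAL_CONDITIONS = {
    "condition A": (20.0, 2.0),
    "condition B": (45.0, 8.0),
    "condition C": (5.0, 4.0),
}
rng = random.Random(0)  # fixed seed so the sketch is repeatable


def simulated_robot(condition):
    """Stand-in for the robot: returns a noisy yield clamped to 0-100%."""
    mean, noise = HYPOTHETICAL_CONDITIONS[condition]
    return min(100.0, max(0.0, rng.gauss(mean, noise)))


results = closed_loop(list(HYPOTHETICAL_CONDITIONS), simulated_robot, budget=12)
for condition, yields in results.items():
    print(condition, len(yields), "runs")
```

Because the loop chases uncertainty rather than yield, a poor but noisy condition can receive as many runs as a good one, which produces exactly the kind of negative data the literature lacks.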
The conditions the algorithm came up with don’t seem to be vastly different from those in Burke’s 2009 paper: 100°C instead of 60°C, sodium instead of potassium carbonate and a slightly different phosphine ligand for the palladium catalyst. Yet in 20 heteroarene cross couplings, the average yield more than doubled – from 21% to 46%. The AI’s speed and precision ‘was very humbling and kind of inspiring, but also shocking and disturbing at the same time’, Burke says.
‘This is great for parallel medicinal chemistry, for example in high throughput labs where they use automation for making libraries of compounds,’ says Cayetana Zárate, who works on process development at Janssen Pharmaceuticals. However, she points out that the chemical space the team looked at lacked molecules with pharmaceutically relevant substituents such as halogens and aliphatic amines.
If a similar AI could optimise conditions for one particular reaction, it could hugely accelerate labour-intensive process development, Zárate says. After all, the Suzuki–Miyaura cross coupling is ‘probably one of the reactions where we invest most time in screening’.
Burke says he now wants their AI setup to look for molecules that have specific functions, such as solar cell materials. ‘In the field of chemistry, we tend to focus on structure. To me, what is so exciting about this closed-loop engine, [is] it’s an opportunity for us to pivot towards a function-first approach.’
NH Angello et al, Science, 2022, 378, 399 (DOI: 10.1126/science.adc8743)