Palladium-catalysed C–N cross-couplings are ubiquitous in research and industry, and often require specific investigations to determine the most effective conditions. Now, cheat sheets created from a meta-analysis of the entire chemical literature on Buchwald–Hartwig reactions could allow users to bypass some of those laborious experiments. The team behind the work has also produced a suite of tools to recommend reagents and optimum reaction conditions depending on the desired transformation.

‘We developed computer code to carry out all the steps automatically, as for the amount of data we had, a manual curation and analysis would have been unfeasible. We think this kind of purely digital approach is rare in chemistry,’ comments Martin Fitzner, from Roche in Basel, Switzerland, who led the work. His team harvested over 62,000 unique reactions from online databases such as the Chemical Abstracts Service (CAS), Reaxys and the US Patent and Trademark Office.

‘The current state-of-the-art in synthesis to obtain good reaction conditions is basically screening through a large set,’ says Fitzner. ‘One of the aims of our work is to augment this approach by providing a more sound basis for what kind of conditions should be considered for which substrates in a manner that uses the knowledge implicitly stored in the databases.’

Meta-analysis (a statistical analysis combining the results of multiple scientific studies) is a prevalent concept in fields such as epidemiology and medicine, but has not been extensively explored within synthetic chemistry. Current approaches to synthesis tend to focus on a small number of reported reactions, which limits the diversity of reagents and conditions, and generally disregard less successful experiments, which are known to be valuable assets for data-driven approaches.

‘This work clearly shows that the anthropological choices made by chemists and the holes in how we record data – there’s a lack of failed reactions and systematic comparison of relevant substrates across methods – make it hard to predict specific solutions to specific problems,’ says Spencer Dreher, from Merck’s chemistry capabilities and screening group in the US. ‘It strongly suggests that unbiased generation of complete, systematic data sets using high-throughput experimentation is required to make headway in reaction prediction.’

‘I believe that it is absolutely essential for the chemistry community to move in this direction; to look more deeply into data and to maximise its worth,’ comments Cara Brocklehurst, who leads the synthesis and technologies group at Novartis, also in Basel. ‘One important area where I see great potential is in pharmaceutical research where time is very much of the essence. Being able to rapidly and efficiently build new chemical entities will ultimately enable us to get drugs to patients as fast as possible.’