Philip Ball considers the creation of a collective chemical brain, and what it might dream up

Bartosz Grzybowski of Northwestern University in Illinois, US – who has already established himself as one of our most inventive chemists – has unveiled a ‘chemo-informatic’ scheme, Chematica, that can stake a reasonable claim to being paradigm-changing. Grzybowski and his colleagues have spent years assembling the transformations that link chemical species into a vast network that codifies and organises the known pathways through chemical space. The nodes of the network – molecules, elements and chemical reactions – are linked together by connecting reactants to products via the nexus of a known reaction. The full network contains around 7 million compound nodes and about the same number of reaction nodes. Grzybowski calls it a ‘collective chemical brain’.

I predict a mixed reaction from chemists. On one hand, the potential value of such a tool for discovering improved or entirely new synthetic pathways is tremendous, and has already been illustrated by Grzybowski’s team. On the other hand, Chematica seems to imply that chemistry is indeed ‘just cookery’, as the old jibe puts it, and is now better orchestrated by computers than by chemists.

Grzybowski first described the network in 2005, when he was mostly concerned with its topological properties rather than with chemical insights. Like the internet or some social networks, this chemical network has ‘scale-free’ connectivity, meaning that the number of nodes with n links is proportional to n^(–a), where a is a constant. So the network is bound together by just a few very highly connected nodes: hubs that provide shortcuts between different parts of the network.
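To make the scale-free idea concrete, here is a minimal sketch of how one might tally the links per node in a network and estimate the exponent a from a straight-line fit on a log-log plot. The function names and the toy data are assumptions made for illustration; this is not the analysis Grzybowski's team performed, and a simple least-squares fit is only a rough estimator for real data.

```python
import math
from collections import Counter


def degree_counts(edges):
    """Return {n: number of nodes that have exactly n links}."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


def fit_exponent(counts):
    """Estimate a in count ~ n^(-a) by least squares on log(count) vs log(n)."""
    xs = [math.log(n) for n in counts]
    ys = [math.log(c) for c in counts.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return -slope  # the power law has slope -a on log-log axes


# Toy example: a single hub joined to three otherwise unconnected nodes.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
print(degree_counts(star))
```

In the toy star network, three nodes have one link and one node (the hub) has three, which is the kind of lopsided distribution that a scale-free network shows on a far larger scale.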

In a trio of new papers, the researchers have now started to put the network to use. In the first, they perform an automated trawl for new one-pot reactions that can replace existing multi-step syntheses.1 The advantages of such processes are obvious: there is no laborious separation and purification of products after each step, and so none of the yield losses that those operations bring. Identifying one-pot processes linking molecular nodes that hitherto lacked a direct connection means subjecting the relevant reactions to several filtering steps to check for compatibility – for example, checking that a water-solvated synthesis will not unintentionally hydrolyse functional groups. This filtering is painstaking in principle, but very quick once automated.
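The compatibility filtering can be pictured as a lookup against a table of known clashes between reaction conditions and functional groups. The sketch below is a deliberately crude illustration under stated assumptions: the `Step` class, the `CLASHES` table and its two entries are hypothetical, and Chematica's actual filters are certainly far more extensive.

```python
from dataclasses import dataclass

# Illustrative (condition, functional group) pairs that clash. Hypothetical
# and far from complete; a real filter would need a curated rule set.
CLASHES = {
    ("water", "acyl chloride"),  # water hydrolyses acyl chlorides
    ("strong base", "ester"),    # strong base saponifies esters
}


@dataclass(frozen=True)
class Step:
    name: str
    conditions: frozenset  # solvents and reagents present in this step
    groups: frozenset      # functional groups that must survive


def one_pot_conflicts(steps):
    """List every (condition, group) clash across all steps sharing one pot."""
    all_conditions = set().union(*(s.conditions for s in steps))
    all_groups = set().union(*(s.groups for s in steps))
    return sorted(
        (c, g) for c in all_conditions for g in all_groups if (c, g) in CLASHES
    )


steps = [
    Step("acylation", frozenset({"dichloromethane"}), frozenset({"acyl chloride"})),
    Step("aqueous coupling", frozenset({"water"}), frozenset({"amine"})),
]
print(one_pot_conflicts(steps))
```

Treating every condition as present throughout is a conservative simplification: it may reject sequences in which the sensitive group has already been consumed before the offending reagent is added.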

It is one thing to demonstrate that such one-pot syntheses are possible in principle, but Grzybowski and colleagues have ensured that at least some of those identified work in practice. Specifically, they looked for syntheses of quinoline-based molecules – common components of drugs and dyes – and thiophenes, which have useful electronic and optical properties. Many of the new pathways worked with high yields, in some cases demonstrably higher than those of the alternative multi-step syntheses. Of course, Chematica’s performance is only as good as the data from which it is built and so some false positives arise from errors in the literature.

Another use of Chematica is to optimise existing syntheses. Looking for improved – basically, cheaper – routes to a given target is a matter of stepping progressively backwards from that molecule to preceding intermediates.2 An algorithm can calculate the costs of all such steps in the network, searching to a specified ‘depth’ (maximum number of synthetic steps) for the cheapest option. Applied to syntheses conducted by Grzybowski’s company, ProChimia, Chematica offered potential savings of up to 45% for 51 of the company’s targets. The greater the number of targets, the greater the savings because of the economies of shared ingredients and intermediates.
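The backwards search can be sketched as a short recursion: at each compound, compare the price of buying it with the cost of making it from its precursors, and stop after a fixed number of steps. The function name, the data layout and the prices below are all assumptions made for illustration; the published algorithm also has to account for matters such as yields and shared intermediates, which this sketch ignores.

```python
import math


def cheapest_cost(target, prices, reactions, depth):
    """Minimum cost of obtaining `target` using at most `depth` synthetic steps.

    prices:    {compound: purchase price} for compounds that can be bought
    reactions: {product: [(reactants, step_cost), ...]} for known reactions
    Returns math.inf if no route exists within the depth limit.
    """
    best = prices.get(target, math.inf)  # option 1: simply buy it
    if depth == 0:
        return best
    for reactants, step_cost in reactions.get(target, []):
        # option 2: make it, paying for the step plus every reactant
        total = step_cost + sum(
            cheapest_cost(r, prices, reactions, depth - 1) for r in reactants
        )
        best = min(best, total)
    return best


# Hypothetical prices and reactions: T is made from I and B, I from A and B.
prices = {"A": 1.0, "B": 2.0, "I": 50.0, "T": 100.0}
reactions = {"T": [(("I", "B"), 5.0)], "I": [(("A", "B"), 3.0)]}
for depth in range(3):
    print(depth, cheapest_cost("T", prices, reactions, depth))
```

With these made-up numbers the cost of T falls from 100 (bought outright) to 57 with one step and to 13 with two, because the deeper search discovers that the expensive intermediate I can itself be made cheaply. Because each reactant is costed separately, the sketch does not capture the extra savings that arise when several targets share ingredients.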

Finally, and perhaps most controversially, the researchers show how Chematica can be used to identify threats of chemical weapons manufacture.3 The network can be searched for routes to harmful substances such as nerve agents using unregulated ingredients. Of course, it can also disclose such routes, but as with viral genomic data,4 open access should be the best antidote to the risks they pose.

Does this mean that synthetic organic chemistry can now be automated? The usual response is to insist that computers will never match human creativity. But that defence is looking increasingly under threat in chess and maths, for example, and perhaps even in music and visual art. In some ways, chemical synthesis is as rule-bound as music, if not chess, and thus ripe for algorithmic apprehension. Synthetic schemes designed by humans surely won’t become obsolete any time soon, but there seems no harm in acknowledging that the time may come when the art and creativity of chemistry resides more solidly in our decisions of what to make, and why, than in how we make it.

(Image credit: Wiley-VCH)