What does AI mean for chemistry?

Phil Ball looks at whether letting machines do our thinking for us will change our understanding of chemistry itself

Several years ago I shared a taxi ride with an eminent astrophysicist, and we got chatting about the future of AI. Jobs are gradually going to become expendable, he said. AI will replace taxi drivers, doctors, teachers, poets, you name it (with the recent advent of ChatGPT, some now think it will eventually replace writers too).

And astrophysicists, I asked? Oh no, not astrophysicists, he replied. What they do can’t be automated.

I don’t doubt that he was being a little wry, but the instinct among many scientists is that human intuition is too central to the way they work for AI to make them obsolete in the foreseeable future. And in truth, today’s AI, based largely on machine-learning algorithms that can mine huge data sets for patterns and correlations, seems best regarded as an assistant to, rather than a replacement for, the human researcher. It can do an awful lot, especially when coupled to robotic systems (see The robots revolutionising chemistry): not just analyse data but plan and execute experiments, make iterative improvements and even formulate and test specific hypotheses. Little of this is yet routine in the laboratory, but it is becoming ever more so.

In some ways, chemistry is ripe for AI colonisation. A great deal of chemical synthesis makes use of tried-and-tested methods and synthetic pathways that even those conducting them find dull and repetitive. Microfluidic technology enables much of the wet chemistry to be performed robotically. Even planning a retrosynthetic strategy can be rather formulaic and thereby suited to an algorithmic approach. Already there exist efforts to make automated synthetic systems that allow one to ‘dial in’ any molecule one chooses. Many analytical techniques, such as crystallography, are also being automated.

But it’s too simplistic to frame these developments in terms of the ‘demise of the human chemist’. At least in the near term, it’s more likely that the roles and the skill sets of chemists will shift. Eliminating the burden of repetitive tasks is hardly to be lamented, since it might free up chemists to think creatively rather than to slog routinely.

It might be more edifying to ask whether AI might change chemistry conceptually. Historically, changes of methodology were accompanied by changes in the intellectual framework of the discipline. Improvements in analytical chemistry in the 18th century enabled a focus on questions of composition: what were the elemental substances of chemistry, and how were they constituted in different compounds? From the mid-19th century, the focus shifted to considerations of molecular structure, brought about in part by improvements in the ability to separate and purify closely related compounds. The advent of quantum theory, as well as of techniques such as spectroscopy, led in the early 20th century to an emerging understanding of chemical bonding. Towards the end of that century, molecular beams and ultrafast lasers made it possible to study reaction dynamics in detail. As AI becomes increasingly a tool in the chemist’s toolbox, how might it shift the questions chemists ask?

What a tool

As the role of AI expands ever more in the laboratory, a question widely asked is simply whether chemists will need the practical skills they have traditionally been taught. If synthesis is automated using microfluidic and other methods, do you need to know how to use a pipette or how to titrate?

Zachary Baum, a scientific content engineer at the American Chemical Society’s informatics division CAS (formerly the Chemical Abstracts Service), thinks you will. Multistep microfluidics is improving, he says, including methods for purifying samples at each automated step. So ‘flow chemistry may become more routine for researchers’. But scaling up such methods towards a pilot-plant operation will need to be done manually, at least at present. ‘I don’t see the practical skills that give synthetic chemistry its charm becoming any less important,’ he says. ‘We will continue having graduate students toiling away on silica columns and trying to get their distillations to work.’

Anatole von Lilienfeld of the University of Toronto in Canada, a specialist in using machine learning to predict chemical behaviour, agrees that ‘machine learning will not replace but rather assist the chemist of the future’. He thinks of it as the ‘fourth pillar of science, after experiments, theory and computation’. As such, it is merely a tool for gaining a better understanding and control of chemical processes and properties.

But new tools don’t just expand the ways of conducting and studying chemical processes; they may alter how we think about them. ‘Every new tool in the toolbox has resulted in a shift of focus for chemists,’ says Felix Strieth-Kalthoff, a postdoctoral researcher with computational scientist Alán Aspuru-Guzik at Toronto.

‘Take the development of NMR spectroscopy,’ he says. ‘Its evolution into a robust, routine technique has enabled the characterisation of increasingly complex molecules, which has allowed researchers to shift their attention towards them and their reactions.’ And once the polymerase chain reaction made it easy to amplify DNA sequences for analysis, biologists and biochemists were able to focus on higher-level tasks, like figuring out what the sequences mean. Through such shifts, techniques once central to chemical practice, such as titration, may become relegated to an educational purpose, as they get replaced by modern automated techniques. ‘I believe that AI tools can enable a similar paradigm shift, with chemists having more time and capacity to focus on the complex, higher-level design task,’ says Strieth-Kalthoff.

[Figure: reaction scheme. AI can explore a reaction space (altering groups X and Y as well as the ligand, base and solvent) faster and potentially better than a human. Source: © Benjamin J Shields et al/Springer Nature Limited 2021]

All the same he is convinced that ‘chemists will stay at the forefront of conceptualising, abstracting and directing research problems, given how vast and diverse the entirety of chemistry is, and how little of it we have explored to date.’

Cheminformatics expert Jean-Louis Reymond of the University of Bern in Switzerland agrees. ‘My view is that AI should indeed match or surpass chemists’ expertise in the long term, but there is still a long way to go,’ he says. ‘And when that day comes AI will remain an enriching tool for the chemist, not a replacement, simply because deciding to which problem to apply an AI, and when and how to implement AI solutions, requires a human expert.’

As AI and automation expand the possibilities for exploring synthetic strategies, however, chemists might become more systematic in evaluating and optimising them. For example, last year a team led by Abigail Doyle at Princeton University showed that an AI algorithm was able to outperform human judgement in optimising reaction conditions for a palladium-catalysed cross-coupling reaction between organic molecules (a variant of the well-known Suzuki reaction). The algorithm uses an approach called Bayesian optimisation, in which expectations about the best solution are constantly updated by new data – in this case, supplied by an automated high-throughput system that explores reaction conditions. The AI system was able to identify optimal reaction conditions that were substantially different from those commonly used previously.
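The idea of expectations being updated by each new measurement can be sketched in a few lines of Python. This is a minimal illustration, not the method of the Princeton study: it uses Thompson sampling with independent normal beliefs over five made-up reaction conditions, a simple Bayesian scheme standing in for a full Bayesian optimisation with a surrogate model. The condition labels, the yields in `TRUE_MEAN_YIELD`, the noise level and all function names are assumptions invented for the example.

```python
import random

# Invented "true" mean yields (%) for five hypothetical reaction conditions.
# In a real campaign these are unknown; here they only drive the simulation.
TRUE_MEAN_YIELD = {"A": 20.0, "B": 35.0, "C": 50.0, "D": 62.0, "E": 90.0}
NOISE_SD = 5.0  # assumed measurement noise, in percentage points


def run_experiment(condition, rng):
    """Stand-in for the automated platform: return a noisy simulated yield."""
    return rng.gauss(TRUE_MEAN_YIELD[condition], NOISE_SD)


def update(prior_mean, prior_var, observation, noise_var):
    """Normal-normal Bayesian update of the belief about one mean yield."""
    post_var = 1.0 / (1.0 / prior_var + 1.0 / noise_var)
    post_mean = post_var * (prior_mean / prior_var + observation / noise_var)
    return post_mean, post_var


def optimise(n_rounds=60, seed=0):
    rng = random.Random(seed)
    # Start from the same vague belief about every condition: 50% +/- 30.
    beliefs = {c: (50.0, 30.0 ** 2) for c in TRUE_MEAN_YIELD}
    counts = {c: 0 for c in TRUE_MEAN_YIELD}
    for _ in range(n_rounds):
        # Thompson sampling: draw one plausible yield per condition from the
        # current beliefs, then test the condition with the highest draw.
        draws = {c: rng.gauss(m, v ** 0.5) for c, (m, v) in beliefs.items()}
        chosen = max(draws, key=draws.get)
        measured = run_experiment(chosen, rng)
        beliefs[chosen] = update(*beliefs[chosen], measured, NOISE_SD ** 2)
        counts[chosen] += 1
    best = max(beliefs, key=lambda c: beliefs[c][0])
    return best, beliefs, counts


if __name__ == "__main__":
    best, beliefs, counts = optimise()
    print("best condition:", best)
    print("experiments per condition:", counts)
```

In this toy setting the loop spends most of its experimental budget on the conditions that look promising and only a few runs on the rest, which is the behaviour that makes such methods attractive when each experiment is costly.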

How will AI carve up nature?

What kinds of heuristic understanding might emerge from such techniques? Chemistry has always been reliant on distilling complex and sometimes confusing data into concepts that offer intuitive rules of thumb for chemical reasoning: concepts such as bond types and order, atomic radii, electronegativity scales and oxidation states. But if the discipline becomes ever more dependent on number-crunching of raw data, will such notions continue to be useful? Lilienfeld thinks so. He says that such heuristic concepts are ways of reducing high-dimensional data to low-dimensional parameters – there are typically several ingredients, for example, that go into calculating a single number representing electronegativity. Lilienfeld feels that AI techniques should be able to reproduce such parameters, as well as quantifying their limitations.
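One familiar case of several ingredients being collapsed into a single number is Mulliken’s definition of electronegativity, which averages an atom’s first ionisation energy and its electron affinity. The sketch below is an illustration chosen for this purpose rather than anything taken from Lilienfeld’s work; the atomic data are approximate literature values in electronvolts, and the names `ATOMIC_DATA` and `mulliken_electronegativity` are made up for the example.

```python
# Approximate literature values in eV:
# (first ionisation energy, electron affinity).
ATOMIC_DATA = {
    "H": (13.598, 0.754),
    "Li": (5.392, 0.618),
    "C": (11.260, 1.262),
    "O": (13.618, 1.461),
    "F": (17.423, 3.401),
    "Cl": (12.968, 3.613),
}


def mulliken_electronegativity(symbol):
    """Reduce two measured quantities to one number (Mulliken's average)."""
    ionisation_energy, electron_affinity = ATOMIC_DATA[symbol]
    return (ionisation_energy + electron_affinity) / 2.0


# Rank the elements from most to least electronegative on this scale.
ranking = sorted(ATOMIC_DATA, key=mulliken_electronegativity, reverse=True)

if __name__ == "__main__":
    for symbol in ranking:
        print(f"{symbol:>2}: {mulliken_electronegativity(symbol):.2f} eV")
```

Two measured quantities per element become one, which is convenient for reasoning but discards information. On this simple average hydrogen comes out above carbon, unlike on the more familiar Pauling scale, a reminder that such low-dimensional parameters have limits of the kind that could in principle be quantified.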

[Figure: periodic table. An element’s ‘replaceability’ in certain compounds can be calculated – but what does it tell us?]

But these methods might also identify new parameters and metrics: new ways to reduce the data. Lilienfeld cites the concept of the ‘Pettifor number’, proposed by theoretical chemist David Pettifor in 1984 to characterise every element in the periodic table according to the structures it forms in binary compounds ABₙ. Roughly speaking, elements with the same Pettifor number should be able to substitute for one another in such binary phases. The classification scheme, deduced by hand from a data set of just 574 compounds, might be considered a precursor to those now produced by machine learning. But AI approaches are refining the notion of Pettifor numbers as a way of grouping elements – and in that way, broadening the chemist’s roster of useful heuristics.

Another example is a classification of so-called elpasolite crystal structures (with stoichiometry ABC₂D₆) containing main-group elements. Lilienfeld and colleagues used machine learning to look for trends in the bonding and formation energies of these structures – leading them to identify some unusual cases such as one in which aluminium was assigned a negative oxidation state. In such ways, Lilienfeld thinks, AI might help chemists explore whether or not the traditional heuristics of the field truly ‘carve nature at its joints’, or need refining and modifying to better align with what the data show.

‘Perhaps AI approaches of the future, by opening the black box connecting input with output, will enrich the set of possibilities for understanding chemical phenomena,’ says Guillermo Restrepo of the Max Planck Institute for Mathematics in the Sciences in Leipzig, Germany. But he points out that, so far, AI techniques have not brought to light any overlooked categories of, say, reaction classes or functional groups. Might it be that we have already identified most of the useful ‘coarse-grained’ descriptions we need for chemistry – or might it just show that we tend to gravitate towards already familiar territory?

Categorisations made with AI from raw data might also help us decide whether the categories chemists currently use are ‘natural’ – objectively reflected in the physical world – or are more a matter of historical and cultural contingency. ‘AI could settle whether some concepts reflect natural categories in the world,’ says philosopher of chemistry Vanessa Seifert of the University of Athens in Greece. Acidity might be a good example, she suggests: is it just the makeshift ‘persistence of a quotidian concept’ or does it have a deeper validity as a fundamental facet of chemical behaviour?

Towards AI-assisted appreciation

If we are to extract new general insights, we might need to develop AI algorithms that are ‘explainable’: that don’t just spit out numbers but are able to provide some qualitative justification for their conclusions. That is increasingly a goal pursued in AI more generally, since it tends to be what clients demand. A doctor, say, wants to know from an AI diagnostic system not just what it thinks a patient is suffering from but how it reached that conclusion. AI may have to be capable of explaining itself if it is to be trusted.

One day, machines may generate hypotheses that they can explain to humans

‘The ideas of explainable AI and AI-based hypothesis generation have only recently reached chemistry,’ says Strieth-Kalthoff. For example, Aspuru-Guzik and co-workers have developed a machine-learning algorithm that can extract human-interpretable insights from big data sets in chemistry and physics. As well as recovering some known rules of thumb for controlling the solubilities and energy levels of organic molecules, such as the roles of heterocycles and electron-withdrawing groups for the latter, the algorithm offered some new ones.

Such approaches haven’t yet been adopted widely in chemistry, says Strieth-Kalthoff. ‘I am convinced that this will come,’ he adds, ‘but it will take some time.’ Once we have AI that can explain data in a familiar chemical context, says Baum, ‘there is a good chance that we can identify new patterns and build abstractions from them’.

To judge from experiences of using AI in other fields, such as music and game-playing, it seems possible that it might even broaden our appreciation of chemistry – for example, by identifying elegant new retrosynthetic stratagems that no human has spotted. Reymond says that his studies of AI for retrosynthesis have identified some previously unsuspected possibilities – but testing them in the lab is likely to be a very laborious process, as has always been the case for new synthetic strategies.

Baum points out that such inspiration has already flowed from the ground-breaking game-playing algorithm AlphaZero, developed by the company DeepMind, which can beat other state-of-the-art programs for chess, Go, and the Japanese board game shogi. The world chess champion Magnus Carlsen has attested that his game has been influenced by some of the moves displayed by AlphaZero.

‘I feel that the analogy to chess or Go is very fitting’, says Strieth-Kalthoff. If AI finds more successful strategies or concepts, we can not only learn from them but come to appreciate them aesthetically too. ‘In the long term’, he says, ‘I think that this will complement the current chemist’s skill set, and lead to a better general understanding of chemistry.’

Philip Ball is a science writer based in London, UK