Realistic costs and diverse suggestions make Chematica more insightful
The developers of Chematica, a computer program that can predict organic synthesis routes, have significantly improved the way it chooses from the options it discovers to present users with better and more diverse results. The speed at which the software makes these comparisons has also greatly improved, making the software more appealing as a planning and investigative tool.
Chematica performs retrosynthetic analysis: given a target compound, it suggests how it could have been made by reacting simpler components, how those in turn could have been made, and so on. Repeating the process eventually leads to the kinds of starting materials you can order from a catalogue. Backtracking from those reagents to the target molecule gives a synthesis route. The software looks for all of the possible alternatives at each retrosynthetic step, so rather than discovering a single synthesis, it builds a network of routes.
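To make the idea of a route network concrete, here is a minimal Python sketch of recursive retrosynthetic enumeration. It is not Chematica's code: the molecule labels, the RETRO_RULES table and the CATALOGUE set are hypothetical toy data, and the sketch assumes the rule table contains no cycles.

```python
from itertools import product

# Hypothetical toy data: each molecule maps to alternative sets of precursors.
RETRO_RULES = {
    "target": [("A", "B"), ("C",)],
    "A": [("s1", "s2")],
    "B": [("s3",)],
    "C": [("s1", "s3"), ("B", "s2")],
}
# Hypothetical purchasable starting materials.
CATALOGUE = {"s1", "s2", "s3"}


def enumerate_routes(molecule):
    """Return every route to `molecule` as a nested tuple (molecule, sub_routes).

    A purchasable molecule is a route on its own. A molecule with no rule and
    no catalogue entry is a dead end and yields no routes.
    """
    if molecule in CATALOGUE:
        return [(molecule, ())]
    routes = []
    for precursors in RETRO_RULES.get(molecule, []):
        # One list of alternative routes per precursor.
        alternatives = [enumerate_routes(p) for p in precursors]
        # Every combination of precursor routes is a distinct route.
        for combo in product(*alternatives):
            routes.append((molecule, combo))
    return routes


routes = enumerate_routes("target")
print(len(routes), "routes found")
```

Even this four-rule toy network produces several routes to one target, which hints at how a realistic rule set can yield thousands.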
The wealth of alternatives proved to be an embarrassment of riches. ‘In the beginning we were very happy when it found one route. And then it started finding not one, but a thousand, or a hundred thousand routes. Saying “look, they’re all correct” doesn’t really solve a practical problem,’ comments Bartosz Grzybowski, the program’s inventor, from the Polish Academy of Sciences and Ulsan National Institute of Science and Technology, South Korea. Instead, Chematica ranks the possible routes by the end cost of the target molecule, and can show a variety of alternatives rather than just the cheapest routes, which may be variations on a theme. Now, Grzybowski’s team has significantly improved how Chematica makes these selections.
Accounting for the imperfect yields of reactions had the biggest impact, implicitly penalising linear synthesis pathways and favouring convergent ones where molecules are assembled in chunks. ‘For the examples shown, the approach taken by Chematica does not differ too much from the retrosynthetic analysis made by a trained organic chemist,’ says Mariola Tortosa, a researcher in organic synthesis at the Autonomous University of Madrid, Spain. ‘One feature that I found particularly attractive and impressive is the fact that the program can identify the optimal timing to use the most expensive reagents.’
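The effect of imperfect yields can be illustrated with a small Python sketch. The cost model here is an assumption made for illustration, not the scoring function Chematica uses: the cost of a product is taken as the cost of its inputs plus a per-step cost, divided by the fractional yield. All prices and yields are invented.

```python
def cost_per_unit(node):
    """Cost of one unit of `node` under a simple, assumed yield model.

    `node` is either a number (purchase price of a starting material) or a
    tuple (yield_fraction, step_cost, inputs) describing one reaction step.
    """
    if isinstance(node, (int, float)):
        return float(node)
    yield_fraction, step_cost, inputs = node
    spent = step_cost + sum(cost_per_unit(i) for i in inputs)
    # Losses at this step inflate the cost of everything made before it.
    return spent / yield_fraction


Y = 0.8          # assumed yield of every step
BLOCK = 10.0     # assumed price of each building block

# Linear: blocks are added one at a time, ((A + B) + C) + D.
linear = (Y, 0.0, [(Y, 0.0, [(Y, 0.0, [BLOCK, BLOCK]), BLOCK]), BLOCK])
# Convergent: two halves are made separately, then joined, (A + B) + (C + D).
convergent = (Y, 0.0, [(Y, 0.0, [BLOCK, BLOCK]), (Y, 0.0, [BLOCK, BLOCK])])

# Timing of an expensive reagent (price 100) in a two-step linear route.
expensive_early = (Y, 0.0, [(Y, 0.0, [100.0, BLOCK]), BLOCK])
expensive_late = (Y, 0.0, [(Y, 0.0, [BLOCK, BLOCK]), 100.0])

print("linear:", cost_per_unit(linear))
print("convergent:", cost_per_unit(convergent))
print("expensive reagent early:", cost_per_unit(expensive_early))
print("expensive reagent late:", cost_per_unit(expensive_late))
```

Both the linear and the convergent route use three steps and the same four blocks, yet under this model the convergent route is cheaper because no material passes through all three lossy steps. For the same reason, introducing the expensive reagent in the last step costs less than introducing it in the first.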
Chematica’s definition of chemical novelty has also been improved. The code can vary its shortlist by penalising the use of similar reactions in consecutive suggestions, but it previously had a narrow idea of what made two reactions alike. In the new version, two reactions count as similar if they make the same product and share a reagent with more than four carbon atoms, a broader and more realistic definition. The team tested this by having Chematica develop syntheses for trans-whisky lactone, and found that the software gave three very different approaches.
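The similarity rule described above is simple enough to sketch in Python. Only the rule itself (same product, shared reagent with more than four carbons) comes from the article. The Reaction class, the greedy pick_diverse selection, the penalty value and the reagent labels are hypothetical, and reagents are matched by name rather than by structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    name: str
    product: str
    reagents: tuple  # pairs of (reagent_label, carbon_count)
    cost: float


def are_similar(r1, r2, carbon_threshold=4):
    """True if both reactions make the same product and share a reagent
    with more than `carbon_threshold` carbon atoms."""
    if r1.product != r2.product:
        return False
    big1 = {label for label, carbons in r1.reagents if carbons > carbon_threshold}
    big2 = {label for label, carbons in r2.reagents if carbons > carbon_threshold}
    return bool(big1 & big2)


def pick_diverse(candidates, k, penalty=10.0):
    """Greedy shortlist (an assumed scheme): each pick minimises cost plus a
    penalty for every already-chosen reaction it resembles."""
    chosen = []
    remaining = list(candidates)
    while remaining and len(chosen) < k:
        def score(r):
            return r.cost + penalty * sum(are_similar(r, c) for c in chosen)
        best = min(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical candidates that all make the same product "P".
r1 = Reaction("r1", "P", (("frag_C6", 6), ("frag_C2", 2)), 10.0)
r2 = Reaction("r2", "P", (("frag_C6", 6), ("frag_C3", 3)), 11.0)
r3 = Reaction("r3", "P", (("frag_C4", 4), ("frag_C7", 7)), 15.0)

shortlist = pick_diverse([r1, r2, r3], k=2)
print([r.name for r in shortlist])
```

Without the penalty, the two cheapest candidates would both rely on the same six-carbon fragment. With it, the second pick switches to a reaction built from different large reagents.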
The software is also significantly faster than its previous version, which in severe cases could take thousands of seconds to rank the syntheses and select the best. ‘Who is willing to wait three hours to do this kind of selection? It was quite slow,’ says Grzybowski. Tomasz Badowski, a mathematician specialising in algorithm theory who worked on the project, designed superior algorithms to cut the search time to less than a second, a four-order-of-magnitude improvement.
The software’s combination of speed, realism, and variety should allow organic chemists to approach the synthesis problem in new ways. ‘It gives you a realistic route to basically slide the dial and say, right, I care about yield, or no, actually I care about steps, or I care about the cost,’ says Lee Cronin, who investigates automation in chemistry at the University of Glasgow, UK. ‘I think that research like this is going to put the practical chemist back in control of their time, so they can spend it doing more fascinating chemistry and making bigger molecules.’ The sentiment is shared by Tortosa: ‘As tools like this keep improving, we won’t need to worry about things that are somehow repetitive and we will have more time to focus on creativity, on understanding fundamental mechanistic aspects and to search for original synthetic transformations. As an academic this is a very important factor.’
In 2017, Merck KGaA of Darmstadt, Germany, bought the company set up to develop the software, and it has since released a commercial version named Synthia. These improvements to Chematica will be included in a future release.
This article is open access
T Badowski, K Molga and B A Grzybowski, Chem. Sci., 2019, DOI: 10.1039/c8sc05611k