Mapping out drug discovery routes with artificial intelligence
As the Covid-19 pandemic sweeps the world, PostEra is aiming to combine the collective wisdom of the world’s chemists with the company’s artificial intelligence (AI) algorithms to tackle the novel coronavirus.
‘We realised there were hundreds of chemists sitting at home, with their projects on hold,’ says chief scientific officer Alpha Lee. So PostEra set up a non-profit ‘moonshot’ project to harness those resources.
The work builds on a rapid and collaborative global effort to find new drugs against Covid-19. In January, Chinese scientists determined the crystal structure of the virus’s main protease – an enzyme critical to viral replication that has no known counterpart in human cells, making it a good drug target. In February and March, researchers at the Diamond synchrotron facility in the UK started screening for molecular fragments that bind to this protease. As of 26 March, they had found 66.
Then PostEra stepped in, asking the global community of medicinal chemists to design possible drug candidates by stitching these fragments together into theoretical compounds. As of 7 April, 400 contributors had submitted more than 3500 suggestions. As the submissions came in, the company used its algorithms to work out which of these compounds would be easiest to make, most likely to work, and least likely to be toxic. This produced an initial list of 250 compounds, along with recipes for their synthesis. PostEra has sent this information to contract manufacturer Enamine in Kiev, Ukraine, to create them. On 8 April, the first 100 arrived at the University of Oxford, UK, where researchers will test how well they bind to the protease in the lab.
The project employs PostEra’s Synthesis Laboratory tool, which designs synthetic routes. The tool’s Molecular Transformer algorithm has been trained to understand chemistry using a library of 9 million reactions published in US patents. The algorithm uses SMILES notation to convert structural information about a compound, along with some 3D information about stereochemistry, into a text string, which can then be fed into the language-based translation algorithm.
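To make the idea of a molecule as text concrete, here is a minimal sketch of how a SMILES string might be split into tokens before being handed to a language model. It is an illustration only, not PostEra’s code: the regular expression is a pattern commonly used for SMILES tokenisation, the function name `tokenize` is hypothetical, and L-alanine is chosen simply because its SMILES string carries a stereochemistry marker (`@@`).

```python
import re
from typing import List

# Assumed pattern: bracket atoms first, then two-letter halogens,
# single-letter atoms, bonds, branches, ring closures and reaction arrows.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)


def tokenize(smiles: str) -> List[str]:
    """Split a SMILES string into tokens (hypothetical helper)."""
    tokens = SMILES_TOKEN.findall(smiles)
    # Reject input containing characters the pattern does not cover.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognised characters in SMILES: {smiles!r}")
    return tokens


# L-alanine: the bracket atom [C@@H] encodes the stereocentre.
L_ALANINE = "N[C@@H](C)C(=O)O"

if __name__ == "__main__":
    print(" ".join(tokenize(L_ALANINE)))
```

Each token then plays the role of a word, so predicting the product of a reaction can be framed as translating one sequence of tokens into another.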
In August 2019 the team reported that the algorithm outperformed both a set of 11 human chemists from the Massachusetts Institute of Technology, US, and other computer models, at predicting the outcome of 80 random reactions. The best person in the set was 76.5% accurate, while the model was 87.5% accurate. Importantly, the algorithm also estimates its own uncertainty with 89% accuracy. Those measures of uncertainty, says Lee, are useful for figuring out what experiments would create the most informative data to plug back into the model to improve its results.
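One simple way to picture how uncertainty could guide the choice of experiments is to rank predictions by confidence and test the least certain ones first. The sketch below assumes, purely for illustration, that a prediction’s confidence is the product of the probabilities the model assigned to each output token; the article does not say how the company computes it. The function names, reaction labels and probabilities are all made up.

```python
import math
from typing import Dict, List


def sequence_confidence(token_probs: List[float]) -> float:
    """Assumed confidence score: product of per-token probabilities."""
    return math.prod(token_probs)


def least_confident(predictions: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k predictions with the lowest confidence, lowest first."""
    ranked = sorted(
        predictions,
        key=lambda name: sequence_confidence(predictions[name]),
    )
    return ranked[:k]


# Made-up per-token probabilities for three hypothetical predictions.
predictions = {
    "reaction_A": [0.99, 0.98, 0.99],
    "reaction_B": [0.90, 0.60, 0.70],
    "reaction_C": [0.95, 0.85, 0.90],
}

if __name__ == "__main__":
    # The least certain predictions are the ones most worth checking in the lab.
    print(least_confident(predictions, k=2))
```

Running the selected reactions in the lab and feeding the outcomes back into training is the loop Lee describes.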
Since then, the team has boosted the algorithm’s efficiency and adapted it to focus on the reverse scenario: ‘given what you want to make, how do you make it?’ says Lee. ‘Most AI approaches to drug discovery focus on answering the question of what compounds to make, but miss how to make those compounds,’ Lee continues. ‘We provide our clients the solution of not just what compounds to make but the synthetic routes for how to make them.’
Lee and chief executive Aaron Morris first met while studying mathematics at Oxford in the early 2010s; they were debating club partners. ‘We were the only two science kids trying to do public speaking,’ laughs Morris. ‘We tell ourselves that we were good.’ Lee moved to the University of Cambridge to start a research group, which Matt Robinson joined as a student. By then, Morris was an investment banker at Goldman Sachs. Together the three decided to commercialise the work being done in Lee’s lab in late 2019, and co-founded PostEra in October, with Robinson as chief technology officer.
Given that the company was only founded a few months before the pandemic hit, they haven’t had time to work on many projects. They did one project with the American Chemical Society, helping to benchmark various models of bioactivity prediction. That work was completed at the start of 2020, and is now going through review for publication, says Morris. Their bread and butter, he adds, is small biotech companies struggling to synthesise desired compounds.
The non-profit Covid-19-focused project has been, the team says, ‘quite unexpectedly’ successful. ‘We expected around 50–100 submissions,’ says Lee. ‘It’s really been a rollercoaster.’
Sorting through the first batch of 2000 submissions took the AI around 48 hours; the team estimates it would have taken humans about 3 weeks to do the same work. The next batch of submissions focuses on fragments that bind through stronger, covalent bonds to the protease. A covalent attachment, says Lee, ‘gunks it up permanently. But the flip side is it’s really reactive,’ which could spur unwanted side-effects.
‘Right now, we are getting experimental data,’ says Lee, referring to the 100 compounds made by Enamine and delivered to Oxford. ‘Once we get that, we can decide which ones to double-down on. Time is of the essence,’ he says. ‘Our goal is to reach pre-clinical stage in 6–8 months.’
PostEra has a handful of other projects on the go, though many have been put on the back burner during the pandemic, says Morris. One of them, he notes, is concerned with developing a drug aiming to help with substance abuse. ‘We’re also actively looking to add members to the team,’ says Lee.
Date of founding: October 2019
Location: Originally London, UK; now Santa Clara, US
Number of employees: 3 (all co-founders)
Origin in a nutshell: Spin-out from the University of Cambridge, UK
Funding to date: $2.5 million (£2 million) from US seed accelerator Y Combinator, venture capital funds, and angel investors including the former head of data at Facebook