Artificial intelligence can speed up drug discovery, but collaboration is the key

The current Covid-19 crisis highlights the need to rapidly identify new treatments for emerging diseases. But realigning large-scale experimental efforts to changing circumstances cannot happen overnight. This pressing call for agility suggests we need to take a new approach to how we search for and identify molecular cures and therapies.

Historically, drug discovery has relied on extensive experimental effort to map out the relevant part of the chemical space. Even with the advent of automated high-throughput screening methods, designing an effective assay for new diseases, and running it at scale, can take months.

We can’t unlock the potential of AI without access to data

Emerging artificial intelligence (AI) technologies have the potential to accelerate and transform the search for molecular remedies, enabling rapid, large-scale identification of effective drug candidates. With this goal in mind, we started an initiative called AI Cures with the hope of lowering the barrier for people from varied backgrounds to get involved and contribute. By openly sharing data, critical analysis and methods, we hope to explore the many ways AI can help drug discovery. High scientific standards and open analysis can also mitigate the negative impact of possible false or premature announcements.

The simplest task for AI algorithms is in silico property prediction. The algorithms learn to forecast outcomes of experimental assays, such as whether cells treated with a given compound are able to withstand viral infection, allowing large libraries of compounds to be scanned quickly and cheaply for potent candidates. The AI tools do require initial data to learn from, but the amount of data needed can be substantially less than a comprehensive experimental survey. Indeed, the algorithms are naturally opportunistic, seeking consistent statistical patterns that relate molecular features with measured outcomes. The clearer the signal in the initial data, the less is needed to predict or effectively simulate experimental outcomes. In our prior work on antibiotics, for example, an initial screen of only two thousand compounds sufficed for identifying promising candidates from much larger libraries containing around 100 million compounds.
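
The idea of learning from a small screen and then scanning a larger library can be illustrated with a toy sketch. It is not the method used in the antibiotics work; it swaps in a much simpler similarity-weighted vote, and the feature names, compound names and outcomes below are all hypothetical. Each compound is represented as a set of structural features, and a candidate's score is the average of the measured outcomes in the screen, weighted by Tanimoto similarity to each screened compound.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def predict_activity(candidate, screened):
    """Similarity-weighted average of measured outcomes (1 = active, 0 = inactive)."""
    weights = [tanimoto(candidate, features) for features, _ in screened]
    total = sum(weights)
    if total == 0:
        # Nothing in the screen resembles this candidate: no evidence of activity.
        return 0.0
    weighted = sum(w * label for w, (_, label) in zip(weights, screened))
    return weighted / total


# Hypothetical initial screen: (structural features, measured outcome).
screened = [
    ({"ring_A", "amine", "halogen"}, 1),
    ({"ring_A", "amine"}, 1),
    ({"ring_B", "ester"}, 0),
    ({"ring_B", "ester", "halogen"}, 0),
]

# Hypothetical untested library to scan in silico.
library = {
    "cand_1": {"ring_A", "amine", "ester"},
    "cand_2": {"ring_B", "ester", "amine"},
    "cand_3": {"ring_C"},
}

# Rank the library from most to least promising.
ranked = sorted(
    library,
    key=lambda name: predict_activity(library[name], screened),
    reverse=True,
)
```

Here `ranked` places `cand_1` first because it shares the most features with the compounds that were active in the screen. The clearer that pattern is in the initial data, the fewer screened compounds are needed to rank the rest.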

AI tools can assist in the race to repurpose compounds that are already going through the clinical approval process. There are only on the order of 10,000 such compounds, some of them currently undergoing clinical trials, and much of the effort to screen and test their effectiveness against Covid-19 is already underway. The experimental outcomes for various viral inhibition screens do not agree very well with each other, however, leaving plenty of room for AI tools to reconcile differences in cell types, cell lines and experimental protocols.

Unfortunately, single-drug repurposing efforts may not yield an effective treatment. Many known viral therapies, like those for treating HIV, are based on combinations of drugs rather than individual compounds. There are multiple rationales for considering molecular cocktails; they may be used to increase potency (targeting different pathways and processes), mitigate side-effects or modulate the human immune response. To identify potent drug combinations, we would ideally have access to large-scale screens of infected cells simultaneously treated with multiple candidate compounds at different dosages. This search space is combinatorial, however, which makes exhaustive experimental screening impractical. This is where AI and machine learning tools can make a real difference. Beyond suggesting promising combinations from limited data, the algorithms can also indicate which combinations would be the most helpful to screen experimentally. Indeed, the roles of AI algorithms and experimental screening efforts are highly synergistic.
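
A rough calculation shows how quickly the combination space grows, and a small sketch shows one common way an algorithm might choose what to screen next. The figure of 10,000 compounds comes from the repurposing discussion above; the five dose levels, the drug names and the predicted probabilities are hypothetical, and uncertainty sampling is only one of several possible selection heuristics, not necessarily the one used by any particular group.

```python
from math import comb


def n_experiments(n_compounds, k, n_doses):
    """Distinct k-drug cocktails multiplied by the dose settings for each cocktail."""
    return comb(n_compounds, k) * n_doses ** k


def most_informative(predictions, budget):
    """Uncertainty sampling: pick the combinations whose predicted
    probability of activity is closest to 0.5."""
    ranked = sorted(predictions, key=lambda combo: abs(predictions[combo] - 0.5))
    return ranked[:budget]


# About 10,000 repurposing candidates, assuming 5 dose levels per drug.
pairs = n_experiments(10_000, 2, 5)    # 1,249,875,000 two-drug experiments
triples = n_experiments(10_000, 3, 5)  # far larger again for three-drug cocktails

# Hypothetical model outputs: predicted probability that a combination is active.
predictions = {
    ("drug_A", "drug_B"): 0.95,
    ("drug_A", "drug_C"): 0.52,
    ("drug_B", "drug_C"): 0.10,
    ("drug_B", "drug_D"): 0.40,
}

# With a budget of two experiments, screen the combinations the model is least sure about.
to_screen = most_informative(predictions, budget=2)
```

In this toy case `to_screen` contains the `drug_A`/`drug_C` and `drug_B`/`drug_D` pairs. The model is already fairly confident about the other two, so measuring them would teach it less.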

The algorithms do not replace medicinal chemists, but enable them to work more effectively

The search for molecular therapies is necessarily limited in the short term to existing drugs and their combinations. But the real potential in AI algorithms comes from exploring larger chemical spaces for de novo designs. Our group is one of several that are already developing and testing AI algorithms capable of automatically designing safe and potent compounds. The algorithms do not replace medicinal chemists but rather enable them to work more effectively to harness the vast untapped chemical possibilities currently beyond their reach. Once algorithms have identified a smaller set of promising candidates, medicinal chemists can use their experience and insights to analyse, modify and further guide the search for cures.

We can’t unlock the potential of AI without access to data, which plays a key role in both the development of innovative AI tools and the effectiveness of the resulting algorithms. However, at the time of writing this article, we are aware of only three Covid-19 screening libraries released fully into the public domain. The broader community of computational scientists is eagerly waiting for screening results from prominent publicly and philanthropically funded efforts such as the Covid-19 Therapeutics Accelerator or MassCPR. Meanwhile, we are working on developing algorithmic tools capable of operating with limited, heterogeneous data, including fragment screens and data from related viral species. The ability to extrapolate beyond the training data is paramount, both now and against future threats.

AI and machine learning tools have the potential to transform drug discovery in a manner similar to the way they have transformed other areas of science and engineering. But this requires a broader collaboration between computational, chemical and life sciences. It is exciting to see that a number of efforts similar to AI Cures are emerging around the world. Chemists can help by providing data, analysis or methods, and sharing them openly with others who are eager to take the work a step further.