A machine-learning algorithm has been developed by scientists in Japan to breathe new life into old molecules. Called BoundLess Objective-free eXploration, or Blox, it allows researchers to search chemical databases for molecules with the right properties to be repurposed. The team demonstrated the power of their technique by finding molecules that could work in solar cells from a database designed for drug discovery.
Chemical repurposing involves taking a molecule or material and finding an entirely new use for it. Suitable molecules for chemical repurposing tend to stand apart from the larger group when considering one property against another. These materials are said to be out-of-trend and can display previously undiscovered yet exceptional characteristics.
‘In public databases there are a lot of molecules, but each molecule’s properties are mostly unknown. These molecules have been synthesised for a particular purpose, for example drug development, so unrelated properties were not measured,’ explains Koji Tsuda of the Riken Centre for Advanced Intelligence, who led the development of Blox. ‘There are a lot of hidden treasures in databases.’
Drop the boundaries
However, out-of-trend materials are challenging to discover using existing methods. ‘Machine learning for materials discovery requires the underlying data to cover a large area of the desired property space, while existing chemical knowledge tends to rely on well-known materials,’ says James Cumby, whose work at the University of Edinburgh, UK, explores functional materials using database mining and quantum chemistry.
The way current machine learning discovery routes work is another obstacle to finding out-of-trend materials. Some require an appropriate optimisation target to be set in advance. Other techniques can be more random but only probe a fixed region of property space. Both create boundaries that the methods cannot search beyond.
‘In most studies about machine learning-based optimisation of materials, one needs to determine an objective function in advance,’ explains Tsuda, noting that a prime example would be binding affinity to a protein. To overcome this boundary and objective function limitation, Tsuda’s team conceived a way to parse molecules based on relative novelty.
‘The Blox method addresses one of the drawbacks of other commonly-used exploration algorithms targeting the chemical landscape: their reliance on human-inputted boundary conditions,’ comments Clemence Corminboeuf, who researches machine learning and computational chemistry at the Swiss Federal Institute of Technology in Lausanne (EPFL).
The Blox workflow includes a mathematical technique known as Stein discrepancy, described as ‘refreshingly unconventional’ by Ganna Gryn’ova, a computational chemist at Heidelberg University in Germany. The process begins by selecting materials randomly from a database and observing their properties by experiment or simulation. A machine learning prediction model is then applied to determine these properties more rapidly over a larger dataset and the materials are arranged by plotting one desired property against another. At this point, the Stein discrepancy comes into play, and it is what makes Blox so good at finding out-of-trend molecules.
‘To find a novel molecule, it is necessary to define a distance between a molecule at hand and the set of molecules in the database,’ explains Tsuda. The Stein discrepancy highlights molecules that are distant from the main grouping of the database, when plotted in property space. A greater distance means that one of the properties is likely to be unusual and the molecule out-of-trend.
Once an out-of-trend candidate is found it can then be assessed with experiments or simulations to confirm if it is truly an out-of-trend material and therefore suitable for chemical repurposing.
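The selection step described above can be illustrated with a short sketch. This is not the authors’ code: the Gaussian kernel, the bandwidth `h`, the choice of a uniform reference distribution over a rescaled two-property space, and all function names are assumptions made for illustration. Under those assumptions, a kernelised Stein discrepancy is computed for the already-evaluated molecules plus each candidate in turn, and the candidate that leaves the combined set looking most evenly spread (the lowest discrepancy) is treated as the most out-of-trend.

```python
import math

def stein_term(x, y, h):
    """Stein kernel for a Gaussian kernel when the reference density is uniform.

    With a uniform reference the score function is zero, so only the trace
    term of the Stein kernel remains: k(x, y) * (d / h^2 - |x - y|^2 / h^4).
    """
    d = len(x)
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2 * h ** 2)) * (d / h ** 2 - sq / h ** 4)

def discrepancy(points, h=0.2):
    """V-statistic estimate of the squared kernelised Stein discrepancy."""
    n = len(points)
    return sum(stein_term(p, q, h) for p in points for q in points) / n ** 2

def pick_out_of_trend(observed, candidates, h=0.2):
    """Return the index of the candidate giving the lowest discrepancy, plus all scores."""
    scores = [discrepancy(observed + [c], h) for c in candidates]
    best = min(range(len(candidates)), key=scores.__getitem__)
    return best, scores

# Hypothetical data: two properties per molecule, each rescaled to [0, 1].
observed = [(0.48, 0.50), (0.50, 0.52), (0.52, 0.49), (0.51, 0.51), (0.49, 0.48)]
candidates = [(0.50, 0.50), (0.95, 0.10)]  # one inside the cluster, one far away

best, scores = pick_out_of_trend(observed, candidates)
print(best, scores)
```

In a real search the candidate properties would come from the prediction model rather than being typed in, and the chosen molecule would then be checked by experiment or simulation before the next round.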
To show the utility of Blox, the team used it to probe 100,000 molecules from the Zinc database – a public database that researchers typically screen for bioactive compounds. Only this time, they were looking for molecules with a high degree of photoactivity. Blox was instructed to map the property space as the absorption wavelength for the first singlet excited state against oscillator strength. From this search, eight out-of-trend molecules that could be used in applications such as solar cells were found and their properties confirmed using density functional theory.
‘This is an enticing tool for exploring properties beyond common trends that will hopefully find its way to other chemical and materials science communities where out-of-trend systems are in demand,’ says Gryn’ova. Corminboeuf also says that Blox holds great promise, and that it would be ‘especially appealing for the design and discovery of materials, for which the final efficiency depends on multiple independent conditions, for example hole transport materials’.
This article is open access
K Terayama et al, Chem. Sci., 2020, DOI: 10.1039/d0sc00982b