Machine learning models complex physical systems with sparse datasets

Researchers from the US have devised a machine learning method that can predict the behaviour of complex physical systems that current computational methods cannot represent and that are unsuitable for high-throughput screening.

Machine learning involves constructing algorithms that learn from large amounts of data, training a computer to predict desired properties. However, this is impractical for modelling complex materials, such as cement and chemical formulations, because of the huge number of intricate interactions that dictate their properties.

Schematic of the machine learning method

Source: Royal Society of Chemistry

Now, scientists led by Newell Washburn at Carnegie Mellon University have devised a new way to make predictions using much smaller datasets. The idea stemmed from their experiences researching dispersants and hydrogels. ‘We’ve developed an intuitive feel for how they work and got interested in trying to map this intuition onto a machine learning framework,’ explains Washburn. ‘In a way, the model we developed represents how we wish we thought about our research.’ While purely statistical methods often require exhaustive computational analysis based on hundreds or thousands of compounds, the team combined physical and statistical modelling into a single machine learning method to simplify the process.

The method’s first test area involved polymeric dispersants for flowing particle suspensions. Washburn’s team fed the method with experimental data from just 10 model polymer systems with different functional groups. The method reduced the experimental data to a set of single-physics interactions and used statistical modelling to probe how these parameters affected the properties of the suspensions. It then proposed a novel dispersant whose performance equalled that of the leading commercial material despite having a significantly different composition. Experimental results for the synthesised dispersant agreed well with its predicted properties.
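To give a rough feel for the general idea of fitting a statistical model to a handful of physics-inspired descriptors, here is a minimal Python sketch. It is not the team's method: the three descriptors, the ten rows of values, the linear rule used to generate the response and the helper names `solve`, `fit_ridge` and `predict` are all hypothetical, and ridge regression is simply a common small-data technique swapped in for illustration.

```python
# Illustrative sketch only: every descriptor, number and function name here is
# hypothetical and is not taken from the published work.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ridge(X, y, lam=1e-3):
    """Least-squares fit with an L2 penalty lam on the slopes (intercept unpenalised).

    Returns the weights as [intercept, slope_1, ..., slope_k].
    """
    rows = [[1.0] + list(r) for r in X]  # prepend a constant column for the intercept
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    for i in range(1, p):
        xtx[i][i] += lam
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(w, x):
    """Evaluate the fitted linear model for one descriptor vector x."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


# Ten hypothetical model polymers, each summarised by three made-up descriptors
# standing in for single-physics terms (say adsorption, steric and electrostatic).
X = [
    [0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 1.0],
    [0.5, 0.5, 0.5], [0.2, 0.8, 0.4],
]

# Synthetic response generated from a known linear rule, NOT measured data.
true_w = [2.0, 1.0, -3.0, 0.5]
y = [predict(true_w, x) for x in X]

w = fit_ridge(X, y, lam=1e-3)
candidate = [0.3, 0.9, 0.6]  # hypothetical new dispersant

print("fitted weights:", [round(v, 3) for v in w])
print("predicted response for candidate:", round(predict(w, candidate), 3))
```

With only ten data points, the small penalty `lam` keeps the fit stable; the trade-off is a slight shrinkage of the slopes towards zero. In practice the descriptors would have to come from physical modelling of the system, which this sketch does not attempt.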

Tanja Junkers, an expert in polymer reaction design from the University of Hasselt in Belgium, is impressed with the work. ‘The approach of Washburn and co-workers – to combine pure statistical with physical modelling – is a very elegant way to speed up research in material design. I think we will see many researchers following this path in various areas of material design already in the near future.’

Looking ahead, Washburn says they are ‘working to put the model on a firmer theoretical footing as a tool for molecular design but also reaching out to chemical and materials companies to see if it can be used to solve some of their toughest technology problems and finding an enthusiastic response so far.’ The team is now examining even more complex systems, like inks for 3D bioprinting, where they simultaneously model chemical, formulation and processing variables.