Crowdsourced model offers new insight into our sense of smell

A computer algorithm has been created that can predict the aromas of molecules from their structural characteristics alone. Following a competition to create the best algorithm, a blend of the top entries approached the theoretical accuracy limit for predicting how testers would characterise the smell of a molecule.

The relationships between the frequencies of light and sound waves and, respectively, the colours perceived by the eye and the pitches discerned by the ear are well understood. However, the nature of our sense of smell remains mysterious. Near-identical molecules can sometimes be distinguished by testers, whereas molecules that are structurally completely different can smell the same. There is no reliable way to predict how humans will characterise the aroma of a particular molecule.

To assist in the search for a predictive algorithm, Andreas Keller and Leslie Vosshall of the Rockefeller University in New York organised a panel of 49 human testers. They were asked to score the pleasantness and intensity of 476 structurally diverse molecules with varied aromas or sometimes no aroma at all, as well as the suitability of 19 descriptors such as ‘fruit’, ‘sweaty’ and ‘decayed’.

They shared the testers’ perceptions of most – but not all – molecules with competition entrants, and asked them to devise algorithms to predict how the same testers characterised the remaining molecules. The researchers also shared information on all the molecules from a commercial dataset called Dragon. ‘Each molecule is described by around 4000 variables like how many hydrogens, how many atoms, the distance between the atoms and the polarity,’ explains systems biologist Pablo Meyer of IBM’s Thomas J Watson Research Center in New York.
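The set-up entrants faced can be illustrated with a minimal Python sketch. It is not any team's method: the molecule names, the three-value descriptor vectors and the scores are all invented for illustration, and the nearest-neighbour rule is a deliberately naive stand-in for a real model.

```python
import math

# Toy, made-up data (assumption): each shared molecule has a short descriptor
# vector and a panel-average score. The real descriptors number around 4000.
shared = {
    "mol_a": ([1.0, 0.0, 2.0], 60.0),
    "mol_b": ([0.9, 0.1, 2.1], 58.0),
    "mol_c": ([5.0, 3.0, 0.0], 10.0),
}

# Held-out molecules: descriptors are known, testers' scores are withheld.
held_out = {
    "mol_x": [1.1, 0.0, 1.9],
    "mol_y": [4.8, 3.2, 0.1],
}

def distance(u, v):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def predict_nearest(descriptors, training):
    """Return the score of the shared molecule closest in descriptor space."""
    nearest = min(training.values(), key=lambda item: distance(item[0], descriptors))
    return nearest[1]

predictions = {name: predict_nearest(vec, shared) for name, vec in held_out.items()}
print(predictions)
```

Because near-identical molecules can smell different, a baseline like this would be expected to do poorly on real data. It shows only the shape of the task: descriptors in, predicted scores out.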

The researchers judged the algorithms on how well they predicted the scores that both the population on average and each individual tester gave the remaining molecules for each descriptor. After the competition, teams pooled their efforts and shared features of each other’s models, together with further information about the molecules, to produce an even better model. As a measure of the maximum accuracy theoretically achievable, the researchers slipped in some sample molecules twice to find out how variable the testers’ scores themselves were. ‘Our model is as good as the variations that exist within the data,’ says Meyer.
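The logic of blending models and comparing them with a test-retest ceiling can be sketched in a few lines of Python. The sketch assumes that agreement is measured with a Pearson correlation, which the account above does not specify, and every number in it is invented for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def blend(*model_predictions):
    """Average the predictions of several models, molecule by molecule."""
    return [sum(values) / len(values) for values in zip(*model_predictions)]

# Toy scores for four held-out molecules (assumption, not real data).
observed = [10.0, 40.0, 55.0, 80.0]
model_1 = [20.0, 30.0, 60.0, 70.0]
model_2 = [5.0, 50.0, 45.0, 90.0]
blended = blend(model_1, model_2)  # [12.5, 40.0, 52.5, 80.0]

# Toy repeat ratings: the same molecules scored twice by the same testers.
# Their agreement gives a rough ceiling on how well any model could score.
first_rating = [30.0, 70.0, 20.0]
second_rating = [35.0, 60.0, 25.0]
ceiling = pearson(first_rating, second_rating)

for name, preds in [("model 1", model_1), ("model 2", model_2), ("blend", blended)]:
    print(name, round(pearson(observed, preds), 3))
print("test-retest ceiling", round(ceiling, 3))
```

In this toy case the blend tracks the observed scores more closely than either model alone, because the two models err in opposite directions. Averaging is not guaranteed to help in general.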

Eric Block of the University at Albany in New York, who was not involved, is impressed. ‘Keller and his international team have taken a major step forward in decoding how the brain interprets messages from the nose by using a very large dataset to connect molecular features to perceived odour with high precision,’ he says. Block adds a note of caution, however. ‘Given that seven-eighths of the key information on receptor–odorant pairing still remains unknown, with many surprises likely, such as the emerging role of metals in olfaction, reverse engineering of smell may be on the horizon, but it’s not yet in hand.’