A new, robust format for standardising complex NMR data has just been published by an international team of chemists led by Damien Jeannerat at the University of Geneva in Switzerland. The new format, called NMReDATA, structures the data from NMR experiments so that it is easily readable by both humans and machines.
In order to use NMR spectra to identify unknown molecules, organic chemists collect large amounts of raw data such as chemical shift values, scalar coupling constants and 2D correlations and use these values to assign a chemical structure. Without a standard method for processing this data, however, each researcher uses their own system. This makes it all but impossible to re-use data from another lab. Add in the fact that datasets are often incomplete or have been generated with outdated technology and it’s easy to see the dilemma faced by researchers – either redo the work of others or hope that the assignments are accurate.
‘This is something that has been talked about at every NMR conference,’ says Nicholle Bell, research fellow at the University of Edinburgh, UK. Bell uses NMR to identify compounds found in mixtures and sees real value in a system that instantly provides her with the relevant data and the parameters under which it was acquired. ‘I could then take that data, cross-validate it with my data and say, yes, that molecule is present and move on in minutes rather than hours.’ Aside from preventing frustration, such a system would allow labs working in partnership with public health agencies to rapidly confirm the presence of suspect molecules in environmental or food samples.
The new format works with existing file formats by adding tags that correspond to the information essential for assigning chemical structures to unknown molecules. NMReDATA can be integrated into new releases of NMR software packages, so users need only choose this format when saving their data. Any researcher can then open these files and quickly validate the assigned structure or reproduce the experimental conditions.
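To give a feel for what tag-based data looks like to software, here is a minimal Python sketch. It assumes, purely for illustration, that the host file is an SD-file-style text file in which each data item starts with a `> <TAG>` line, is followed by its value lines and ends at a blank line, with `$$$$` closing the record. The function name `read_tagged_fields`, the tag names and the sample values are hypothetical and should not be read as the published specification.

```python
def read_tagged_fields(text, prefix="NMREDATA_"):
    """Collect SD-file-style data items whose tag starts with `prefix`.

    Returns a dict mapping each matching tag to its list of value lines.
    """
    fields = {}
    current_tag = None  # tag whose value lines are currently being read
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("> <") and stripped.endswith(">"):
            tag = stripped[3:-1]
            # Only keep tags carrying the (assumed) NMR prefix.
            current_tag = tag if tag.startswith(prefix) else None
            if current_tag is not None:
                fields[current_tag] = []
        elif stripped == "" or stripped == "$$$$":
            # A blank line ends a data item; "$$$$" ends the record.
            current_tag = None
        elif current_tag is not None:
            fields[current_tag].append(stripped)
    return fields


# Made-up sample record: tags and values are illustrative only.
SAMPLE = """\
> <NMREDATA_VERSION>
1.0

> <NMREDATA_ASSIGNMENT>
a, 1.25, 1
b, 3.60, 2

> <COMMENT>
not an NMR field

$$$$
"""

fields = read_tagged_fields(SAMPLE)
for tag, values in fields.items():
    print(tag, values)
```

Because the NMR information sits in extra tags, software that does not know about them can simply skip them, which is one way a tagged format can stay compatible with existing files.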
‘Unlike other fields like x-ray or protein NMR, no central database exists for NMR data of organic compounds and natural products,’ explains Jeannerat, something he believes the widespread acceptance of NMReDATA might change. ‘We already have a lot of important companies, journal editors and labs on board.’ For him, though, the key is ‘to move forwards at the right pace’ in order to avoid user frustration. ‘We are open to refining the format, even if major changes will be difficult because we need to avoid compatibility problems.’ The team hopes the software will be available and in widespread use in a year’s time.
18 June 2018: paragraph three was edited to reflect Nicholle Bell’s comments.
M Pupier et al, Magn. Reson. Chem., 2018, DOI: 10.1002/mrc.4737