Six tips for keeping your results in order

Source: © Maxime Gé/Ikon Images

Organising your data well makes it easier to share and build upon

The start of a new project is an exciting time, and it’s often tempting to leap right in and begin collecting results immediately. But taking the time to plan how you’re going to organise and manage your future data is a vital first step in any project, according to Clair Castle, research data manager at the University of Cambridge, UK. ‘It’s time well spent and an investment in your research,’ she says. ‘It will really make life easier for yourself and others later.’

Here the experts share their tips to help you get organised from the outset.

Plan for publication

The ultimate goal is to publish all research, so it’s worth considering what you will need to make this happen. ‘There’s a lot of pressure to finish at the end of a project so think about how to help your future self. How will you share the data and what will you need to publish or hand over to someone else?’ says Castle. While the exact requirements will vary depending on the field of research, the data must above all be complete, accessible and understandable.

Simply sharing processed data is insufficient, says Pol Hernández Lladó, a postdoctoral research associate in medicinal chemistry at the University of Oxford, UK. Processing results is a helpful way of presenting complex information at project milestones, but the subjective interpretation involved can introduce human error. ‘Always make sure that the raw data is findable. The text and the discussion may change, but the raw data never will,’ advises Hernández Lladó.

Create a standardised system

‘There’s no one-size-fits-all with data organisation – it’s more about following sets of principles and best practices,’ says Castle. ‘At the beginning of your project decide how you’re going to structure your folders, name files, and handle versions of files.’ Provided you remain consistent, how you choose to organise files is entirely a matter of personal choice. Many research groups have existing conventions for file naming and storage, so it’s worth checking with colleagues whether a system is already in place. For those creating their own organisational structure, Castle recommends keeping filenames concise but informative and providing metadata to help other users navigate your saved information.
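
As an illustration of what a consistent naming scheme might look like in practice, here is a minimal Python sketch. The pattern it uses (date, project, sample, description, version) and the function name make_filename are assumptions made for this example, not a convention recommended by Castle; any scheme works as long as it is applied consistently.

```python
import re
from datetime import date


def make_filename(project, sample_id, description, version, ext, when=None):
    """Build a filename from a fixed set of parts (hypothetical convention)."""
    when = when or date.today()

    def slug(text):
        # Lowercase the text and collapse anything that is not a letter
        # or digit into a single hyphen, so names stay short and portable.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    parts = [
        when.isoformat(),      # ISO date sorts chronologically in a file browser
        slug(project),
        slug(sample_id),
        slug(description),
        f"v{version:02d}",     # zero-padded so v02 sorts before v10
    ]
    return "_".join(parts) + "." + ext.lstrip(".").lower()


print(make_filename("ProjX", "cmpd 017", "NMR 1H", 2, ".CSV", date(2024, 3, 5)))
# prints 2024-03-05_projx_cmpd-017_nmr-1h_v02.csv
```

Generating names with a helper like this, rather than typing them by hand, is one way to keep the scheme consistent across a whole project.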

Prepare appropriate metadata

Metadata is data about data and provides information to help you and others interpret files in the correct context. Relevant information could be anything from the name of the creator to key experimental details or IP and licensing restrictions, and it will influence how the data is used, shared and interpreted. ‘I have one huge spreadsheet where I track everything. It includes a unique identifier for each compound and the properties measured, as well as the location of the raw data,’ says Hernández Lladó. ‘Particularly for collaborative projects you often get data from different sources so having a clear summary helps for handing over and publishing the final outcome.’
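
A tracking sheet of the kind Hernández Lladó describes could be kept as a plain CSV file. The sketch below is one possible layout, not his actual spreadsheet: the column names, the compound identifiers, the values and the file paths are all invented for illustration.

```python
import csv
import io

# Hypothetical columns: one row per measured property of a compound.
FIELDS = ["compound_id", "property", "value", "units", "raw_data_path", "creator"]


def write_index(records, handle):
    """Write the tracking sheet, header row first."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)


def find_raw_data(handle, compound_id):
    """Return the raw-data locations recorded for one compound."""
    reader = csv.DictReader(handle)
    return [row["raw_data_path"] for row in reader
            if row["compound_id"] == compound_id]


# Made-up example records; in practice the handle would be a file on disk.
records = [
    {"compound_id": "CMPD-017", "property": "melting point", "value": "142",
     "units": "degC", "raw_data_path": "raw/cmpd-017/dsc_run1.csv", "creator": "A. Researcher"},
    {"compound_id": "CMPD-018", "property": "logP", "value": "2.1",
     "units": "", "raw_data_path": "raw/cmpd-018/hplc_run3.csv", "creator": "A. Researcher"},
]
sheet = io.StringIO()
write_index(records, sheet)
sheet.seek(0)
print(find_raw_data(sheet, "CMPD-017"))
```

Recording the raw-data location next to each measured property is what lets a collaborator trace any processed number back to its source file.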

Back up, back up, back up

‘Rule number one: back up your data!’ says Castle. Most institutions have a secure internal storage system, which Castle recommends taking advantage of if it’s available. How frequently you choose to back up will depend on the precise stage of your project, but experts typically recommend once a week as a minimum. ‘Think about how much it would be bearable to lose before it became too onerous or even impossible to recreate that data. There may be critical points in your project where you need to back up more frequently,’ Castle adds.
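
For researchers without an automated backup service, a copy-and-verify step can be scripted. The following is a minimal sketch using only the Python standard library; the function names and the idea of labelling each backup (for example with the date) are assumptions for this example, and an institutional storage system, where available, remains the better option.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Checksum a file in chunks so large raw-data files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source_dir, backup_root, label):
    """Copy source_dir to backup_root/label and verify every file.

    Returns the backup location and a list of files whose copies are
    missing or differ from the originals (empty if all is well).
    """
    source_dir = Path(source_dir)
    target = Path(backup_root) / label
    # copytree raises FileExistsError if the target already exists,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(source_dir, target)
    mismatches = []
    for original in source_dir.rglob("*"):
        if original.is_file():
            copy = target / original.relative_to(source_dir)
            if not copy.is_file() or sha256_of(copy) != sha256_of(original):
                mismatches.append(str(original.relative_to(source_dir)))
    return target, mismatches
```

A call such as back_up("project_data", "/path/to/backup/drive", "2024-03-05") (both paths hypothetical) would produce a dated copy, and an empty list of mismatches indicates that every file was copied intact.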

Use an electronic lab notebook

Electronic lab notebooks (ELNs) provide this security automatically, and John Jolliffe, project manager at research data consortium NFDI4Chem in Germany, advises all researchers to use them if possible. ‘ELNs automatically apply the FAIR principles (findable, accessible, interoperable, reusable) to your datasets and organise them into existing workflows,’ he explains. ‘They also make the data machine-accessible for machine learning. Real-time AI assistance will become more common in the future and will reduce the number of experiments needed, accelerating science.’ Although ELNs are already common in industry, many academic institutions have not yet adopted them as standard. However, a number of free services are available, and NFDI4Chem offers guidance on choosing the most suitable system for your research.

Ask for help

Getting started with data management can feel daunting, but Jolliffe says you don’t have to figure it out alone. ‘My top tip would be to find the experts at your institution or within your country,’ he says. Academic librarians will know your institution’s best practices for handling research data, while departmental IT teams can advise on any technical resources available. Independent organisations such as the Physical Sciences Data Infrastructure can also offer support to scientists looking for specialist advice. But it’s your colleagues who are likely to have the most relevant insights for your work. ‘They’re facing the same challenges as you so sharing knowledge within your group is really valuable,’ says Castle.