Biochemists and other researchers who apply for funding from the US National Institutes of Health (NIH) will have to include comprehensive data management and sharing plans in their grant applications from 25 January. These will be formal strategies for managing, preserving and sharing scientific data, as well as the accompanying metadata.

The new rule, which is generating some concern within the research community, replaces the NIH’s existing data sharing policy, which has been in place since 2003 and applies only to those seeking at least $500,000 (£419,200) in direct costs from the agency in any given year. The original regulation required researchers to submit a plan describing how they will share the underlying data or, if they cannot share it, why not.

By contrast, the latest policy affects all NIH grants, regardless of specific budget. It will apply to competing grant applications, proposals for contracts and other funding agreements submitted to the NIH on or after 25 January.

The agency will now mandate that researchers describe their strategy to share scientific data needed to ‘validate and replicate’ their research findings, whether or not the data is used to support scholarly publications.

‘The new policy is broader, deeper and more detailed,’ says Stephen Jacobs, director of the Open@RIT initiative at the Rochester Institute of Technology, US, which supports open academic work of all types. ‘Any scientific data needed to replicate research findings must be made available now, whereas before you just had to provide access to the data that would support your publications.’

Part of the concern is that the new NIH policy defines data too broadly. ‘There is worry that maybe any note that researchers are going to scribble down has to be included,’ Jacobs adds.

Lack of resources, expertise

Cameron Cook, University of Wisconsin-Madison’s data and digital scholarship manager, who also chairs the school’s Research Data Services, agrees that the NIH’s original rule was ‘lighter weight’, and was more concerned with research data security.

‘On our campus, it affected fewer of our researchers because it had applied [only] to genomics researchers and those in other fields who were asking for more than $500,000,’ she explains. ‘So, this new data management and sharing policy is a big cultural shift for people that weren’t subject to it, or that were thinking about it in different ways, because now we’re asking those researchers to really consider data sharing from the get-go.’

Many researchers feel overwhelmed and lack the resources or expertise needed to fulfil the requirements of the NIH’s new policy, according to Cook. ‘For many of them, the language of data management and sharing is new, and so that’s overwhelming,’ she explains.

In a perfect world, Cook says, a grant would cover the cost of complying with the agency’s new data management and sharing plan. ‘But that’s always in tension with your other research cost requests and the rest of your grant proposal,’ she adds, pointing out that de-identification and anonymisation services can be extremely expensive.

‘Until there is a generation of scientists who are used to doing this, or data management professionals who are part of scientific teams or staff at university offices of sponsored projects, meeting the new policy requirements will be a lot more work for the researchers themselves,’ Jacobs warns.

He and Cook agree, however, that the new NIH policy is a good model that could set the standard for scientific data sharing in the US and potentially globally. ‘The plan is thoughtful and forward-thinking … and I think it is a good example of what we can expect from other agencies,’ Cook tells Chemistry World.

‘It’s a very exciting thing,’ she continues. ‘I think there’ll be some pain for the next couple of years, as we all figure out how to comply with this and other policies, but hopefully this will be a cultural shift and people will start to become comfortable within a few years and it’ll just be part of our workflows.’

A major impetus behind the push for open science, which includes open data, is the large number of preprints and the data sharing that came out of the global scientific community during the Covid-19 pandemic. Another incentive for open science efforts is addressing the reproducibility crisis.

Experts note that many EU countries are ahead of the US on open science policy overall, including data management, mostly because they can – at least on a country-by-country basis – mandate such rules in a centralised manner, unlike in the US. ‘In general, we are seeing much broader plans for open science across-the-board in Europe,’ Jacobs states. He cites the French and German national plans for open science as particularly comprehensive.