The cost of publishing problems could put off a generation of researchers

I chose to study for a PhD because I enjoy the tantalising unknown of research. I never imagined it could be so demoralising. Today, students approaching an unfamiliar topic are expected to grapple with a literature so vast, incomprehensible and unreliable – at times bordering on the outright ridiculous – that reading can be a real chore. It’s not the students who are at fault: it’s the quality of the research.

Scientific publishing is awash with problems of reliability and reproducibility. Papers are littered with data that contradict their own titles, jargon in place of carefully written conclusions, and elaborate schematics illustrating potential applications unsubstantiated by the evidence presented. To cite just a few examples: there is a reproducibility problem with the CO2 adsorption of MOFs;1 perovskite solar cells have unreliably reported stabilities and lifetimes;2 and – looking more widely – a 2016 survey published in Nature revealed that over 70% of 1576 researchers had been unable to repeat another scientist’s experiment.3

These issues combine with an obsession with metrics – Research Excellence Framework (REF) returns, h-indices and the ‘impact’ of our research – to create a hyper-competitive system in which volume of publication counts above all else. While the rise of predatory journals with severely lacking peer-review procedures is a worry, the real challenge is the poor quality of some research published even by the most reputable publishing groups.

Perhaps it has always been thus. But it’s still difficult to shake off the worry that we may be dealing with the fallout from these issues for generations to come.

Applied limits

The need to identify an application in every piece of work is a major driver of these problems. Traditionally, it was enough for a paper to contain a single key finding and an honest admission of an experiment’s flaws. Today, we are bombarded with gigabytes of supplementary information spanning multiple techniques, all trying to justify the utility of the discovery. This approach rarely acknowledges the limitations of the techniques used or the need for complementary analysis to support the claims. With such a range of techniques and specialities, can a mere two reviewers ever really interrogate the data they are sent with the rigour we would like?

The issues are most acute in emerging technologies. While the requirements for publishing a new synthetic procedure benefit from decades of established method, such rigorous standards do not yet exist in the rapidly growing fields that have sprung up to capitalise on funding aimed at solving global challenges. A system that encourages people to chase the latest buzzwords – regardless of their prior knowledge or experience – seems an inefficient approach to problem solving.

Assembling data sets of sufficient size and rigour is also proving a challenge for machine learning – where the old adage ‘garbage in, garbage out’ has never been truer. Consider the recent case of a machine learning approach being used to predict the yields of cross-coupling reactions.4 The researchers found the existing literature was insufficient to provide the data required, largely because failed experiments go unreported. Instead, a high-throughput platform was used to complete 5000 new experiments and generate the data from scratch.
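To make the ‘garbage in, garbage out’ point concrete – and this is only an illustrative sketch, not the workflow of reference 4 – the toy Python example below trains a generic regression model on invented ‘reaction descriptor’ data and then repeats the exercise with the failed, low-yield experiments withheld from training, mimicking a literature that never reports negative results. All descriptors, yields and thresholds here are made up for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Hypothetical reaction descriptors (stand-ins for electronic/steric parameters)
# and yields, mimicking a high-throughput data set of 5000 experiments.
X = rng.uniform(size=(5000, 12))
weights = rng.normal(size=12)
y = np.clip(X @ weights * 20 + rng.normal(scale=5, size=5000) + 40, 0, 100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Model trained on the full data set, failures included.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print(f"R^2 with all experiments: {r2_score(y_test, model.predict(X_test)):.2f}")

# 'Garbage in, garbage out': discard the low-yield (failed) reactions, as an
# unreported-negative-results literature effectively does, and predictions over
# the full yield range deteriorate.
keep = y_train > 40
biased = RandomForestRegressor(n_estimators=200, random_state=0)
biased.fit(X_train[keep], y_train[keep])
print(f"R^2 without failed experiments: {r2_score(y_test, biased.predict(X_test)):.2f}")
```

The exact numbers are meaningless; the point is that a model can only be as trustworthy as the data it is fed, which is why generating complete, well-curated data sets matters.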

Solving the problem

Several solutions to this problem have been put forward. First, if papers are going to be more interdisciplinary, drawing on multiple techniques and specialisations, then comments and feedback on published research could be used more widely. This would allow the review process to continue after publication and give a wider range of people with different skills the chance to critique the methodology.

Unfortunately, how to facilitate this is far from settled: anonymous comments are a contentious issue, while requiring individuals to identify themselves often discourages researchers from engaging. One compromise could be verified user accounts, where researchers comment anonymously in public while a third party holds their identity – which could be disclosed in the event of an upheld complaint. Of course, this has its own disadvantages, but a bridge between the two sides is surely needed before all discourse ceases for fear of legal action or career reprisals.

It would also be refreshing to see less focus on ‘impact’ as ‘the potential to make someone lots of money’. There is nothing wrong with aiming for a useful application, but most articles claiming such breakthroughs fall short of genuine use. Reliable database entries, 3D printer designs and open-access computer code would be far more worthwhile contributions.

For most young researchers, the ultimate goal of entering chemistry – particularly academia – is to publish and share their ideas. If the only reward for that is dejection and further isolation, then it’s hard to view the system as anything other than broken. I still hope it can be fixed, but right now I, and many other young researchers, have to ask: without reliable publishing, what’s the point of a scientific career?