Philip Ball asks how much of the published literature you should believe. Not much, by some accounts.

A 2005 paper by epidemiologist John Ioannidis, then of the University of Ioannina School of Medicine in Greece, had the stark title ‘Why most published research findings are false’.1 Ioannidis claimed that ‘for most study designs and settings, it is more likely for a research claim to be false than true’, and that often published claims simply reflected the prevailing bias of the field. Ioannidis suspected that some ‘established classics’ in the literature wouldn’t stand up to close scrutiny.

His focus was on biomedical research, in particular clinical trials of drugs, where inferences have to be made from complex statistics, perhaps with small sample sizes. Here, not only might the effects being sought be rather marginal but there are also strong biases and prejudices introduced by financial pressures. Reports of drug trials certainly do have a bias towards positive outcomes, prompting valid calls for all drug trials to be registered before they are undertaken so that negative findings can’t be quietly dropped.

These problems with pharmaceutical research are in themselves troubling for some chemists. But is this mostly an issue for big pharma, with its distorting profit motives and its reliance on statistics rather than more reductive, step-by-step experimentation? Probably not, Daniel Sarewitz of the Consortium for Science, Policy and Outcomes at Arizona State University, US, claimed in Nature last May.2 According to Sarewitz, systematic error due to bias – whether conscious or not – is ‘likely to be prevalent in any field that seeks to predict the behaviour of complex systems: economics, ecology, environmental science, epidemiology and so on’. This figures: all these fields tend to depend on statistical inference of often marginal effects operating through mechanisms that may be poorly understood and perhaps nigh impossible to delineate.

But what about the subjects we like to think of as the ‘hard sciences’ – like most of chemistry? Surely you can place more trust in spectra and rate constants and crystal structures than in scatter plots? Perhaps – but ‘trust’ is often what it is. Not many studies are ever repeated verbatim, and it’s generally acknowledged that crystallographic databases are probably full of errors, if only minor ones. The chance of an experiment being replicated is probably proportional to the significance of its results. Maybe the greater good doesn’t suffer much from a literature full of flawed but uninteresting work – but that would offer scant support for science’s supposedly self-correcting nature.

And problems do crop up on close examination. Take, for example, the recent attempt by Darragh Crotty and colleagues at Trinity College Dublin, Ireland, to replicate the claims of Russian biochemist Anatoly Buchachenko and his coworkers, who since 2004 have been documenting (in good journals) the influence of a weak magnetic field on the rate of enzymatic production of ATP. The Russians report that millitesla magnetic fields can more than double the reaction rate when the phosphorylating enzymes contain ²⁵Mg (which is magnetic) rather than the other two stable isotopes, ²⁴Mg and ²⁶Mg. Crotty and colleagues set out to test this because it bore on controversial claims of physiological effects from weak electromagnetic fields. They found no difference in reaction rate among the three magnesium isotopes. So far the discrepancy remains puzzling.

If this is indeed a wider problem than is commonly recognised for all sciences, what to do? Sarewitz suggests reducing hype and strengthening ties between fundamental research and real-world testing. Ioannidis implores researchers to be honest with themselves about the ‘pre-study odds’ of their hypothesis being true. This purging of preconception and self-deception is what Francis Bacon called for in the 17th century when he argued that natural philosophers seeking truth must free themselves from ‘idols of the mind’. But as Ioannidis recognises, changing mindsets isn’t easy.
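To see why the ‘pre-study odds’ matter so much, it helps to put numbers on them. What follows is a minimal sketch of the simplest, bias-free form of the relation in Ioannidis’s paper, in which the chance that a claimed positive finding is true depends on the pre-study odds R, the false-positive rate α and the statistical power 1 − β. The function name and the default values (α = 0.05, power = 0.8) are illustrative assumptions, not figures taken from this column.

```python
def positive_predictive_value(pre_study_odds, alpha=0.05, power=0.8):
    """Probability that a claimed positive finding is true, ignoring bias.

    pre_study_odds: ratio of true to false hypotheses being tested (R).
    alpha: false-positive rate of the test (assumed 0.05 by default).
    power: chance of detecting a true effect, 1 - beta (assumed 0.8).
    """
    true_positives = power * pre_study_odds  # true hypotheses correctly detected
    false_positives = alpha                  # false hypotheses wrongly 'confirmed'
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical scenarios: a well-motivated test, a speculative one,
    # and a speculative one run with a small, underpowered sample.
    for odds, power in [(1.0, 0.8), (0.1, 0.8), (0.1, 0.2)]:
        ppv = positive_predictive_value(odds, power=power)
        print(f"pre-study odds {odds:g}, power {power:g}: PPV = {ppv:.2f}")
```

On this simplified account, a positive result is more likely true than false only when the power multiplied by the pre-study odds exceeds α, which is why long-shot hypotheses tested on small samples fare so badly.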

Another perspective is offered by psychologist Brian Nosek of the University of Virginia, US, and his colleagues.3 They point out that professional success for scientists relies on publishing, but publication both favours positive results and prefers novelty over replication. What is needed is a way to rescue scientists’ ostensible aim – getting it right – from their short-term, pragmatic aim – getting it published. Among things that won’t work, the authors say, are journals devoted to replications and tougher peer review (which can already display stifling conservatism). Instead we need metrics for evaluating what is worth replicating, journal editorial policies that focus on soundness rather than ‘importance’, less focus on sheer publication productivity for job and tenure applicants, lower barriers to publication (so that it becomes less coveted in itself), and in particular, new ways of releasing results: open access to data, methods, tools and lab books. One can find problems with all of these, but the old ways of science publishing are looking increasingly archaic and flawed. What have we got to hide?