‘As with many hidden criminal syndicates, you don’t always know what’s happening,’ says Retraction Watch’s Ivan Oransky about paper mills. They are the biggest organised fraud perpetrated on scientific journals ever, eroding scientists’ trust in the publishing system – and in each other.

While plagiarism and fraud aren’t new – individual researchers have been caught photoshopping electron microscopy images or inventing elemental analysis data – paper mills serve up professional fakery for their customers on an industrial scale. Buyers can apparently purchase a paper, or authorship of one, on any topic based on phony results to submit to a journal. This not only makes paper mills harder to detect and crack down on, but also greatly increases the damage they could do.

The extent of their operations became apparent in early 2020. Two independent groups of image detectives came across a number of manuscripts, all from different authors at different institutions working on different biomedical topics, that seemed to share strange inconsistencies – as if they had all used the same stock images. The set now contains almost 600 manuscripts. Another set of 125 was discovered only a few months later. And there could be 10 times as many professionally manipulated papers that have not yet been – and might never be – found, estimates science integrity consultant Elisabeth Bik.

Image manipulation or use of stock images at this scale has never been seen before, says Sabina Alam, director of publishing ethics and integrity at Taylor & Francis (T&F). The Biochemical Society’s Portland Press called it a ‘new and acute pandemic of falsified information’, having rejected over 600 manuscripts suspected to originate from paper mills in less than a year.

Shadowy outfits

Paper mills are the essay mills of the scientific publishing world. Where a student might buy a ghostwritten term paper for a few hundred pounds in the hope of improving their mark, academics can pay up to 100 times as much for a research paper – with publication guaranteed – and pretend it is their own work. Whether the research within those papers is entirely made up, has been copied from genuine studies or is manipulated in other ways remains unclear. It is one of the many unknowns when it comes to paper mills.

Even something as simple as locating a paper mill is difficult. Catriona Fennell, director of publishing services at Elsevier, found one business that seemed to offer authorship for sale. According to its website, it was based in London, UK. ‘But then the phone number was from Vietnam, and the servers were in Pakistan. Which government are you even going to report them to?’ Fennell recalls researchers saying ‘that some of these paper mills don’t even have websites, they go around the university drumming up business face to face’.

In 2013, both Science and The Economist described large-scale authorship selling operations for the first time. ‘It’s unbelievable: you can publish SCI papers without doing experiments,’ read a banner on a website of one of these businesses uncovered by Science. SCI indicates journals indexed by Thomson Reuters’ Science Citation Index (SCI), which seem to be particularly sought after.

When contacted by journalists, many businesses claimed that they offered only legitimate services such as proofreading, editing and translating. ‘At some point, either those companies or other companies realised that there was even more money to be made in selling manuscripts themselves,’ suggests Oransky.

Some journals found that once an article was accepted, the researchers requested significant authorship changes – to the extent that the final set of authors had none in common with the set that submitted the work. Nevertheless, the corresponding author’s email address sometimes remained unchanged despite supposedly belonging to a different person. ‘That, to us, was a big indication that [the paper] had been sold, maybe to a higher bidder,’ Alam says. ‘Maybe once it’s got accepted, the price goes up.’
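
The warning sign described here, a wholesale change of authors with an unchanged contact address, can be written down as a simple check. What follows is only an illustrative sketch, not any publisher’s actual screening tool: the function name `flag_authorship_swap` and the shape of the records are hypothetical, and the name matching is deliberately naive.

```python
def flag_authorship_swap(submitted_authors, final_authors,
                         submitted_email, final_email):
    """Return True when the author list was replaced wholesale while the
    corresponding author's email address stayed the same.

    Hypothetical helper for illustration: names are compared after
    lower-casing and trimming only, so real name variants (initials,
    transliterations) would need more careful matching.
    """
    before = {name.strip().lower() for name in submitted_authors}
    after = {name.strip().lower() for name in final_authors}
    no_overlap = before.isdisjoint(after)
    same_email = submitted_email.strip().lower() == final_email.strip().lower()
    return no_overlap and same_email


# Made-up records, for illustration only
print(flag_authorship_swap(["A. Author", "B. Writer"],
                           ["C. Buyer", "D. Client"],
                           "contact@example.org", "contact@example.org"))
print(flag_authorship_swap(["A. Author", "B. Writer"],
                           ["A. Author", "E. Newcomer"],
                           "contact@example.org", "contact@example.org"))
```

A flag of this kind would only be a prompt for an editor to look more closely; legitimate authorship changes do happen.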

Stock paper

What makes paper mills so hard to detect is that their manipulations are subtle and almost impossible to spot by looking at an individual manuscript. Their work often only becomes apparent when comparing several papers. ‘If we go back around 10 years, the common thing was that the image itself within the paper was manipulated – maybe they’ve flipped it or they’ve changed the contrast,’ says Alam. And often, attempted fraud came from the same researchers or institutions.

But T&F’s first brush with milled papers, which eventually led them to investigate hundreds of papers, found perfect looking images replicated across several manuscripts – all with unrelated authors and covering different topics. ‘What we thought was that these could be stock images,’ explains Alam.

This modus operandi lets paper mills fly under the radar of editors’ knowledge of image manipulation and standard plagiarism detection software. ‘They make sure that they avoid things like obvious text plagiarism,’ says Fennell. ‘One example of what we saw is graphs presented in a really similar way across papers with different authors. It appeared that the same template was used for graphs showing different things, with different axis or data labels,’ explains Laura Fisher, executive editor at the Royal Society of Chemistry’s journal RSC Advances. ‘On its own a graph would look legitimate, but when you start seeing almost identical graphs across a series of different papers it raises suspicions.’

In the case of the large paper mill discovered in early 2020, the image detectives came across suspect western blot images. Western blot is a common technique to detect specific proteins in samples taken from cells or tissues. ‘This was not a classical duplication, where the western blot bands themselves had been duplicated, it was the background,’ recalls Bik. What should be random noise showed a pattern replicated across hundreds of biochemical papers focused on the regulation of various proteins and RNA in cancer cells.
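
Why a shared background is so telling can be shown with a toy calculation. The sketch below is an illustration under stated assumptions, not the method the image detectives used: it works on simulated pixel values rather than real blot images, and the helper names `pearson` and `noise_patch` are invented for this example. Two independently generated noise patches should be essentially uncorrelated, whereas a reused background stays strongly correlated even after slight distortion.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def noise_patch(rng, size=400):
    """Simulated background: a flat list of noisy pixel intensities."""
    return [rng.gauss(128, 10) for _ in range(size)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
background_a = noise_patch(rng)
background_b = noise_patch(rng)  # independently generated noise
# The same background reused, with a little extra jitter added on top
background_reused = [p + rng.gauss(0, 1) for p in background_a]

independent = pearson(background_a, background_b)
reused = pearson(background_a, background_reused)
print(f"independent backgrounds: r = {independent:.2f}")
print(f"reused background:       r = {reused:.2f}")
```

Real images would first need aligning, cropping to band-free regions and correcting for compression, none of which this sketch attempts.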

Source: Retracted 2017 & 2018 Royal Society of Chemistry papers

The two western blots here come from unrelated papers with different authors but the bottom GAPDH row appears to be identical in both.

But retracting a manuscript based on suspicions is not an option. ‘Retraction is a really serious step and irreversible as well,’ says Nicola Nugent, publishing manager of quality and ethics at the RSC. ‘We have to get that right.’ Most publishers follow the Committee on Publication Ethics’ (Cope) strict guidelines on investigating potential fraud – though Alam highlights that Cope’s criteria don’t necessarily cover suspected manipulation on such a large scale.

Publishers’ requests for raw data to confirm the findings reported in the suspect manuscripts seemed to go one of two ways: either the authors didn’t respond at all, or they answered eerily fast even at the height of Covid lockdowns. ‘The other oddity was that all the responses we were getting had a similar tone, using similar language, similar phrases, even though technically they were coming from different authors,’ recalls Alam. What raw data the publishers managed to get was either chaotic – tens of files without clear labels – or simply copies of the images presented in the manuscripts.

Journals’ decisions to retract were rarely disputed. Some authors never broke their silence, while others agreed to a retraction, saying they had themselves found unspecified ‘problems with the data’. In T&F’s case, it seemed that word had got out. ‘We started to be proactively contacted by authors that we hadn’t contacted yet, but [whose manuscripts] fit similar [paper mill] features,’ says Alam. ‘They were contacting us to say that they found problems in their paper, they needed to retract or withdraw it.’

The RSC’s investigations, which took almost a year to complete, led to the retraction of 70 papers. Wiley has removed 55 articles so far, T&F around 40 and Portland Press another 31.

Doctored research

Many of the milled manuscripts found over the course of 2020 came from researchers affiliated with hospitals in China. Just like in many other countries, scientists publish papers to advance their career. But for researchers in China the incentives might be heightened as publications feature heavily in how scientific contributions are assessed. ‘One of the drivers of this so-called publish or perish culture is this need to have publications in peer reviewed journals – a perverse incentive for people to act in this way,’ says Nugent.

For medical doctors in China their entire career might hinge on publishing in an international journal. ‘These are clinicians, they’re not interested in research, they want to help patients,’ says Bik. ‘They’re not given time off in their schedule to do research, and they often don’t work in a hospital that has a research facility, they don’t have money to do research.’

On the For Better Science blog, Tiger BB8, one of a number of pseudonymous image detectives that uncovered hundreds of suspicious papers, describes an email they received that apparently came from a desperate junior doctor. ‘Without papers, you don’t get promotion; without a promotion, you can hardly feed your family,’ the email stated. In between caring for patients and spending time with their children, the writer didn’t have any time left to do research even if they wanted to. ‘The current environment in China is like that,’ they added.

According to Science’s 2013 investigation, doctors pay anything from $1600 (£1150) to almost $15,000 for the privilege of putting their name on a milled manuscript. Authorship on the most prestigious fake papers is valued at around $26,000, which exceeds the annual salary of some assistant professors in China. A 2013 analysis estimates that overall, the ghostwriting industry in China generated $4.46 million in 2011.

Source: Retracted 2018 Royal Society of Chemistry papers

Histologic sections of kidney tissue from two unrelated papers (top/bottom) with no authors in common. Three of the slides are duplicates that have been repositioned and rotated.

In 2017–18, China introduced harsh punishments for researchers involved in fraud, as well as sweeping policy changes to stamp out academic misconduct. Responsibility for investigating and ruling on misconduct cases was transferred from individual institutions to the Ministry of Science and Technology, which drew up a blacklist of ‘poor quality’ journals. The government also announced changes to how academic performance is assessed. However, there’s still little enforcement of policies, finds an analysis by communications scientist Jianping Lu from Zhejiang University, China.

The problem with so much fraud apparently coming from the same country is that it undermines science’s openness and makes researchers more likely to dismiss genuine work from scientists in China. ‘My concern is also that it threatens diversity,’ says Fennell. ‘If people aren’t sure what content to trust, people may start pigeonholing research from certain countries.’

Arms race

As so little is known about paper mills, stopping them at the source is currently impossible. Consequently, publishers are now trying to stop more fake manuscripts from polluting the scientific literature. The RSC has introduced in-house checks for patterns that could indicate manuscripts written using a template, adapted editors’ training and set stricter requirements around data like western blots. Sharing what it’s learned from the paper mill incident with other chemistry publishers, the RSC is also exploring collaborative options like an early warning system for suspected research fraud.

Springer Nature has developed a database to allow for cross-journal interrogation, says the publisher’s research integrity director Suzanne Farley. ‘Pulling information from approximately 2000 journals, this database supports early flagging in the submission system by enabling us to map networks of email addresses, author/peer-reviewer names and article titles associated with suspected paper mill submissions,’ she explains. Publishers are also working on ways to share information on suspected misconduct between them without compromising data protection.
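
The general idea of mapping networks of shared details can be sketched in a few lines. This is a minimal illustration under assumed data, not a description of Springer Nature’s system: the function `cluster_submissions`, the manuscript IDs and the email addresses are all made up, and only email addresses are used as links. It groups submissions that are connected, directly or through a chain, by any shared address.

```python
from collections import defaultdict


def cluster_submissions(submissions):
    """Group submission IDs that are linked by any shared email address.

    `submissions` maps a submission ID to the set of email addresses on it.
    Returns a list of sets of IDs, one per connected group.
    """
    parent = {sid: sid for sid in submissions}

    def find(sid):
        # Follow parent links to the group's representative ID.
        while parent[sid] != sid:
            parent[sid] = parent[parent[sid]]  # path halving
            sid = parent[sid]
        return sid

    first_seen = {}  # normalised email -> first submission that used it
    for sid, emails in submissions.items():
        for email in emails:
            key = email.strip().lower()
            if key in first_seen:
                # Merge this submission's group with the earlier one.
                parent[find(sid)] = find(first_seen[key])
            else:
                first_seen[key] = sid

    groups = defaultdict(set)
    for sid in submissions:
        groups[find(sid)].add(sid)
    return list(groups.values())


# Made-up records, for illustration only
records = {
    "MS-1": {"a@example.org"},
    "MS-2": {"a@example.org", "b@example.org"},
    "MS-3": {"b@example.org"},
    "MS-4": {"c@example.org"},
}
clusters = cluster_submissions(records)
print(sorted(sorted(group) for group in clusters))
# prints [['MS-1', 'MS-2', 'MS-3'], ['MS-4']]
```

Here MS-1 and MS-3 share no address directly but end up in the same group through MS-2, which is the kind of indirect link that is hard to see one manuscript at a time.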

Many publishers work with external image experts or, like Wiley, have set up their own dedicated image teams. ‘These colleagues currently screen images for more than 24 journals,’ says Chris Graf, Wiley’s director of research integrity and publishing ethics. ‘So far, they have screened close to 2000 papers prior to acceptance for publication.’ T&F also piloted software to help spot manipulated data, though Alam notes they have yet to find a program that can spot stock images as well as more straightforward photoshopping.

Nevertheless, there is a risk that paper mill deterrents – such as requiring raw data with each submission – make the already difficult publishing process even more cumbersome for genuine researchers, notes Fennell. ‘We’re always trying to find that balance of not trying to punish the many for the deeds of the few. The last thing you want to do is reject real papers from real authors because they happen to share a characteristic.’

Journals might simply have to face the reality that increasing quality control means publishing fewer papers, says Oransky. He credits those publishers that go through the process of investigating and retracting manipulated manuscripts. ‘It’s the ones that aren’t retracting them, that I’m much more concerned about,’ he says.

Letting fraudulent work become part of the scientific record not only undermines researchers’ trust in each other’s work, but also damages the public’s trust in published data. ‘If someone publishes a paper that says a certain drug can do a certain thing, and someone takes that material further down the research route believing this result, it has the potential to slow down legitimate research,’ says Fisher.

Correction: On 25 May 2021 the name of the pseudonymous detective who received the email from a junior Chinese doctor was changed from Smut Clyde to Tiger BB8.