Riding the RAE rollercoaster

UK academics will soon be bracing themselves for the 2008 research assessment exercise, the last of its kind before a hotly debated metrics system takes over.

The UK’s Research Assessment Exercise (RAE) is due for an almighty shake-up. For two decades, the RAE has used peer review to assess the quality of university research in preparation for Treasury funding. Yet in his 2006 budget and 2007 pre-budget reports, chancellor Gordon Brown announced that quantitative measures - ’metrics’ - would be used to assess research quality and guide funding across the sciences from 2009 (see Chemistry World, January 2007, p13).

Figure: RAE game playing can cause mass movement of academics at key times (Source: © Stockphot)

The metrics policy, first officially signalled in the government’s Science and innovation investment framework 2004-2014, has attracted much heated speculation. Likely measures of quality include a department’s research income, its numbers of PhD students, and its citation rates, but nothing has been confirmed in detail. The Higher Education Funding Council for England (Hefce) is due to report to government by 30 September on its progress in working out the new arrangements, and will consult on them this year.

Governments outside the UK are taking an interest too: Australia, for example, is planning to introduce its own metrics-like assessment system.

Meanwhile, the traditional peer review system is permitted one last appearance, in a modified 2008 assessment, which will inform Treasury funding for the sciences up to 2010-11.

For chemists, a shift to metrics may bring welcome relief from the time-consuming and onerous RAE, first praised but increasingly criticised for its effect on UK research. Previous RAEs, held in 1986, 1989, 1992, 1996, and 2001, graded university departments’ research quality on what is now a nominal seven point scale. The ratings were awarded by panels of experts who read through research papers submitted by every academic a department chose to enter. It now takes departments at least two man years to prepare all their papers for the RAE, estimates Paul O’Brien, head of chemistry at the University of Manchester, UK.

Figure: According to Hefce, the RAE system has improved UK research quality

For their pains, universities receive government cash in proportion to their departments’ RAE ratings; £1.45 billion of so-called ’quality-related’ funding (QR) will be doled out in 2007-08. While researchers can also bid for around £1 billion from the UK’s seven research councils for specific projects, and can obtain money from charities or industry, the QR is a stable source of income that can be spent on anything at all, from supporting blue-skies research to maintaining laboratory infrastructure.

Figure: There is a strong correlation between a chemistry department’s RAE rating and metric indicators

According to Hefce, the resulting departmental drive to boost RAE ratings has improved UK research quality, as measured by the country’s international citation performance. Yet a report by the House of Commons science and technology select committee concluded in 2004: ’The operation of the RAE has been detrimental to the provision of science and engineering in the UK’.

Marketing research

What damaging effects has the RAE had? Inevitably, universities have used RAE ratings as a management tool, leaving academics distinctly uncomfortable. With the focus on competition has come a market-oriented approach to research. Although research-active chemists have gained from better pay, says David Clary of the University of Oxford, UK, they are whisked away in transfer windows to contribute to other departments just before RAEs are judged; O’Brien, for instance, moved from Queen Mary, University of London, UK, to Imperial College London for one RAE, then to the University of Manchester, UK, for the next. As a result, ’people talk about planning research for the RAE, not for knowledge generation,’ laments Eric Thomas, vice-chancellor of the University of Bristol, UK. Meanwhile, Paul Walton, head of chemistry at the University of York, UK, feels that the RAE encourages departments to prize short-term quantity over research quality. Michael Sterling, vice-chancellor of the University of Birmingham, has also criticised the RAE’s inhibition of interdisciplinary research and its lack of support for teaching and applied research.

Particularly under fire has been the game playing adopted by departments to ensure as high a rating as possible. The transfer market for academics, for example, has become ’unnatural’, says Thomas, with the star researchers moving to the more highly-ranked universities on a cycle mirroring the RAE; far fewer jobs are available during the in-between years, says Clary. Other academics suggest that departments have also spent much thought choosing which researchers to send forward to the RAE; holding back more teaching-oriented academics, for example, or deciding to submit fewer, more highly rated researchers rather than send in many average ones.

Even the peer-review panels appeared to be playing games: in 2001, 56 per cent of physics departments were awarded the two top grades (5 and 5*), while only 42 per cent of chemistry departments and just 12 per cent of environmental science departments achieved the same distinction. O’Brien suggests this difference indicated not that physics in the UK was in a healthier state; rather that chemists are often self-critical by nature, and a pessimistic chemistry panel was stricter with what constituted a top rating - supposed in principle to be a standard cross-discipline definition of national or international quality research. ’Chemists circled the wagon and fired the arrows inwards,’ he says. The lower panel grades meant funding was more selectively directed to a few highly-ranked departments, and also reduced chemistry departments’ esteem in the eyes of student applicants and grant awarders.

Peer review’s last stand

The 2008 RAE will try to mitigate some of these problems, though it is still based on panel peer review. Its solutions: a two-tier panel structure and an entirely new grading system. Chemistry now gets a 15 member sub-panel, which carries out the research assessments but reports to a super-panel overseeing chemistry, physics, earth and environmental sciences. Julia Higgins, chair of this super-panel and principal of the faculty of engineering at Imperial, explains that the changes were made to keep panels talking to each other, avoiding the problems of separate grade standard definitions across different subjects, and encouraging more focus on interdisciplinary research. Three international members on the super-panel, for example, help define the meaning of research of ’international excellence’ (the highest grade) across all subjects; previously, international advisers sat isolated from each other in smaller subject panels. ’Discussion at every stage is the key,’ Higgins says.

30 September 2007: Deadline for Hefce to report to government on progress in developing new metrics system, in time for the 2008 pre-budget report
30 November 2007: Deadline for 2008 RAE submissions
December 2008: 2008 RAE quality profiles (results) published
February 2009: Funding allocations from 2008 RAE announced. Shadow metrics system tested against 2008 RAE results
2010-11: Metrics system begins to inform funding in the sciences
2013: First ’lighter touch’ peer review assessment for non-science subjects, probably informing funding in 2014-15

More significantly, the once all-important departmental RAE rating has been discarded. It is replaced with a ’quality profile’, showing the individual scores (rated from unclassified to 4*) for every academic submitted by a department. In principle, that should make it easier for a selective funding system to discriminate more gently between departments, on a sliding scale connected to the profile, rather than via drastic weighting differences between crude 4, 5 or 5* ratings. It should also reduce the departmental impact of the loss or recruitment of one big name researcher, says Jeremy Sanders, chair of the chemistry sub-panel, and encourage departments to submit more academics to the assessment.

The exact relation between quality profile and subsequent government funding will determine whether the 2008 RAE succeeds in such intended reforms. But departments won’t find that out until February 2009, after they have been assessed. That’s because Hefce can’t make a final decision until it sees both the research quality profiles and the block grant allotted by the Treasury, says Paul Hubbard, Hefce’s head of research policy.

Taking pot luck

This frustrating lack of information highlights many researchers’ biggest gripe with the RAE, in both past and present incarnations. Although its results inform government funding, the crucial connection between the two is never clear beforehand. Even RAE panel members, for example, were surprised by the impact of their grading decisions on departments’ subsequent quality-related income. ’In 2001, it was never made clear how the RAE grading was going to be used - that 5* departments were going to get substantially more money than 5,’ one member told Chemistry World.

Hefce’s decision to selectively reduce QR funding year on year to lower-graded departments was partially to blame for the closure of chemistry departments at King’s College London and Exeter, says Higgins. They both missed out on the top grades (then, 5 and 5*) in the 2001 RAE, and were shut down in 2003 and 2004 respectively. ’In 2003, we did reduce funding for the 4 rating relative to 5, but it was all we could do with the money available. We were working with a fixed pot of funding and wanted to protect the higher ratings,’ says Hefce’s Hubbard. As Bristol vice-chancellor Thomas points out, this was unfortunately damaging to smaller universities that couldn’t afford to sustain their chemistry departments until the next RAE.

Part of Hefce’s problem is that it can only distribute a limited amount of cash. The total QR for each subject is decided by two factors: the intrinsic cost of the subject’s research (chemistry is in the highest band), and the numbers of research-active staff in all departments. If chemistry staff volumes drop, there is lower funding all round, regardless of each department’s quality. That, says Sterling, shows up a lack of government policy to sustain the pot for subjects like chemistry, which are ’strategically important to UK plc’. Back at Hefce, Hubbard says: ’if there were severe and well founded concerns about the national volume of research in chemistry, then we would do something about it.’ Meanwhile, Rama Thirunamachandran, Hefce’s director for research and knowledge transfer, points out that the government can strategically support science research in other ways: for example, it recently announced plans to spend £75 million over three years on university science education (Chemistry World, December 2006, p2).

Metric conversion

The need for a central funding strategy based on policy is one reason why Birmingham’s vice-chancellor Sterling supports the new metrics system for research quality assessment. It makes it easier, he says, for the government to encourage departments to focus research in one direction. For example, applied research (neglected, some feel, by the peer-review RAE system) could easily be encouraged by increasing the relative importance of applied research grant income towards a department’s RAE rating.

For many others, including the government, saving time and money are the most persuasive reasons to shift to a metrics-based assessment. As Alan Johnson MP explained to the House of Commons select committee on education and skills: ’When I was Higher Education minister [2003-04], I was amazed, quite frankly, that we spent all this money and took all this time - something like 82 different panels and committees - on something that could be done much more quickly.’ Hefce has estimated that the cost to institutions of the 2008 RAE will be at least £45 million. And because each RAE takes so long to administer, it can never hope to be timely and must always reward past performance. Hefce’s QR funding this year is still apportioned according to the 2001 RAE ratings, which in turn were based on research papers published as far back as 1996. A metrics system, by contrast, could run every year.

Chemistry could fit well into a system of metrics. Historically there’s been a strong correlation between a chemistry department’s RAE rating and simple metric indicators such as research income. As Sanders points out, even the current peer review system uses metrics such as a department’s grant income or spending on facilities. ’Metrics will work by replacing expensive, time-consuming, wasteful and inaccurate panel-based re-reviewing of already peer-reviewed research, using quantitative measures - many of which have already been shown to correlate with the panel-based rankings,’ thunders Stevan Harnad of the University of Southampton.

Quality by numbers

Not all academics take such a cut-and-dried view. Higgins feels that some element of peer review will always be needed. ’Metrics can’t evaluate a chemistry department having five Nobel Prize winners all doing theoretical chemistry - with no students and little applied output, but the most brilliant department in the country,’ she says. The hypothetical caricature raises some interesting issues: how would a metrics system measure esteem and scholarship, for example? High citation rates could mean researchers published scholarly research, or that they published reviews of the literature, or had the fortune to publish in controversial areas. Journal impact factors, similarly, may not reliably indicate research quality. ’There is no substitute for expert judgement in assessing research being undertaken by a university department,’ says Ole Petersen, chair of the Royal Society working group on the RAE.

Figure: Thinking hats on: could metrics take away some of the pain of the RAE?

Such objections hold especially true for non-science subjects, which face a mysterious ’lighter-touch’ peer review system under the government’s proposed changes. But Sean McWhinnie, science policy manager for the Royal Society of Chemistry, cautiously welcomes the new system. Some element of peer review could still lie behind a metrics-based system in the sciences, he points out. For example, expert panels could decide which metrics would best summarise a subject, and how their relative importance might be weighted; problematic metrics such as citation rates could be regularly reviewed.

Departmental game-playing - massaging of submitted research papers to apparently improve one’s rating - will no doubt continue. But that may be unavoidable under any selective funding assessment system, and a sufficiently large mixture of metrics could minimise the problem, says McWhinnie.

An assessment system based purely on metrics is not the only possibility. McWhinnie suggests that, for example, metrics could monitor university departments’ progress until a significant quality change is spotted, at which point peer reviewers could step in for a more detailed analysis. ’If I was king for a day, I’m not sure what system I’d choose,’ says O’Brien.

Chemists will have to wait to see whether the RAE’s successor truly provides its promised cost and efficiency benefits, without compromising on research quality. The first hints might be provided by a shadow metrics exercise, which will run just after the 2008 RAE for comparison with the peer review results. Aldo Geuna, of the University of Sussex, who has studied many different countries’ research evaluation systems, suggests that while any performance-based funding system’s initial benefits outweigh its costs, over time the system seems to produce diminishing returns, as competing departments start learning to play the game. More important than any particular assessment strategy - be it metric or peer review based - is the amount of funding it can distribute. The more cash is invested in science education, the easier it is to ensure a high quality of research.

International quality

The UK is a world leader in defining research assessment, says Aldo Geuna. Other European countries have followed its lead, developing assessment schemes steering various paths between peer review and metrics based systems. The Netherlands, notably, uses an RAE-like assessment to drive up the quality of university research, but not to inform funding. New Zealand’s ’Performance-based research fund’ (PBRF), meanwhile, has been in place for several years, using peer review mingled with metrics to measure quality.

Australia is now introducing a metrics-related RAE, called the Research Quality Framework (RQF). The RQF initiative was announced by prime minister John Howard in May 2004, as part of the $8.3 billion (£3.3 billion) Backing Australia’s ability package. ’Why do we need it now? The overwhelming view amongst universities and academics is that we don’t and that it is a waste of many millions of dollars,’ says Arthur Sale, who works in the University of Tasmania’s computing department.

The aim, again, is to selectively allocate resources to the highest quality departments; the RQF replaces a ’research quantum’, which counted the volume of departments’ research grants, publications and degrees and allocated some funds as a result. Full guidelines are to be released in June 2007, but the current plans suggest that assessment panels will give submitted research from each department a ’quality’ and, separately, an ’impact’ rating, with the ’quality’ rating based on citations and other metrics. The new system will be far more comprehensive than the old research quantum, but it is likely to contain more peer review, not less; a reversal of the UK’s situation. First university assessments are due 30 April 2008, to affect 2009 funding.

In the US, mention of an RAE receives only blank looks. But as Julia Higgins points out, most big US universities use endowments to fulfil their basic research needs - these effectively take the place of UK universities’ block government grant.