In December 2019, an unusual respiratory illness emerged in Wuhan in China. Initial cases were linked to a ‘wet market’ selling seafood and wild animals. Now, over a year on with more than 99 million confirmed cases and nearly 2 million deaths worldwide, Covid-19 is the worst global public-health crisis since the 1918 flu pandemic. As it continues to disrupt our lives, many questions remain unanswered about Sars-CoV-2 – the virus behind the disease. We still don’t know exactly why it affects some people so much more drastically than others or how long immunity might keep it at bay.

But given the virus was discovered only a year ago, scientists already know an astonishing amount about it. Within days of its detection, researchers, clinicians and academics began churning out studies that would help governments to slow the wave of community infection and allow tests, drugs and vaccines to be developed at unprecedented speed. To date more than 287,000 Covid-related papers have been published.

‘There have been some very difficult days so far in the pandemic but seeing the research community come together to work towards a common goal has been really exciting,’ says Jason Kindrachuk, who studies emerging viruses at the University of Manitoba, Canada. ‘Being able to look at the genome sequencing in early January [2020] gave us some very quick glimpses into what this virus was, how it might be spread and what receptor it might use to gain access into our cells.’

Pinning down the pathogen

Efforts to sequence the virus kicked off at the end of December 2019, when the Chinese Center for Disease Control and Prevention (CDC) first stepped in. Hospitals in Wuhan were receiving growing numbers of sick people with serious pneumonia-like symptoms, including fever, dry cough and difficulty breathing. Doctors noticed similarities to severe acute respiratory syndrome (Sars), a coronavirus infection that broke out in China between 2002 and 2004. Sars killed 11% of the 8422 people known to contract it. That virus was brought under control but no vaccine or antiviral exists for it. Chinese authorities were understandably alarmed.

CDC scientists worked around the clock to identify the mystery pathogen, first extracting RNA from fluid samples taken from the lower respiratory tract of several patients.1 The researchers ran the pathogen’s RNA through a polymerase chain reaction-based test that can identify 22 known respiratory pathogens. The test drew a blank. It had to be something new.

To find out what, the extracted RNA was used as a template to clone complete genomes of the culprit. These were then sequenced and compared with known pathogens. The results revealed it was a novel coronavirus with an 85% sequence match to the virus that causes Sars. It was also 96% similar to a non-human coronavirus found in bats, offering clues as to where it first evolved. Within days of the CDC initiating its emergency response, the first draft genome sequence was posted online. Other sequences swiftly followed. Researchers around the world could now begin unpicking what makes this new pathogen tick.
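A figure such as the ‘85% sequence match’ is a percent identity: the share of aligned positions at which two sequences carry the same base. The sketch below shows that calculation under simplifying assumptions. It expects two sequences that have already been aligned to equal length, and the 20-base fragments are invented for illustration; they are not real viral sequences, and genome-scale comparisons rely on dedicated alignment tools rather than a loop like this.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned positions where both sequences carry the same base.

    Assumes the sequences are already aligned: equal length, with '-' marking
    a gap. Positions where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" and base_b == "-":
            continue  # nothing to compare at this position
        compared += 1
        if base_a == base_b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 20-base RNA fragments, made up for this example only.
toy_new_virus = "AUGGCUAGCUUAGGCUAACG"
toy_known_virus = "AUGGCUAGCAUAGGCUUACC"

print(f"{percent_identity(toy_new_virus, toy_known_virus):.0f}% identical")
```

The two toy fragments differ at three of their 20 positions, so the script reports 85%.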

Coronavirus clues

Knowing it was a coronavirus, and one that closely resembled the Sars coronavirus, gave scientists a head start. ‘As soon as we heard about this virus, we got ready to work with it,’ says Andrew Davidson at the University of Bristol. He has studied coronaviruses for the past 20 years and at the beginning of the pandemic worked in one of the few UK labs equipped to contain such hazardous pathogens. ‘Because it is so similar to the Sars coronavirus we could tap into all that prior knowledge.’

Coronaviruses have been studied for decades. Most of them persist within animal populations, particularly birds and bats, but six were already known to infect humans. Four of these invade our upper airways, producing mild common cold-like illnesses. But the two more recent and lethal varieties to emerge prefer to replicate in our lungs, where they can be very damaging. Sars is one of them. The other is Middle East respiratory syndrome (Mers), first seen in 2012 and the most lethal – 35% of the 2468 confirmed cases resulted in death.

‘This new virus seems to fit in a niche between mild coronaviruses that cause the common cold and the more lethal Sars and Mers coronaviruses,’ says Davidson. ‘It seems to be spread more easily from the upper respiratory tract but because it can infect the lower airways it can cause severe disease too.’ This double whammy of infectiousness and potential severity is why it has caused such a problem. Although less lethal than Sars and Mers coronaviruses, Sars-CoV-2 has been so infectious that the number of deaths is far higher.
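The contrast between lethality and total deaths can be checked with the figures quoted in this article. The short sketch below is only a back-of-envelope calculation: the Covid-19 totals are rounded (‘more than 99 million’ cases, ‘nearly 2 million’ deaths), the Sars and Mers death counts are estimated from the quoted percentages, and the dictionary layout and function name are choices made for this example.

```python
# Figures quoted in this article. The Covid-19 totals are rounded, so every
# result here is approximate.
outbreaks = {
    "Sars": {"cases": 8422, "fatality_rate": 0.11},
    "Mers": {"cases": 2468, "fatality_rate": 0.35},
    "Covid-19": {"cases": 99_000_000, "deaths": 2_000_000},
}


def summarise(figures: dict) -> tuple[float, float]:
    """Return (estimated deaths, deaths as a fraction of confirmed cases)."""
    cases = figures["cases"]
    if "deaths" in figures:
        deaths = figures["deaths"]
        return deaths, deaths / cases
    rate = figures["fatality_rate"]
    return cases * rate, rate


for name, figures in outbreaks.items():
    deaths, rate = summarise(figures)
    print(f"{name}: about {deaths:,.0f} deaths, {rate:.1%} of confirmed cases")
```

On these numbers, roughly 2% of confirmed Covid-19 cases have ended in death, against 11% for Sars and 35% for Mers, yet the far larger case count puts total Covid-19 deaths in the millions rather than the hundreds.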

Source: © Science Photo Library

Transmission electron micrograph of Sars-CoV-2. The virus’s spike proteins create the distinctive corona that gives this class of virus its name

Coronaviruses belong to a diverse family of viruses that have at their core a small set of genes encoded on a single strand of positive-sense RNA. Fatty lipid molecules form a ball-shaped shell that encases this RNA. This explains why washing hands with soap or using alcohol-based sanitisers was quickly recommended: both break down lipids, destroying virus particles. Sars-CoV-2’s outer surface is studded with club-like protrusions called spike glycoproteins. It’s these carbohydrate-containing proteins that give it a corona under the microscope. These spikes are the key to how these viruses gain entry to host cells. ‘Our top priority was to create a stabilised spike protein for use as a vaccine antigen,’ says Jason McLellan at the University of Texas. He has studied coronavirus spikes since 2013. ‘The spike structure is important for understanding the antibody response to natural infection and vaccination.’

With the virus’s genome sequenced, scientists could predict which proteins it encodes and begin to understand the spikes. Genetic analyses and computer modelling early on hinted that the cell entry mechanism of Sars-CoV-2’s spike proteins was much like the Sars virus’s. In Sars, the virus’s spikes were known to stick to a claw-like receptor protein called Ace2. This is involved in regulating the cardiovascular system and is found in high levels on the surface of our lung cells. But it’s also on cells in our upper airways, the heart, kidneys and digestive tract. Once the virus latches on, the spike proteins split apart allowing the virus’s lipid layer to fuse with its host’s cell membrane. The virus then releases its RNA into the cell, effectively hijacking the cell’s machinery to produce more virus particles. These new virions disperse from the dead cell – a process called viral shedding – to infect more cells or be transmitted to other people via respiratory droplets.

Efficient entry

Several structural studies soon confirmed independently that Ace2 is indeed how Sars-CoV-2 enters our cells. However, when McLellan’s lab stabilised the spike protein and mapped its 3D structure using cryo-electron microscopy – barely a month after the genome was sequenced – it discovered that Sars-CoV-2’s spikes bind 10 to 20 times more tightly to Ace2 than those of the original Sars virus.2

Not long after, Jian Shang and colleagues at the University of Minnesota identified why Sars-CoV-2 bound more tightly. First, a binding domain on the spike has a more compact shape that fits more snugly to Ace2. Second, the spike has certain amino acid residue changes that stabilise two viral binding hotspots.3 This offers clues as to why Sars-CoV-2 transmits so well. The easier it is for the virus to latch on to a cell, the better it can infect and spread.

‘It is astounding to think how exquisitely important this relationship between a viral protein and a host receptor is to community transmission and pre-symptomatic infection,’ says Kindrachuk. ‘I think back to those early days of the pandemic when we were talking of outbreaks on cruise ships, and the role of close quarters in spread, and to then overlay how quickly we were able to determine the central role of the spike protein aligning with Ace2 as a key event in transmission.’

Meanwhile, other studies, including one that Davidson collaborated on, have independently found that Sars-CoV-2 binds to another protein receptor on our cells called neuropilin-1 (NRP1).4 ‘It’s not the main receptor but it could certainly help the virus get into cells more effectively,’ Davidson explains. The study he worked on showed that while deleting NRP1 from the cell doesn’t prevent the virus entering, its presence gives the virus an extra foothold to latch on to cells. ‘Other studies have shown that neuropilin-1 seems to be at enhanced levels in respiratory tract and olfactory systems [where the virus initially enters our body],’ he says. According to another study, this receptor could also help the virus invade the nervous system, which could help explain the loss of taste and smell that’s been reported in many Covid-19 cases. These symptoms are also thought to be associated with the abundance of Ace2 and a protease – an enzyme which breaks down proteins – called TMPRSS2 that maintain the integrity of olfactory sensory neurons.

This protease is another requirement for Sars-CoV-2 to infect our cells. Studies by German researchers showed that the virus needs TMPRSS2 and another protease in particular. It was previously known that the Sars virus depends on TMPRSS2 from host cells to activate the spike after it has bound to Ace2. This activation allows the spike to split and alter its shape, exposing a site that enables the virus to fuse with and enter the cell.

However, the German team discovered that the Sars-CoV-2 virus also depends on a furin protease to help split the spike more easily.5 This occurs at a furin cleavage site on the spike, which is missing in the original Sars virus, suggesting another reason why Sars-CoV-2 is so much more infectious. Scientists think that the fact that this enzyme is found throughout the body could be a key factor that enables the virus to infect both lower and upper airways, as well as other parts of the body.

Now, as more transmissible mutant strains are circulating, knowledge of the spike protein and how it might mutate is more valuable than ever to suppress the virus. Davidson suggests the virus is now so widespread that eradication is unlikely. Rather, it will probably become seasonal, requiring new vaccines each year, like the flu. But the fact that vaccines now exist after just over a year is testament to what scientists can achieve.

‘In thinking back to those early days in January 2020, the most important questions at that point were “What was this virus and how is it being transmitted?”’ says Kindrachuk. ‘These assessments of the spike protein have informed the rational design of vaccines that have literally moved from bench to bedside in less than a year. That is something that is truly historic.’

Correction: The passage on the CDC scientists’ detection of the virus was updated on 27 January 2021