Crowdsourcing image analysis to better understand how viruses multiply inside cells
Scientists love data, but sometimes there’s too much to handle. Fortunately, a collaboration between the citizen science platform Zooniverse, the UK’s national synchrotron facility and the University of Oxford, UK is demonstrating the power of the public when it comes to the analysis of large data sets.
The Science Scribbler: Virus Factory project brings together researchers who are trying to better understand the lifecycles of viruses by looking deep inside infected cells and exploring the interaction of viruses with their hosts. The synchrotron team at Diamond Light Source are using electron beams to explore the insides of cells infected by reoviruses – a large family that infect animals and plants but don’t typically cause any symptoms.
‘We wanted to start with reovirus because it’s a tamer virus and doesn’t have some of the containment level issues of other viruses, and also because it’s recently started to be used to help with cancer treatment,’ says Michele Darrow from Diamond Light Source. The team use a technique called cryo-electron tomography to take images of multiple cross sections within a cell and then piece them back together to form a 3D volume for analysis.
First, it’s necessary to prepare a sample. The researchers grow and then infect cells on a 3mm disc called a grid. After a fixed time period, the cells are frozen very quickly to prevent the formation of crystals that could interfere with the imaging. The frozen cells are too thick for an electron beam to penetrate, so a beam of gallium ions is used to blast away parts of the cell and reveal a layer that’s around 100nm thick and primed for cryo-electron tomography.
The trimmed-down cell can be imaged in less than an hour, but the data produced from just one sample could take a researcher a month to annotate fully. That’s why the researchers are relying on the power of the public, Darrow explains. ‘For an individual researcher, it becomes an insurmountable task to mark out every single virus and also segment the entire cellular components behind the viruses. We absolutely need citizen scientists to do this project.’
The team invite members of the public to visit their online project and ‘segment’ the image data by making marks to designate one region of the image as distinct from another. Launched in early 2019, the first part of the segmentation is already complete. More than 1000 volunteers have finished identifying and circling viruses in the data set and are now busily classifying the virus types by comparing them to images provided by the researchers.
The great thing is that anyone can get involved, Darrow says. ‘This task is really nice for work with citizen scientists, because you can see a visual difference without necessarily knowing what that corresponds to. So, that makes it much easier for people to get involved in the project – they don’t have to be biology experts or chemistry experts or anything like that. They just need to be able to look at images and say “this looks different to this” and then maybe place some marks to indicate that to us.’
The team have worked hard to ensure that the data generated by citizen scientists is robust. The large images generated in cryo-electron tomography are cut into smaller pieces that are easier to analyse, and each part of the larger image features a number of times in the data set so that different citizen scientists look at the same image and hopefully reach a consensus.
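To make the idea of consensus concrete, here is a minimal Python sketch of one way marks from several volunteers on the same image piece might be combined. It is an illustration only, not the project's actual aggregation method: the function names, the pixel radius and the minimum number of agreeing volunteers are all assumptions chosen for the example.

```python
from math import hypot


def centre(points):
    """Return the mean (x, y) position of a list of points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def consensus_marks(marks_by_volunteer, radius=10.0, min_votes=3):
    """Group nearby marks and keep groups supported by enough volunteers.

    marks_by_volunteer: dict mapping a volunteer id to a list of (x, y) marks.
    radius and min_votes are hypothetical thresholds, not project values.
    Returns the centres of clusters marked by at least min_votes volunteers.
    """
    clusters = []  # each cluster holds its points and the volunteers behind them
    for volunteer, marks in marks_by_volunteer.items():
        for x, y in marks:
            for cluster in clusters:
                cx, cy = centre(cluster["points"])
                if hypot(x - cx, y - cy) <= radius:
                    # Close enough to an existing cluster: count it as agreement.
                    cluster["points"].append((x, y))
                    cluster["volunteers"].add(volunteer)
                    break
            else:
                # No nearby cluster: this mark starts a new one.
                clusters.append({"points": [(x, y)], "volunteers": {volunteer}})
    return [
        centre(c["points"])
        for c in clusters
        if len(c["volunteers"]) >= min_votes
    ]


if __name__ == "__main__":
    marks = {
        "volunteer_a": [(50, 50), (200, 200)],
        "volunteer_b": [(52, 49)],
        "volunteer_c": [(48, 51)],
    }
    # Only the position marked by all three volunteers survives.
    print(consensus_marks(marks))
```

In this sketch a mark made by a single volunteer is discarded, while a position that three different people marked independently is kept, which is the basic reason for showing each image piece to several volunteers.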
This crowdsourced approach is helping to speed up analysis of the vast amounts of data generated by the researchers each day, and also to create training data for artificial intelligence systems that may be able to perform the data segmentation process in the future.
But Darrow doesn’t think that humans will be replaced entirely. ‘It’s unlikely that we’re going to be able to fully replace the human eye or the human interaction with the data in a lot of these tasks. What we want to do is make it easier.’
Enter the Zooniverse
Funded by The Wellcome Trust, the researchers are also integrating some of their own software into the Zooniverse platform. The software uses machine learning to speed up segmentation: researchers can simply mark a dot inside a shape and rely on the software to find the edges. The team hope that anyone who wants to start a segmentation project on Zooniverse will be able to do more science, faster.
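The idea of marking a dot and letting software find the edges can be illustrated with a much simpler classical technique, seeded region growing. The Python sketch below is not the team's machine learning software; it is a stand-in under simplifying assumptions (a greyscale image held as a list of rows, and a hypothetical intensity tolerance) that shows how a single click can expand into a full region.

```python
from collections import deque


def grow_region(image, seed, tolerance=10):
    """Expand from a seed pixel to all connected pixels of similar intensity.

    image: 2D list of greyscale values (rows of pixels).
    seed: (row, column) of the pixel the user clicked.
    tolerance: hypothetical maximum intensity difference from the seed pixel.
    Returns the set of (row, column) positions in the grown region.
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours (up, down, left, right).
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


if __name__ == "__main__":
    # A dark background (20) containing a bright 2x2 object (200).
    image = [
        [20, 20, 20, 20, 20],
        [20, 200, 200, 20, 20],
        [20, 200, 200, 20, 20],
        [20, 20, 20, 20, 20],
    ]
    # One click inside the object recovers all four of its pixels.
    print(sorted(grow_region(image, seed=(1, 1))))
```

Real cryo-electron tomography data is far noisier than this toy image, which is one reason a learned model is likely to find edges more reliably than a fixed intensity threshold.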
This software will also help with the next stages of the Virus Factory project. Once all of the viruses in the dataset have been classified, citizen scientists will concentrate on cellular components in the data. Because the cellular components have more variance in size and shape, it may be more challenging to first segment and then classify each component. Hopefully, the segmentation software will allow volunteers to simply scribble in the centre of structures and rely on an algorithm to extend their scribbles to the edge of the structure and trace the outline. Then, once all viruses and cellular components have been segmented and classified, the data set will be analysed together. ‘That’s the whole goal,’ explains Darrow. ‘We want to be able to see viruses within the cell context so we can better understand their lifecycle and how they interact with the cell to make virus factories.’