Inside the race to map the coronavirus

Public health and academic laboratories across the country are mapping tens of thousands of genetic sequences of the coronavirus in a first-of-its-kind effort to track the way the virus is spreading and mutating.

The unprecedented sequencing effort, called Sequencing for Public Health Emergency Response, Epidemiology and Surveillance, or Spheres, is organized and overseen by the Centers for Disease Control and Prevention (CDC). Scientists said it will help them understand how the coronavirus transmits between people, and how well treatments and nonmedical interventions are working to slow its spread.

It is also a reflection of a massive leap in medical technology, one that has given laboratories in all 50 states the ability to quickly sequence a genome. As the coronavirus began spreading across the country in February and March, the CDC's Office of Advanced Molecular Detection began coordinating those labs to maximize the amount of data they were generating.

“There were a number of academic centers, there were a number of clinical centers, there were a number of commercial laboratories all starting to sequence [the coronavirus] in a public health context,” said Duncan MacCannell, who heads the Spheres program. “We needed to get them rowing in roughly the same direction.”

The Spheres program includes 37 state and local public health laboratories, 25 academic institutions, 22 corporate labs and nonprofit public health and research laboratories like the Broad Institute, the Gates Foundation, the Fred Hutchinson Cancer Research Center and the Chan Zuckerberg BioHub.

Those labs have already sequenced thousands of samples. The World Health Organization said last week that scientists around the globe had uploaded about 40,000 samples onto an internet database; about 7,000 of those samples are from the U.S.

The coronavirus is an RNA virus composed of about 30,000 nucleotides, MacCannell said, a relatively short genetic code. Research has shown that it accumulates a new change in its RNA, or a mutation, about once every two weeks. Mapping those changes can give scientists clues about where a specific line of transmission came from.
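
As a rough illustration of that arithmetic, the sketch below treats the two figures quoted above (about 30,000 nucleotides and about one mutation every two weeks) as constants. The function names and the assumption of a perfectly steady, clock-like rate along a single chain of transmission are simplifications made for this example, not a description of how the Spheres labs analyze their data.

```python
# Back-of-the-envelope "molecular clock" arithmetic.
# Both constants are the approximate figures quoted in the article;
# the steady-rate assumption is an illustrative simplification.
GENOME_LENGTH = 30_000      # approximate number of nucleotides
WEEKS_PER_MUTATION = 2      # roughly one new mutation every two weeks


def expected_mutations(weeks_elapsed: float) -> float:
    """Mutations expected along one chain of transmission after a given time."""
    return weeks_elapsed / WEEKS_PER_MUTATION


def weeks_of_spread(mutations_observed: int) -> float:
    """Rough time needed to accumulate the observed number of mutations."""
    return mutations_observed * WEEKS_PER_MUTATION


def fraction_of_genome(mutations: float) -> float:
    """Share of the genome that those mutations represent."""
    return mutations / GENOME_LENGTH


if __name__ == "__main__":
    print(expected_mutations(20))     # mutations expected after 20 weeks
    print(weeks_of_spread(10))        # weeks implied by 10 mutations
    print(fraction_of_genome(10))     # share of the genome changed
```

Under these assumptions, a lineage that has circulated for about 20 weeks would carry roughly 10 changes, a tiny fraction of the genome but enough to tell chains of transmission apart.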

“What you can do with some of these tools, once you have the sequence of the virus, is you can actually follow the trail of breadcrumbs backwards. You use that to understand how they got from point A to point B,” MacCannell said. “It really helps us understand how the virus is transmitting. We can understand where the virus is and how it’s getting there.”

Sequencing viral genomes became a major tool in the fight against the largest Ebola outbreak in modern history, in three impoverished West African countries in 2014 and 2015. There, scientists used the genomes to identify specific clusters of the Ebola virus, giving epidemiologists the ability to track and trace chains of transmission to understand how the virus was spreading.

Today, the Spheres program has already shed new light on how early the virus gained a foothold in North America. While President Trump touts travel restrictions on China and the European Union, it has become clear that those limitations came far too late and were far from sufficient to stop the virus from arriving on American soil.

In Oregon, a state with relatively few confirmed coronavirus cases, scientists have identified at least 13 different cases in which the coronavirus was introduced. Some of those strains appear to have migrated south from British Columbia and Washington state, probably as a result of people traveling from China. Others have come from the New York area, likely imported from travelers returning from Europe.

“It's pretty clear that there was a chain that spread throughout the Pacific Northwest in the early phases in March. We see in Oregon and Washington and British Columbia and Saskatchewan, that is kind of our major transmission chain right now,” said Brian O'Roak, who heads the genomic laboratory at Oregon Health & Science University (OHSU), which is participating in the Spheres program. “The chains that have been developing on the East Coast, as those have diverged so that we can tell them apart, multiple different chains of the virus have migrated to Oregon and introduced themselves again.”

Other states have tallied dozens or hundreds of introductions on their own. After Trump announced a travel ban for Europe, thousands of Americans — some of whom were infected with the virus — raced to board planes headed home, seeding the virus even more widely across the country.

“You’re seeing these stories of multiple introductions. It really shows the interconnectedness of the world,” MacCannell said.

The rapid expansion of sequencing technology has enabled scientists to map a genome in hours, where it once took months or years. What scientists are now finding is a virus that has not significantly mutated over the course of its spread. Whereas a virus like influenza can evolve rapidly, the coronavirus has not done so thus far.

Only about 10 nucleotides out of the 30,000 or so seem to be different between specimens, said Ben Bimber, a research assistant professor at OHSU who is working in O’Roak’s lab.
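
Counting the differences between two specimens can be sketched in a few lines. The example below is a minimal illustration that assumes the two sequences have already been aligned to the same length; the short sequences and the function name are hypothetical and are not real SARS-CoV-2 data or any lab's actual pipeline.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned sequences carry different nucleotides."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Hypothetical toy sequences; a real genome has about 30,000 positions.
reference = "AUGGCUACGUUAGC"
specimen = "AUGGCUACGAUAGC"

if __name__ == "__main__":
    print(count_differences(reference, specimen))
```

Specimens that share the same small set of differences are likely to belong to the same chain of transmission, which is what makes even a handful of changes useful for tracking.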

“The coronavirus does have more mechanisms, more proofreading to prevent mutations, so it does mutate less than some RNA viruses,” Bimber said. “It definitely does evolve within a time that we can measure. It's not totally clear what that's going to mean for the pathogenesis across the population, but it does give us something to help track it.”

Keeping an eye on how quickly and how substantially the virus mutates will also allow scientists to adjust treatments or vaccines. 

“If there does happen to be a new variant that evolves that is important for our resistance or how infectious the virus is, we're able to see that in real time,” O'Roak said. “At some point, there could be a new variant that would really make a divergence between the current SARS-CoV-2 strain and a new strain. But so far we haven't identified anything like that.”

MacCannell said understanding more about individual specimens of the coronavirus gives scientists a powerful new set of data-driven tools to track its spread. Scientists have been able to track some lines of transmission all the way back to initial outbreaks in China, he said.

“The pandemic, SARS-CoV-2, has been a powerful story of open data,” MacCannell said. “The more we share data, the more types of data are available for public health and research, the better prepared we are, both for this one and the next one.”