Nature Reviews Microbiology 11, 150 (March 2013) | doi:10.1038/nrmicro2979
GENOME WATCH
Sherlock Genomes — viral investigator
This month's Genome Watch highlights how deep sequencing technologies have vastly reduced the time and prior knowledge needed to generate viral genomes.
Deep sequencing technologies and state-of-the-art bioinformatics techniques have revolutionized the way that RNA viruses, a notoriously variable group of pathogens, can be identified and characterized. Traditionally, specific information about the viral genome was required to design primers for the reverse transcription and amplification of viral RNA prior to sequencing. In addition, a reference genome was often used to assemble short read sequences into a complete genome. However, recent methodological advances have negated some of these prerequisites.
Gall et al. (ref. 1) developed a method to generate full genomes of HIV-1. They designed a 'pan-HIV-1' reverse transcription PCR primer set, based on sequences available in the Los Alamos HIV Sequence Database, that could give rise to four overlapping amplicons from any HIV-1 virus. The addition of multiplex identifier (MID) adaptors (tags that enable the source of each read to be identified) meant that many samples could be sequenced simultaneously on deep sequencing platforms. By using de novo assembly of short reads, the authors showed that this method could generate full genomes for viruses within all four major HIV-1 genetic groups. This shows the potential of deep sequencing for high-throughput studies involving HIV-1 genomes with a broad range of sequence diversity.
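To make the role of MID tags concrete, the following is a minimal sketch of how pooled reads might be sorted back to their source samples. It is not the pipeline used by Gall et al.: the tag sequences, sample names and the `demultiplex` function are hypothetical, and the sketch assumes that every tag has the same length, sits at the very start of the read and is matched exactly (real tools usually tolerate sequencing errors in the tag).

```python
from collections import defaultdict


def demultiplex(reads, mid_to_sample):
    """Group reads by the MID tag found at the start of each read.

    Assumes all tags share one length and are matched exactly.
    Reads with an unknown tag are collected under "unassigned".
    """
    mid_length = len(next(iter(mid_to_sample)))
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:mid_length], read[mid_length:]
        sample = mid_to_sample.get(tag, "unassigned")
        # Store the read with its tag trimmed off, ready for assembly.
        by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical 6-base tags and sample names, for illustration only.
mids = {"ACGTAC": "patient_A", "TGCATG": "patient_B"}
pooled_reads = [
    "ACGTACGGTCTCTCTGG",
    "TGCATGTTAGACCAGAT",
    "ACGTACAAGCCTCAATA",
    "GGGGGGCCCCCCAAAAA",  # tag not in the table
]

grouped = demultiplex(pooled_reads, mids)
for sample, sample_reads in grouped.items():
    print(sample, len(sample_reads))
```

Once the reads are separated in this way, each sample can be assembled independently, which is what allows many samples to share a single sequencing run.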
The high mutation rate of HIV-1 means that an infected individual can be host to many viral genome variants. Some of these can confer drug resistance, which in HIV-1 can involve mutations in different genes along the whole genome. The protocol described above provides sufficient read depth to detect low-frequency genotypes, which facilitates the analysis of rare mutations associated with drug resistance. Capillary sequencing, by contrast, offers low read coverage.
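The link between read depth and the detection of rare variants can be illustrated with a short sketch. It takes the bases observed at a single genome position across all aligned reads and reports the fraction contributed by each base. The function name, the 1% reporting threshold and the example counts are assumptions chosen for illustration, not values from the study, and a real analysis would also need to separate true minority variants from sequencing errors.

```python
from collections import Counter


def variant_frequencies(bases_at_position, min_fraction=0.01):
    """Return the fraction of reads supporting each base at one position.

    Bases observed in fewer than `min_fraction` of reads are dropped.
    The threshold is an illustrative assumption, not a recommended value.
    """
    counts = Counter(bases_at_position)
    depth = sum(counts.values())
    return {
        base: count / depth
        for base, count in counts.items()
        if count / depth >= min_fraction
    }


# Hypothetical position covered by 1,000 reads: 980 carry A, 20 carry G.
deep_column = "A" * 980 + "G" * 20
frequencies = variant_frequencies(deep_column)
print(frequencies)
```

With 1,000 reads covering the position, a variant present in 2% of the viral population is supported by about 20 reads. A method that yields only a single consensus base per position cannot resolve a variant at that frequency.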
Even when almost nothing is known about the agent of a viral infection, deep sequencing technologies can identify a pathogen and produce the full genome, fast. In 2012, a Saudi Arabian patient was admitted to hospital suffering from acute severe pneumonia of unknown cause. Preliminary diagnostics indicated that he was infected by a coronavirus (CoV), a single-stranded RNA virus that can infect many species, with bats being an important zoonotic reservoir. Six CoV species have been detected in humans, although only one was known to cause severe disease: severe acute respiratory syndrome (SARS)-CoV.