Volume 23, Number 7—July 2017
Phylogeography of Burkholderia pseudomallei Isolates, Western Hemisphere
Melioidosis is caused by the bacterium Burkholderia pseudomallei and is mainly associated with tropical areas. Although considered endemic to northern Australia and Southeast Asia, it has increasingly been recognized in other regions, such as Central America, South America, and the Caribbean (1–3).
A study by Pearson et al. based on multilocus sequence typing (MLST) data and analysis of single-nucleotide polymorphisms (SNPs) from whole genome sequences indicated that B. pseudomallei originated on the Australian continent because of the high level of genetic diversity seen in isolates from this region. They proposed that B. pseudomallei was transferred to Southeast Asia next, and from there was disseminated to other regions of the world (4).
Although MLST is the most common method to subtype isolates of B. pseudomallei, over time it has become recognized that it lacks the resolution to firmly link an isolate to a specific geographic origin (4–7). Because of homoplasy, the same sequence type (ST) can be found in isolates from different continents that are not truly closely related and instead have different genomic content (7).
To improve the ability to link genetic data with geographic provenance, we previously used a typing scheme for length polymorphisms in the 16S−23S internal transcribed spacer (ITS) of Burkholderia spp. to characterize isolates of B. pseudomallei with associations in the Western Hemisphere. We found that all isolates with a clear origin in the Western Hemisphere were ITS type G (5,8). We also established that the genomes contained the Yersinia-like fimbrial (YLF) gene, which is associated mainly with isolates not from Australia (5,9). Because of limited diversity detected by these methods, we hypothesized that a genetic bottleneck was associated with isolates from the Western Hemisphere (5).
Subsequent studies by Sarovich et al. (10) and Chewapreecha et al. (11) added support to this hypothesis by inferring relatedness from whole-genome SNPs. These studies indicate that Southeast Asia was the source of isolates from Africa, and Africa then became the source for B. pseudomallei in the Western Hemisphere, potentially associated with transfer to the Americas by the slave trade (10,11). Recent work has also shown the utility of SNP analysis in associating isolates of B. pseudomallei from patients with environmental isolates to establish epidemiologic links to better understand sources of infection (12–14).
In this study, we further characterized isolates associated with the Western Hemisphere by analysis of SNPs in whole-genome sequences to infer phylogenetic relatedness. We also sought to determine if this analysis could be used to associate isolates with geographic regions in the Western Hemisphere to improve epidemiologic associations, especially in cases where the source of infection was unclear.
Genomic DNA Extraction and Sequencing
DNA was extracted from isolates by using the Maxwell 16 Cell DNA Purification Kit with the Maxwell 16MDx automated nucleic acid purification system (Promega, Madison, WI, USA) per the manufacturer’s instructions. Final elution was performed by using PCR-grade water and RNase A. Samples were filtered by using a 0.1-μm filter. Sequencing was performed on an RSII apparatus (Pacific Biosciences, Menlo Park, CA, USA) for most isolates. DNA was sheared to 20 kb by passing it through a needle and used to generate 20-kb SMRTbell libraries by using a standard Template Preparation Kit (Pacific Biosciences). Libraries were further selected by size by using a BluePippin instrument (Sage Science, Beverly, MA, USA). The libraries were then sequenced by using the RSII apparatus (Pacific Biosciences), P6 polymerase, and C4 chemistry for 360-min movies. Library size ranged from 4 kb to 10 kb.
Sequence for MD2013 was determined from paired-end reads, which were generated by using an MiSeq apparatus (Illumina, San Diego, CA, USA). Genomic DNA was sheared to a mean size of 600 bp by using an LE220 focused ultrasonicator (Covaris Inc., Woburn, MA, USA). DNA fragments were cleaned by using Ampure (Beckman Coulter Inc., Indianapolis, IN, USA) and used to prepare single-indexed sequencing libraries by using the PrepX ILM DNA Library Preparation Kit (Wafergen Biosystems, Fremont, CA, USA), Truseq barcoding indices, and PCR primer cocktail (Illumina). Libraries were analyzed for size and concentration, pooled, and denatured for loading onto flow cells for cluster generation. Sequencing was performed by using Miseq 2 × 250-bp cycle paired-end sequencing kits (Illumina). After completion, sequence reads were filtered for read quality, base called, and demultiplexed by using Casava version 1.8.2 software (Illumina).
De novo assembly was performed by using the Hierarchical Genome Assembly Process version 3 software (Pacific Biosciences) (15). Resulting consensus sequences were checked and edited for circularity either automatically by using Circlator (16) or manually by using Gepard (17). Illumina reads were cleaned by using BBDUK version 36.20 software (Joint Genome Institute, Walnut Creek, CA, USA) to remove PhiX (NC_001422.1). We used Trimmomatic version 0.36 software to clip off adapters and perform a 20-bp sliding window quality trim to a Phred 30 average (18). Cleaned reads were then assembled in SPAdes version 3.9.0 software by using the only-assembler option and retaining contigs >500 bp (19). Sequences were deposited into GenBank (accession numbers are in Table 1).
We performed MLST as described by using traditional Sanger methods or genome assembly to query the B. pseudomallei MLST database (http://pubmlst.org/bpseudomallei/) (5,6,20–23). We used kSNP version 3.021 software to analyze SNPs for strains sequenced for this study (Table 1) and a reference panel of genomes (Table 2) (24). The maximum-parsimony tree from core SNPs was used to identify clades by using FigTree version 1.4.0 (http://tree.bio.ed.ac.uk/software/figtree/). Select isolates and clades were investigated for where SNPs occurred by manual interrogation of the kSNP version 3.021 variant call format output file and SnpEff version 4.3 (25).
We detected the Yersinia-like fimbrial (YLF) gene by using CLC Genomics Workbench version 8.0.3 software, YP_110141.1, and blastn for each genome (CLCbio, Aarhus, Denmark) (26). ITS typing involved 3 sequence types: type C (FJ981718.1), type G (FJ981723.1), and type E (FJ981706.1) These types were queried against each target genome by using blastn version 2.2.31+ software (https://www.ncbi.nlm.nih.gov/news/06-16-2015-blast-plus-update/). Matches with 100% sequence alignment and >99% nt identity were assigned ITS types, and these blastn alignments were manually visualized/verified by using CLC Genomics Workbench version 8.0.3 software (8,9,26).
A dendrogram based on maximum-parsimony from analysis of core SNPs showed a distinct clade for genomes with an origin in the Western Hemisphere that branches off a node in common with genomes from isolates associated with Africa (Figure). BLAST analysis yielded ITS type G for isolates with a clear origin in the Western Hemisphere (Table 1). We also observed presence of the YLF gene in all isolates in our panel.
Within the Western Hemisphere clade, distinct subclades appear to be associated with geographic origin (Figure). The largest subclade is the Puerto Rico/Trinidad clade, which consists of 5 genomes (PR1982, PR1998, PR2012, PR2013a, and PR2013b) from clinical cases and environmental isolates from Puerto Rico and 1 clinical case in Florida, USA (FL2012), from a resident of Trinidad (2,20,27). PR2013a and PR2013b, which were isolated from the same soil sample, also had no common SNPs detected but had 9 SNPs when compared with PR2012.
The second largest subclade consists of 5 genomes (CA2009, IL2014, MX2013, MX2014, and TX2015), which are all associated with clinical cases of melioidosis in patients who resided in or visited Mexico before seeking treatment in the United States (2,27,28). The patient from whom isolate MX2013 was obtained also had prior military service in Vietnam.
A subclade containing 4 genomes was resolved and consisted of GA2015, RI2013a, RI2013b, and TX2004. GA2015 was obtained from a patient who sought treatment in Georgia, USA, and resided in Panama but who had also visited Peru. The 2 isolates RI2013a and RI2013b, which have different colony morphologies, were obtained from a patient who traveled from Rhode Island, USA, to Guatemala. No common SNPs were detected for RI2013a and RI2013b. TX2004 was obtained from a resident of Texas, USA, who had spent time in Southeast Asia during World War II as a prisoner of war (29).
Genomes (NY2010 and 724644) from 2 cases of melioidosis in persons from the United States who traveled to Aruba clustered together and had 22 SNP differences between each other (2,27,30). Two isolates from pet iguanas (Iguana iguana) in California, USA (CA2007 and CA2013a) also clustered together (31). Three SNPs differentiate CA2007 from CA2013a.
Other genomes from isolates investigated in the United States from patients with no clear history of travel to regions to which melioidosis is endemic did not cluster within the Western Hemisphere clade. CA2010 groups with isolate PB08298010 in a subcluster among reference genomes from Southeast Asia and has ITS type C. OH2013 has ITS type CE and clusters most closely to K96243, which groups with other isolates from Thailand and Southeast Asia (32). Isolate MD2013, which was obtained from a US military veteran who served in Asia but most recently resided in Africa, has a genome that groups between isolates from Burkina Faso and Madagascar and has ITS type C.
Previous methods have been used to associate genetic features of B. pseudomallei with geographic origin. For example, it has been observed that the YLF gene cluster is predominant in strains from outside Australia, and our results are consistent with that association (5,9). We also observed that ITS type G predominates in our panel of isolates from the Western Hemisphere, which is consistent with results from our previous study (5). MLST has been used to associate isolates with geographic origin (6). However, MLST alone does not appear to be sufficient to provide geographic correlation when investigating an isolate of unknown provenance and can be confounded by homoplasy (4,5,7).
Analysis of SNPs in whole-genome sequences appears quite promising as a way to correlate isolates with geographic origin (4,10,11). The ability to link a strain with a geographic origin has been shown to be increasingly useful, especially in areas such as the US mainland, which so far does not appear to be endemic for melioidosis (2). This ability is particularly useful in cases in which there is no clear travel history to regions to which B. pseudomallei is endemic, or which may have resulted from exposure to contaminated products, such as medical supplies, from those areas (3). Our study supports previous observations by Sarovich et al. (10) and Chewapreecha et al. (11) that, on the basis of SNP analysis, isolates from the Americas belong to a distinct clade that branches off from strains from Africa. The larger size of our panel of isolates from the Western Hemisphere enabled a finer-scale association of some isolates on the basis of geographic origins and provided insights on isolates recovered from cases with no clear sources of exposures.
Isolates from Puerto Rico and the isolate from the patient who resided in Trinidad resulted in a clear association, which is surprising because Trinidad is much closer to Venezuela than to Puerto Rico. Isolates PR2013a and PR2013b were obtained from the same soil sample and do not have common SNPs. The soil sample from which PR2013a and PR2013b was isolated was <250 m from the residence of the patient who was infected with isolate PR2012. Not only do the soil isolates cluster with PR2012, but there were only 9 SNPs differences between them and PR2012, which supports the supposition that the area close to the home of the patient was the source of infection. This supposition is analogous to the observation in a case in Australia where only 3 SNPs separated a clinical isolate from an environmental isolate obtained near the home of the patient (12).
The association among isolates from patients with links to Mexico was also clear on the basis of clustering in the phylogenetic tree. However, each isolate is relatively distant from each other (e.g., 3606 SNPs distinguish MX2014 from IL2012), which indicates that more genetic diversity has been captured for isolates with origins in Mexico than for those with origins in Puerto Rico.
The subclade consisting of GA2015, RI2013a, RI2013b, and TX2004 appears to be related to the subclade from Mexico. Also, no common SNPs were detected for RI2013a and RI2013b despite difference in colony morphotypes. This finding is consistent with a recent observation in another strain in which no common SNPs were detected among genomes of a variety of colony morphotypes, which indicated that the differences were not caused by mutations, but possibly by differential gene expression or another mechanism (33).
TX2004 is an isolate obtained from a patient who has been reported as having the longest incubation period for melioidosis. This patient was a US military veteran who was captured during World War II by the Japanese and interned in various camps in Southeast Asia as a prisoner of war; he did not have a travel history to other regions to which B. pseudomallei is endemic. His case of melioidosis was attributed to exposure to B. pseudomallei 62 years before development of symptoms (29). However, we found that TX2004 belongs to the Western Hemisphere clade and groups with genomes from isolates from melioidosis patients who had travel histories to Guatemala, Panama, and Peru. This finding, and the fact that TX2004 is ITS type G, suggests that TX2004 might not have been acquired by the patient in the Pacific theater during World War II.
For the 2 isolates obtained from travelers to Aruba, we found no epidemiologic link between the 2 cases (NY2010 and 724644) other than travel to Aruba for vacation. This finding supports the hypothesis that Aruba was the source of exposure and highlights the usefulness of well documented travel records for epidemiologic associations.
We observed clustering of genomes from isolates obtained from 2 green iguanas. The first case (CA2007) was reported in 2007 in northern California, and the iguana died ≈2.5 years later. The second case (CA2013a) was reported in a 1.6-year-old iguana in southern California. The iguanas were purchased from different pet stores and the cases were separated in time. Thus, there was no clear epidemiologic link between the cases. No background information was available on the origin of the iguanas or the pet store suppliers. Both iguanas were not transported after purchase and were confined to the homes of the owners.
Earlier MLST data indicated that both of these isolates had ST518, which had also been seen in isolate PB1007001, obtained from a tourist from Arizona, USA, in whom melioidosis developed after this person was infected in Costa Rica. This finding suggested that the iguanas were infected in Central America before importation and is consistent with the known range of iguanas (Florida to southern Brazil) and the observation that most iguanas imported into the United States come from Central America (31,34,35). The genomes from the 2 isolates are highly related (only 3 common SNPs) and suggests that the iguanas originated in the same region. The same breeding facility could also be involved, assuming that the iguanas were farm raised.
The association of clades with geographic origin promises to be a useful tool when investigating cases of melioidosis in patients without a clear exposure history, such as travel to multiple countries to which this disease is endemic or for patients with no known travel history. Isolate MD2013 was obtained from a retired person from the United States who resided in Ghana and traveled around Africa. He also had previous military service in Asia, although not in regions with a strong history of melioidosis. He sought treatment in the District of Columbia and Maryland, USA, area after symptoms developed. MD2013 has a genome that groups between isolates from Burkina Faso and Madagascar and supports infection with the bacteria in Africa. The grouping of MX2013 with other genomes from Mexico indicates infection occurred in Mexico instead of Vietnam. An isolate from a case in a zoological warehouse worker without a travel history to areas to which melioidosis is endemic has a genome (CA2010) that does not belong within the clade for isolates from the Western Hemisphere. This isolate belongs within a subclade that contains an isolate (PB08298010) from a resident of Arizona who also did not have a clear travel history (36). We previously reported that PB08298010 was ITS type G on the basis of an earlier draft genome sequence, but a more recent analysis of a higher-quality whole-genome sequence indicates it has ITS type CE (A. Tuanyok, 2015, pers. comm.) (5). The ITS type CE and the location of this subclade support an origin outside the Western Hemisphere, more likely in Asia, which supports the earlier proposal that PB08298010 might have resulted from exposure to contaminated medical supplies from Southeast Asia (10,36). A similar situation might also be true for a fatal case in a patient in Ohio, USA, who had no travel history outside the continental United States (32). Isolate OH2013 groups with isolates from Thailand and is clearly not associated with those in the Western Hemisphere.
The genome sequences analyzed in this study will assist future epidemiologic investigations of melioidosis, especially in cases potentially associated with exposure in the Western Hemisphere. We believe that, as more genome sequences become available, the robustness and resolution of this analysis will improve and thus enhance the ability to associate isolates of B. pseudomallei with geographic origin.
Dr. Gee is a research biologist in the Bacterial Special Pathogens Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA. His primary research interest is molecular epidemiology of B. pseudomallei, B. mallei, and other emerging bacterial pathogens.
This study was supported by the Advanced Molecular Detection Initiative at the Centers for Disease Control and Prevention (https://www.cdc.gov/amd/).
- Wiersinga WJ, Currie BJ, Peacock SJ. Melioidosis. N Engl J Med. 2012;367:1035–44.
- Benoit TJ, Blaney DD, Doker TJ, Gee JE, Elrod MG, Rolim DB, et al. A review of melioidosis cases in the Americas. Am J Trop Med Hyg. 2015;93:1134–9.
- Currie BJ. Melioidosis: evolving concepts in epidemiology, pathogenesis, and treatment. Semin Respir Crit Care Med. 2015;36:111–25.
- Pearson T, Giffard P, Beckstrom-Sternberg S, Auerbach R, Hornstra H, Tuanyok A, et al. Phylogeographic reconstruction of a bacterial species with high levels of lateral gene transfer. BMC Biol. 2009;7:78.
- Gee JE, Allender CJ, Tuanyok A, Elrod MG, Hoffmaster AR. Burkholderia pseudomallei type G in Western Hemisphere. Emerg Infect Dis. 2014;20:682–4.
- Godoy D, Randle G, Simpson AJ, Aanensen DM, Pitt TL, Kinoshita R, et al. Multilocus sequence typing and evolutionary relationships among the causative agents of melioidosis and glanders, Burkholderia pseudomallei and Burkholderia mallei. J Clin Microbiol. 2003;41:2068–79.
- De Smet B, Sarovich DS, Price EP, Mayo M, Theobald V, Kham C, et al. Whole-genome sequencing confirms that Burkholderia pseudomallei multilocus sequence types common to both Cambodia and Australia are due to homoplasy. J Clin Microbiol. 2015;53:323–6.
- Liguori AP, Warrington SD, Ginther JL, Pearson T, Bowers J, Glass MB, et al. Diversity of 16S-23S rDNA internal transcribed spacer (ITS) reveals phylogenetic relationships in Burkholderia pseudomallei and its near-neighbors. PLoS One. 2011;6:e29323.
- Tuanyok A, Auerbach RK, Brettin TS, Bruce DC, Munk AC, Detter JC, et al. A horizontal gene transfer event defines two distinct groups within Burkholderia pseudomallei that have dissimilar geographic distributions. J Bacteriol. 2007;189:9044–9.
- Sarovich DS, Garin B, De Smet B, Kaestli M, Mayo M, Vandamme P, et al. Phylogenetic analysis reveals an Asian origin for African Burkholderia pseudomallei and further supports melioidosis endemicity in Africa. mSphere. 2016;1:pii: 00089.
- Chewapreecha C, Holden MT, Vehkala M, Välimäki N, Yang Z, Harris SR, et al. Global and regional dissemination and evolution of Burkholderia pseudomallei. Nat Microbiol. 2017;2:16263.
- Currie BJ, Price EP, Mayo M, Kaestli M, Theobald V, Harrington I, et al. Use of whole-genome sequencing to link Burkholderia pseudomallei from air sampling to mediastinal melioidosis, Australia. Emerg Infect Dis. 2015;21:2052–4.
- Price EP, Sarovich DS, Smith EJ, MacHunter B, Harrington G, Theobald V, et al. Unprecedented melioidosis cases in northern Australia caused by an Asian Burkholderia pseudomallei strain identified by using large-scale comparative genomics. Appl Environ Microbiol. 2015;82:954–63.
- McRobb E, Sarovich DS, Price EP, Kaestli M, Mayo M, Keim P, et al. Tracing melioidosis back to the source: using whole-genome sequencing to investigate an outbreak originating from a contaminated domestic water supply. J Clin Microbiol. 2015;53:1144–8.
- Chin CS, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat Methods. 2013;10:563–9.
- Hunt M, Silva ND, Otto TD, Parkhill J, Keane JA, Harris SR. Circlator: automated circularization of genome assemblies using long sequencing reads.Genome Biol. 2015;16:294.
- Krumsiek J, Arnold R, Rattei T. Gepard: a rapid and sensitive tool for creating dotplots on genome scale. Bioinformatics. 2007;23:1026–8.
- Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–20.
- Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19:455–77.
- Doker TJ, Sharp TM, Rivera-Garcia B, Perez-Padilla J, Benoit TJ, Ellis EM, et al. Contact investigation of melioidosis cases reveals regional endemicity in Puerto Rico. Clin Infect Dis. 2015;60:243–50.
- Gee JE, Glass MB, Novak RT, Gal D, Mayo MJ, Steigerwalt AG, et al. Recovery of a Burkholderia thailandensis-like isolate from an Australian water source. BMC Microbiol. 2008;8:54.
- Jolley KA, Maiden MC. BIGSdb: Scalable analysis of bacterial genome variation at the population level. BMC Bioinformatics. 2010;11:595.
- Price EP, MacHunter B, Spratt BG, Wagner DM, Currie BJ, Sarovich DS. Improved multilocus sequence typing of Burkholderia pseudomallei and closely related species. J Med Microbiol. 2016;65:992–7.
- Gardner SN, Slezak T, Hall BG. kSNP3.0: SNP detection and phylogenetic analysis of genomes without genome alignment or reference genome.Bioinformatics. 2015;31:2877–8.
- Cingolani P, Platts A, Wang L, Coon M, Nguyen T, Wang L, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012;6:80–92.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10.
- Benoit TJ, Blaney DD, Gee JE, Elrod MG, Hoffmaster AR, Doker TJ, et al.; Centers for Disease Control and Prevention (CDC). Melioidosis cases and selected reports of occupational exposures to Burkholderia pseudomallei—United States, 2008–2013. MMWR Surveill Summ. 2015;64:1–9.
- Cheng JW, Hayden MK, Singh K, Heimler I, Gee JE, Proia L, et al. Burkholderia pseudomallei infection in US traveler returning from Mexico, 2014.Emerg Infect Dis. 2015;21:1884–5.
- Ngauy V, Lemeshev Y, Sadkowski L, Crawford G. Cutaneous melioidosis in a man who was taken as a prisoner of war by the Japanese during World War II. J Clin Microbiol. 2005;43:970–2.
- O’Sullivan BP, Torres B, Conidi G, Smole S, Gauthier C, Stauffer KE, et al. Burkholderia pseudomallei infection in a child with cystic fibrosis: acquisition in the Western Hemisphere. Chest. 2011;140:239–42.
- Zehnder AM, Hawkins MG, Koski MA, Lifland B, Byrne BA, Swanson AA, et al. Burkholderia pseudomallei isolates in 2 pet iguanas, California, USA.Emerg Infect Dis. 2014;20:304–6.
- Doker TJ, Quinn CL, Salehi ED, Sherwood JJ, Benoit TJ, Glass Elrod M, et al.; Melioidosis Investigation Team. Fatal Burkholderia pseudomallei infection initially reported as a Bacillus species, Ohio, 2013. Am J Trop Med Hyg. 2014;91:743–6.
- Vipond J, Kane J, Hatch G, McCorrison J, Nierman WC, Losada L. Sequence determination of Burkholderia pseudomallei strain NCTC 13392 colony morphology variants. Genome Announc. 2013;1:e00925–13.
- Schlaepfer MA, Hoover C, Dodd CK. Challenges in evaluating the impact of the trade in amphibians and reptiles on wild populations. Bioscience. 2005;55:256–64.
- Townsend JH, Krysko KL, Enge KM. Introduced iguanas in southern Florida: a history of more than 35 years. Iguana. 2003;10:111–8.
- Engelthaler DM, Bowers J, Schupp JA, Pearson T, Ginther J, Hornstra HM, et al. Molecular investigations of a locally acquired case of melioidosis in Southern AZ, USA. PLoS Negl Trop Dis. 2011;5:e1347.