Comparative genome analysis of non-toxigenic non-O1 versus toxigenic O1 Vibrio cholerae

Pathogenic strains of Vibrio cholerae are responsible for endemic and pandemic outbreaks of the disease cholera. The complete toxigenic mechanisms underlying virulence in Vibrio strains are poorly understood. The hypothesis of this work was that virulent and non-virulent strains of V. cholerae harbor distinctive genomic elements that encode virulence. The purpose of this study was to elucidate genomic differences between the O1 serotypes and non-O1 V. cholerae PS15, a non-toxigenic strain, in order to identify novel genes potentially responsible for virulence. We compared the whole genome of the non-O1 PS15 strain to the whole genomes of toxigenic serotypes at the phylogenetic level, and found that the PS15 genome was distantly related to those of toxigenic V. cholerae. We therefore focused on a detailed gene comparison between PS15 and the distantly related O1 V. cholerae N16961. Based on sequence alignment, we tentatively assigned chromosome numbers 1 and 2 to elements within the genome of non-O1 V. cholerae PS15. Further, we found that PS15 and O1 V. cholerae N16961 shared 98% identity and 766 genes. Of the genes present in N16961 but missing from the non-O1 V. cholerae PS15 genome, 56 were predicted to encode not only virulence-related functions (colonization, antimicrobial resistance, and regulation of persister cells) but also functions involved in the biosynthesis of lipids, nucleosides and sulfur compounds. Additionally, we found 113 genes unique to PS15 that were predicted to encode other properties related to virulence, disease, defense, membrane transport, and DNA metabolism. Here, we identified distinctive and novel genomic elements that differ between O1 and non-O1 V. cholerae genomes as potential virulence factors and, thus, targets for future therapeutics. Modulation of such novel targets may eventually enhance efforts to eradicate endemic and pandemic cholera in afflicted nations.


Authors' contributions MM PK SK EG JTF MI ARD SRT MB GH IEL AS FDS JM MFV
Research concept and design -- Collection and/or assembly of data

Introduction
Cholera is an infectious disease characterized by profuse watery diarrhea and vomiting in humans, and the causative agent is Vibrio cholerae, a Gram-negative, comma-shaped, facultative anaerobic bacterium [1]. V. cholerae includes both pathogenic and nonpathogenic strains, and the bacteria responsible for pandemic outbreaks secrete the cholera toxin [2]. Since 1817, seven pandemics of cholera have been recorded. Cholera is a major public health concern because the disease can cause significant mortality if left untreated [3,4]. In the past 200 years, cholera has resulted in millions of deaths due to its ability to spread rapidly within populations and to contaminate rivers and estuaries [5]. The most recent pandemic outbreak was recorded in Southeast Asia and quickly spread across the globe as the seventh pandemic [6]. In Haiti, where an outbreak began in 2010, 604,634 cases of cholera were reported, with a death toll of 7,436, in the first two years [7].
The genomes of several pathogenic V. cholerae strains encode proteins that are directly or indirectly responsible for virulence. In many parts of the world, the O serogroups of V. cholerae are associated with diarrhea [8]. The most common mode of transmission for this bacterium is the consumption of feces-contaminated water, fish or crustaceans [9].
In addition to rehydration therapy, the first-line antimicrobial agent used against cholera is doxycycline, prescribed for a period of 1-3 days in order to reduce the severity of the symptoms [10,11]. Other antimicrobials that have been demonstrated to be effective in humans include cotrimoxazole, erythromycin, tetracycline, chloramphenicol, furazolidone and norfloxacin [11,12].
Unfortunately, widespread use and misuse of these and other antimicrobials have resulted in the selection of multidrug-resistant bacterial variants [13], which potentially compromise chemotherapeutic efficacy against cholera [14]. The mechanisms by which bacteria resist antimicrobial agents include (a) biofilm production, (b) drug inactivation, (c) ribosome protection, (d) reduced permeability, (e) target alteration [15] and (f) active efflux [16]. One of the active efflux pumps of V. cholerae is EmrD-3, which belongs to the major facilitator superfamily (MFS) and is a drug/H+ antiporter with 12 transmembrane domains [17]. Another efflux pump encoded in the genome of V. cholerae is VceB [18]. Drug efflux pumps are integral membrane transporters that actively export toxic compounds and antibiotics from the bacterial cell and confer resistance to multiple antibacterial agents [19-21].
The presence of the cholera toxin (CT), the Vibrio pathogenicity island (VPI), and the toxin co-regulated pilus (TCP) within the O1 serogroups of V. cholerae makes these strains more virulent and pandemic than their non-O1 counterparts [22]. A significant basis for their pathogenicity is attributed to the genes encoding the cholera toxin. Other genes important for enhancing virulence in these organisms are ace, psh, PIIICTX, zot and cep, which are implicated in phage morphogenesis [5,23,24]. The Vibrio pathogenicity island-1 (VPI-1) confers toxin release, biofilm formation, and attachment to disease vectors for transmission to humans, and encodes the receptors of CTX. The Vibrio pathogenicity island-2 (VPI-2) helps the cholera toxin gain entry into the intestinal epithelium by unmasking GM1 gangliosides in the lining of the human intestine. The absence of VPI-1 and VPI-2 in non-O1 serogroups of V. cholerae makes them less pathogenic than the O1 serogroups [25].
Even though non-O1 V. cholerae strains carry certain virulence genes, the disease they cause is less severe than that caused by O1/O139 V. cholerae [8]. The non-O1 serogroups of V. cholerae are known as the non-agglutinating Vibrios (NAGs) because they lack the genes coding for CT and TCP [26,27]. The presence of multidrug resistance (MDR) transporters confers resistance to ampicillin, chloramphenicol and tetracycline in non-O1 and non-O139 serogroups of V. cholerae [14]. The ABC transporters present in PS15 V. cholerae are predicted to transport phosphate molecules across the periplasm and may be essential for protein synthesis, amino acid exchange, and transport of fatty acids [28].
We previously determined the genome nucleotide sequence of the non-O1 non-toxigenic V. cholerae PS15 (GenBank Accession No. AIJR00000000) [28]. Here, we compared non-O1 PS15 with the genetic information of virulent strains. The genome of V. cholerae PS15 is composed of 3,910,387 base pairs (bp) organized into 3,512 open reading frames with a G+C content of 47.55% [28]. We chose to focus our comparative analysis of V. cholerae PS15 [29] on V. cholerae El Tor N16961 because the latter genome was completely sequenced [30]. The N16961 genome is made up of 4,033,460 base pairs (bp) distributed across two chromosomes, with a G+C content of 46.9% in chromosome 1 and 47.7% in chromosome 2 [30]. Even though non-O1 V. cholerae possesses some virulence genes responsible for causing gastrointestinal infections, wound infections, septicemia and cellulitis in humans, little is known about the mechanisms that confer virulence in this microorganism. The aim of this work is to identify differences in the genetic elements between the genomes of the virulent N16961 and non-virulent PS15 strains of V. cholerae in order to identify novel virulence mechanisms. Such mechanisms may eventually serve as therapeutic targets, with the ultimate purpose of fostering conditions that reduce dissemination of disease-causing virulent serotypes of V. cholerae through populations.

Comparison of non-O1 PS15 and O1 N16961 Vibrio cholerae genomes using RAST and UniProt
A function-based genome comparison was performed between a non-toxigenic, non-O1 V. cholerae PS15 environmental isolate (courtesy of Dr. Charles Kaysner) from sediment sampled in Puget Sound, WA [28,31] and O1 V. cholerae N16961 [30], using the RAST (Rapid Annotation using Subsystem Technology) database and Seed Viewer to predict protein function [32]. The comparison focused on categories and subsystem groupings pertaining to virulence, disease, defense, membrane transport, DNA metabolism, regulons, dormancy, sporulation, phages, prophages, transposable elements, and plasmids for both the O1 and non-O1 V. cholerae genomes. The open reading frames (genes) encoding functional roles associated with a subsystem are referred to as functioning parts, and a subsystem is defined as a set of predicted abstract functional roles [32]. Predicted proteins encoded by both genomes were screened by BLAST analysis of the amino acid sequences using UniProt [33].

Phylogenetic analysis
The non-O1 V. cholerae PS15 genome sequence [28] (GenBank Accession No. AIJR00000000) was analyzed using BLAST [34] in order to generate phylogenetic trees containing the genomes of closely related organisms and the virulence factors of the O1 serotypes. BLAST pairwise alignment with the neighbor-joining tree method [35] was used to compare the genome of PS15 to other complete Vibrio genome sequences in the database; the result is shown in Figure 1.
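To illustrate the selection step at the core of the neighbor-joining method, the following minimal Python sketch computes the Q-matrix from a pairwise distance matrix and picks the first pair of taxa to join. The five-taxon distance matrix is a hypothetical toy example, not data from this study, and the function names are illustrative assumptions; the tree in Figure 1 was generated with BLAST, not with this code.

```python
from itertools import combinations


def q_matrix(dist):
    """Return the neighbor-joining Q value for every pair (i, j) with i < j.

    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    return {
        (i, j): (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
        for i, j in combinations(range(n), 2)
    }


def first_pair_to_join(dist):
    """Pick the pair with the smallest Q value (the first join of the algorithm)."""
    q = q_matrix(dist)
    return min(q, key=q.get)


# Hypothetical symmetric distances among five taxa (toy values only).
toy_dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

pair = first_pair_to_join(toy_dist)
print("First pair joined (indices):", pair)
print("Q value for that pair:", q_matrix(toy_dist)[pair])
```

A full neighbor-joining run would repeat this step, replacing the joined pair with a new node and recomputing distances until the tree is resolved.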

CGView
The CGView server was used for comparative genome analysis [36]. A graphical circular genome map was constructed using CGView by BLAST analysis of the DNA sequence of V. cholerae non-O1 PS15 (3,910,387 base pairs) with the complete DNA sequence of V. cholerae El Tor N16961 (4,033,460 base pairs) [28,30].
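As a quick arithmetic check on the two genome sizes quoted above, the following Python sketch computes the difference in length between the two sequences. The variable names are illustrative; only the two base-pair totals come from the text.

```python
# Genome sizes in base pairs, as reported in the text [28,30].
PS15_BP = 3_910_387
N16961_BP = 4_033_460

# Absolute difference and the PS15 length as a fraction of the N16961 length.
difference_bp = N16961_BP - PS15_BP
fraction_of_n16961 = PS15_BP / N16961_BP

print(f"Difference: {difference_bp:,} bp (about {difference_bp / 1000:.0f} kbp)")
print(f"PS15 length relative to N16961: {fraction_of_n16961:.1%}")
```

Note that this compares sequence lengths only; it says nothing about where along the genome the differences occur.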

Results
The genome of non-O1 V. cholerae PS15 is distantly related to O1 V. cholerae genomes
We previously determined the whole genome sequence of a non-toxigenic, non-O1 V. cholerae isolate from Puget Sound, strain PS15 [28]. It had been shown that genomes of toxigenic O1 V. cholerae bacteria were highly related [30], possibly implying that non-O1 genomes would be more distantly related. We tested this prediction by comparing non-O1 V. cholerae PS15 with other microorganisms by constructing a phylogenetic tree using BLAST pairwise alignment in order to represent genomes that are most closely related to V. cholerae non-O1 PS15 and to establish relatedness of PS15 to these microorganisms (Figure 1). Although the non-O1 V. cholerae PS15 genome sequence is most closely related to those of V. cholerae LMA 3984-4, O395, O1 strains 2010EL-1786, MJ-1236, O1 biovar El Tor strain N16961, IEC224, and M66-2, the non-O1 V. cholerae PS15 strain is, nonetheless, the most distantly related member within this cluster.

Tentative chromosome assignment in non-toxigenic, non-O1 V. cholerae PS15
Since the two chromosomes of the toxigenic O1 V. cholerae strain N16961 were elucidated [30], we predicted that genomic sequence alignment with the non-toxigenic, non-O1 V. cholerae strain PS15 would suggest a chromosome assignment for this bacterium as well. A circular genome representation was generated using the CGView server to plot the structural genome arrangement, based on BLAST analysis of the non-O1 V. cholerae PS15 genome against that of O1 V. cholerae N16961 using their respective genomic nucleotide sequences in FASTA format (Figure 2). By comparing the genome sequence of V. cholerae N16961 with that of V. cholerae PS15, chromosomes 1 and 2 were tentatively assigned for the non-toxigenic PS15 strain, as shown in Figure 2.

The majority of genes in the O1 N16961 and non-O1 PS15 V. cholerae genomes are shared
We have shown above that although the non-O1 V. cholerae PS15 genome is distantly related to the genomes of toxigenic O1 V. cholerae, the PS15 genome is still closely related to genomes of the Vibrio genus. This implies a striking similarity between the non-O1 and O1 genomes, specifically regarding the commonalities within the gene space. To test this, we used RAST Seed Viewer and UniProt to compare the genome sequences of O1 V. cholerae N16961 and non-O1 V. cholerae PS15, the general features of which are shown in Table 1. The O1 and non-O1 V. cholerae genomes shared 766 genes (open reading frames) that are predicted to code for proteins within functional categories pertaining to virulence, disease, defense, membrane transport, phages, prophages, transposable elements, plasmids, DNA metabolism, dormancy, sporulation and regulons. Interestingly, when compared to the N16961 genome, the V. cholerae PS15 genome appears to be truncated sporadically throughout by approximately 120 kbp (Table 1 and Figure 2). In Table 2 we list 58 of the 766 genes that share 98% identity between both genomes. The remaining genes are listed in Supplement Table S1. Even though non-O1 V. cholerae PS15 is believed to be nonpathogenic compared to the known virulent O1 V. cholerae N16961 strain, their genomes shared 90 genes that code for functions pertaining to virulence, disease and defense. These genes included accessory colonization factor (acfD), TCP pilus virulence regulatory protein (tcpN), toxin coregulated pilus biosynthesis protein E (tcpE), TCP pilus virulence regulatory protein (toxT) and accessory colonization factor (acfC).
In addition to these virulence-associated genes, both genomes shared 287 genes encoding functional properties in the DNA metabolism category, 8 genes encoding proteins for dormancy and sporulation, 366 genes encoding membrane transporters, 12 genes in the categories of phages, prophages, transposable elements and plasmids, and 3 genes pertaining to regulons. Among these shared genomic elements encoding membrane transporters are genes known to express multidrug resistance efflux pumps, including AcrA of the RND superfamily [37], SugE of the SMR superfamily [38], and NorM of the MATE superfamily [39].
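The per-category counts reported above can be checked against the stated total of 766 shared genes. The short Python sketch below performs that tally; the dictionary name and category labels are illustrative, while the counts are taken directly from the text.

```python
# Shared genes per functional category, as reported in the text.
shared_by_category = {
    "virulence, disease and defense": 90,
    "DNA metabolism": 287,
    "dormancy and sporulation": 8,
    "membrane transport": 366,
    "phages, prophages, transposable elements and plasmids": 12,
    "regulons": 3,
}

total_shared = sum(shared_by_category.values())

# List categories from largest to smallest with their share of the total.
for category, count in sorted(
    shared_by_category.items(), key=lambda item: item[1], reverse=True
):
    print(f"{category}: {count} ({count / total_shared:.1%})")

print("Total shared genes:", total_shared)
```

The category counts sum to the 766 shared genes stated in the text, with membrane transport and DNA metabolism together accounting for most of them.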

Genes present in O1 V. cholerae N16961 genome and absent in the non-O1 PS15 genome
The pathogenicity of the O1 V. cholerae serotypes suggests that they harbor genomic elements that confer virulence. For instance, the cholera toxin of toxigenic V. cholerae strains is the primary virulence factor in endemic and pandemic cholera cases [40]. Thus, in order to establish the association between the presence of virulence-encoding genomic elements and pathogenicity, we compared the functional determinants of the PS15 and N16961 genomes. Our analysis revealed that, of the 619 genes present in O1 V. cholerae N16961 but absent from the non-O1 V. cholerae PS15 genome [29], 56 fall into the categories of virulence, disease and defense, membrane transport, DNA metabolism, dormancy and sporulation (Table 3). The virulence genes that were present in O1 serotypes but largely absent in the non-O1 strains, including the PS15 strain, include the accessory cholera enterotoxin (ace), the cholera enterotoxin subunit B (ctxB), the cholera enterotoxin subunit A (ctxA), and the zona occludens toxin (zot). Comparison of the predicted proteins encoded by both the PS15 and N16961 genomes using UniProt revealed the absence of other virulence genes in PS15, including genes predicted to encode accessory colonization factors A and B (acfA and acfB), and the genes encoding the VceA and VceB proteins shown to confer resistance to antimicrobial agents (Table 3) [41]. Notably, the gene encoding EmrD-3, a drug efflux pump of the MFS demonstrated to confer multidrug resistance, is present in N16961 but absent from the non-O1 V. cholerae PS15 genome [17,21].
A phylogenetic tree, which was generated by BLAST for bacterial genomes that share the cholera toxin, indicated the absence of the cholera toxin gene in the non-O1 V. cholerae PS15 bacterium (Figure 3). The most closely-related microorganisms that shared the DNA encoding the cholera toxin include V. cholerae IEC224, O1 biovar El Tor strain N16961, O395, MJ-1236 and the O1 strain 2010EL-1786.
Other genes that were absent in the non-O1 V. cholerae genome but present in O1 include genes that encode glycerolipid and glycerophospholipid metabolism, and genes that code for VPI [25] (Table 3). Additional genes that are absent in non-O1 V. cholerae PS15 include those of the Rst operon, which encode the phage-related replication protein (RstA), phage-related integrase (RstB), phage-related antirepressor (RstC), and phage-related transcriptional repressor (RstR) [24], and genes for sulfur metabolism. Other genes that are found in O1 V. cholerae but absent in non-O1 include those coding for TsaE, a protein required for the synthesis of threonylcarbamoyladenosine in the presence of tRNA [42].

Genes present in the non-O1 V. cholerae PS15 genome and absent in the O1 N16961 genome
Because the non-O1 V. cholerae PS15 environmental isolate is considered to be nontoxigenic [31,43], genes unique to this microorganism, compared to the toxigenic N16961 bacterium, possibly encode non-virulent functions. To test this hypothesis, we performed a function-based genome comparison using RAST and UniProt for PS15 and N16961. This comparative analysis revealed that 113 genes were present within the PS15 genome but absent from N16961 (Table 4). The three characterized genes that are present in PS15, but absent in N16961, include the oligopeptide ABC transporter called periplasmic oligopeptide-binding protein (OppA) [44], a protein-export membrane protein (SecF) [45], and the UvrABC system protein A (uvrA) [46], all of which belong to the membrane transport category. The remaining genes, annotated as uncharacterized hypothetical proteins as per UniProt, are surprisingly predicted to code for proteins involved in functions related to virulence, pathogenesis, defense, solute transport, and DNA metabolism (Table 4).
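The presence/absence comparisons described in the two preceding sections amount to set operations on the gene content of each genome. The Python sketch below illustrates the idea with a small subset of gene names mentioned in the text. The lists are deliberately incomplete, the lowercase gene symbols derived from protein names (for example vceA, emrD-3, oppA, secF) are assumptions made for illustration, and this is not the RAST/UniProt workflow used in the study.

```python
# Illustrative subsets only: genes named in the text as shared by both genomes.
shared_examples = {"acfD", "tcpN", "tcpE", "toxT", "acfC", "acrA", "sugE", "norM"}

# Genes named in the text as present in N16961 but absent from PS15.
n16961_genes = shared_examples | {
    "ctxA", "ctxB", "ace", "zot", "acfA", "acfB", "vceA", "vceB", "emrD-3",
}

# Genes named in the text as present in PS15 but absent from N16961.
ps15_genes = shared_examples | {"oppA", "secF", "uvrA"}

# Set intersection and differences give the three comparison groups.
shared = n16961_genes & ps15_genes
only_n16961 = n16961_genes - ps15_genes
only_ps15 = ps15_genes - n16961_genes

print("Shared:", sorted(shared))
print("Only in N16961:", sorted(only_n16961))
print("Only in PS15:", sorted(only_ps15))
```

In a real analysis the sets would be built from annotated open reading frames matched by sequence similarity rather than by name, since identical names do not guarantee orthology.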

Conclusions
Upon comparison of the genome of non-O1 V. cholerae PS15, a non-toxigenic strain, to that of O1 V. cholerae N16961, a toxigenic strain, we found that of the 619 missing genes, 56 encode functions in dormancy, sporulation, ribosome modulation in persister cells, lipid metabolism, phage infection, nucleoside metabolism, and sulfur metabolism, which in turn is essential for biosynthesis of amino acids, vitamins and prosthetic groups [43]. As non-O1 V. cholerae lacks genes coding for metabolism of sulfur, the non-O1 serotype is predicted to be unable to convert naturally available sulfur to sulfide, which could then be incorporated into various sulfur-containing metabolites. Sulfur is critical for the biosynthesis of many important compounds such as amino acids (cysteine and methionine), vitamins (biotin, thiamin), and prosthetic groups (Fe-S clusters) [43]. These genetic elements and their putative gene products represent novel and promising targets for modulation of gene expression or activity and for therapeutic efforts [47], in order to effectively reduce conditions that foster virulence and dissemination of V. cholerae pathogens through populations. These determinants therefore warrant further study in order to elucidate the complete molecular mechanisms of pathogenesis in cholera infections.
Not surprisingly, also among the 56 missing genes in the non-O1 PS15 genome are those that are known to confer virulence, such as the cholera toxin [40], colonization factors [48], and antimicrobial resistance mechanisms [16]. We thus confirm that the genes encoding the cholera toxin are absent from the genome of the non-toxigenic V. cholerae PS15. We confirm, however, the presence of other genes predicted to encode distinct toxins and colonization factors, as previously shown for the non-O1 V. cholerae strain NRT36S [49]. This latter study and our findings here are consistent with previous work demonstrating that aquatic environments are reservoirs for O1 and non-O1 V. cholerae [50], predicting that such environments allow genetic exchange between unrelated strains. In order to gain valuable insights into enhancing chemotherapeutic efficacy against cholera, it is imperative to study and gain understanding into the modes of action of the toxicity-inducing factors combined with other antibacterial resistance factors in toxigenic V. cholerae [51].
Interestingly, we found that the genome of the nontoxigenic V. cholerae PS15 strain harbors genes absent from the genome of its toxigenic counterpart, N16961. Such determinants mainly include still uncharacterized genetic elements that are predicted to encode proteins involved in virulence, disease, defense, membrane solute transport and DNA metabolism, suggesting that PS15 may be pathogenic to organisms other than humans, perhaps in environments such as estuary waters [52,53]. The genetic determinants unique to PS15 that have been experimentally characterized include OppA, an oligopeptide primary active transporter [44], and SecF, a protein exporter [12]. We propose that these unique genetic elements represent good targets for future development of new therapies against V. cholerae infections in animals other than humans.
The genome of non-O1 V. cholerae PS15 shares >97% identity with El Tor O1 biovar V. cholerae strain N16961, as per BLAST analysis at the nucleotide level. Based on the alignment of the non-O1 PS15 genome with that of O1 N16961, chromosomes 1 and 2 were assigned to the PS15 genome (Figure 2). This tentative chromosome assignment will require confirmation with additional experimental work. Even though the genomes of both strains are highly similar to each other, the non-O1 PS15 microorganism is considered to be non-pathogenic, compared to the O1 N16961 strain, possibly due to the absence of the cholera toxin in PS15, which is responsible for endemic and pandemic diseases [54]. More recent genomic analysis, however, has demonstrated that other genetic elements are also critical for conferring pathogenesis such as genes coding for housekeeping, homeostasis, metabolism, energy generation, and antimicrobial resistance-type functions [55]. Our phylogenetic and genome comparison analyses between the toxigenic and non-toxigenic V. cholerae microorganisms support both of these contentions. Further work with additional variants, such as atypical El Tor [56], NRT36S [49], and CT-producing non-O1 strains [57], will be necessary to definitively gain a complete picture of the relationships between pathogenic versus non-pathogenic V. cholerae.
Remarkably, we found that both the toxigenic and non-toxigenic V. cholerae strains harbor a variety of genes that have previously been demonstrated to confer multidrug resistance via active drug efflux pump systems, such as AcrAB, NorM/VcmA, SugE, and VcaM [58]. All six RND transporters in V. cholerae N16961 have been studied physiologically [59], and our data showed that V. cholerae PS15 was missing only one of these pumps, called VexA. Additionally, we found a shared but uncharacterized genetic element, VC_A0083 in the toxigenic strain and OSU_1537 in the non-toxigenic strain, tentatively called multidrug resistance protein D and predicted to encode an MFS drug efflux pump. These multidrug resistance mechanisms may be important because of their potential selection and maintenance in environments containing antimicrobial agents, their genetic mobility to other microorganisms, and dissemination within populations [60-64].
We conclude that the study and comparison of the genomic sequences between pathogens and their non-virulent counterparts will help discover genes encoding both the classical virulence factors and those encoding novel virulence factors. Future work will focus on the study of solute transport and antibacterial resistance mechanisms of V. cholerae pathogenic strains and on the identification of novel housekeeping genes which may be equally significant in contributing towards the microorganisms' pathogenicity [17,65,66].

Supplementary Material
Refer to Web version on PubMed Central for supplementary material.

Included in this table are genetic elements that are absent in the non-O1 genome but present in the O1 genome, which have putative functions in virulence, disease and defense, membrane transport, DNA metabolism and dormancy and sporulation. In the table, the first column includes gene descriptions as per UniProt. The second and fourth columns represent abbreviated gene identification; the third and fifth columns represent accession numbers for the listed genes.
* denotes proteins that have functions in virulence, disease and defense.
† denotes proteins that are putative membrane transporters.
§ and ∥ denote proteins that have putative functions in the DNA metabolism and dormancy/sporulation categories, respectively.