Unfortunately, these programs require extensive computational time Ogilvie et al. Concatenation methods, which assume that all recovered gene trees share a common evolutionary history DeGiorgio and Degnan; Larget et al. This framework has the benefit of not requiring the identification of orthologs, and it allows the incorporation of multiple individuals from the same species. When estimating evolutionary relationships among microbes using long DNA sequences, recombination becomes a significant issue. If recombination is substantial, the evolutionary history of those sequences is no longer captured by a bifurcating model, and a tree representation may therefore fail to portray the genealogy accurately Schierup and Hein, a.
Under such circumstances, two strategies can be considered. The substitution process along the genome can be highly heterogeneous, with different genomic regions best fit by different substitution models of evolution Arbiza et al.
Since the selection of the best-fitting substitution model is crucial for accurate phylogenetic inference Lemmon and Moriarty, phylogenetic programs e. The methods described above for analyzing population dynamics data based on short nucleotide sequences could be extended to analyze GWS data. However, genomes can experience unique evolutionary events such as duplications, insertions, deletions, inversions, translocations, or gene-gene interactions.
These events are difficult to model, leading to inefficient or intractable ML functions Marjoram et al. To deal with complex evolutionary scenarios, ABC can serve as an alternative Arenas, a. Fortunately, some frameworks see Fig. Unfortunately, there is not yet an ABC framework available to analyze GWS data, but it is possible to design an ABC method by combining simulations of GWS data, estimation of summary statistics and rejection or multiple regression approaches e.
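The combination of simulation, summary statistics, and rejection mentioned above can be illustrated with a minimal sketch. It is not an ABC framework for GWS data: the simulator, the uniform prior, the tolerance, and the single summary statistic (number of variable sites) are all hypothetical choices made only to show the logic of a rejection step.

```python
import random
import statistics


def simulate_variable_sites(theta, n_sites, rng):
    """Toy simulator (an assumption): each site is variable with probability theta."""
    return sum(1 for _ in range(n_sites) if rng.random() < theta)


def abc_rejection(observed_stat, n_draws, tolerance, n_sites, rng):
    """Keep prior draws whose simulated summary statistic is close to the observed one."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, 0.2)  # hypothetical uniform prior on theta
        simulated_stat = simulate_variable_sites(theta, n_sites, rng)
        if abs(simulated_stat - observed_stat) <= tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(1)  # fixed seed so the sketch is reproducible
posterior_sample = abc_rejection(
    observed_stat=50, n_draws=2000, tolerance=5, n_sites=1000, rng=rng
)
posterior_mean = statistics.mean(posterior_sample)
print(len(posterior_sample), round(posterior_mean, 3))
```

The accepted draws approximate the posterior distribution of the parameter. A regression adjustment of the accepted values, as mentioned in the text, would be an additional step not shown here.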
Most contemporary studies use more than one approach to typing, epidemiology, and phylogenetic inference, aiming to maximize both compatibility with current and past data and genetic resolution down to the strain level e. Moreover, given the current antibiotic crisis highlighted in recent reports by the World Health Organization, researchers are also turning to in silico MLST schemes based on whole-genome sequences to assign new sequence types to clinically important isolates, while appreciating the value of genome sequences for typing.
Additionally, with constantly decreasing sequencing costs, genome-scale microbial typing studies are becoming more affordable.
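As a rough illustration of what in silico MLST involves, the sketch below looks up known alleles in an assembled genome and maps the resulting allelic profile to a sequence type. The locus names, allele sequences, and sequence type table are invented for the example, and exact substring matching is a simplifying assumption; real tools typically rely on alignment, search both strands, and handle partial or novel alleles.

```python
def call_alleles(genome, allele_db):
    """Return {locus: allele_id or None} using exact substring matching."""
    profile = {}
    for locus, alleles in allele_db.items():
        profile[locus] = next(
            (allele_id for allele_id, seq in alleles.items() if seq in genome),
            None,
        )
    return profile


def assign_sequence_type(profile, st_table, loci):
    """Map an allelic profile to a sequence type; None means an unknown profile."""
    return st_table.get(tuple(profile[locus] for locus in loci))


# Hypothetical two-locus scheme, for illustration only.
LOCI = ("locusA", "locusB")
ALLELE_DB = {
    "locusA": {1: "ACGTAC", 2: "ACGTTC"},
    "locusB": {1: "GGCATT", 2: "GGCTTT"},
}
ST_TABLE = {(1, 1): "ST-1", (2, 1): "ST-2"}

genome = "TTTACGTTCAAAGGCATTCCC"
profile = call_alleles(genome, ALLELE_DB)
sequence_type = assign_sequence_type(profile, ST_TABLE, LOCI)
print(profile, sequence_type)
```

A profile that is absent from the table returns `None`, which corresponds to the situation in which a new sequence type would have to be defined.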
The analysis of WGS data tends to yield high statistical confidence (P value). However, as indicated above, a growing number of reports show highly significant P values for contrasting phylogenetic hypotheses depending on the evolutionary model and inference method used. Additionally, genomes can experience unique evolutionary events e. Therefore, when applying WGS-based typing approaches, emphasizing effect size and biological relevance, rather than the P value, may help to alleviate systematic error Kumar et al.
Similarly, using ABC methods instead of ML functions to estimate population parameters may also help to accommodate complex evolutionary scenarios.
Since the use of genome-wide sequence data is a trend that will likely continue, we emphasize again that applying standard methods for phylogenetic and population dynamics analysis to WGS data is potentially problematic, given the intrinsic limitations of these gene-based approaches. In the following sections, we present recent examples of the use of molecular typing for both epidemiology and phylogenetic inference.
Other examples can be found in previous studies cited in the Introduction of this review. Tracking and identifying the sources of opportunistic pathogens in hospital settings has received increasing attention in the last few years. One such opportunistic pathogen is Klebsiella pneumoniae, a human commensal whose hypervirulent and multidrug-resistant members have emerged worldwide.
Recently, Yang et al. This level of resolution proved sufficient for following the transmission chain back to its source (an endoscopic device); this was only possible because of WGS, with the added benefit of detecting the blaCTX-M gene involved in carbapenem resistance in 27 out of 32 isolates. Other studies compared the power of electrophoresis-based methods with that of WGS, showing that isolates previously considered clonal are distinguishable through WGS Salipante et al.
Similarly, Mathers et al. The authors highlighted the practicality of linking MLST types with antimicrobial resistance determinants and the power of a whole-genome SNP dataset for increased phylogenetic resolution. The combined use of WGS and MLST can provide valuable information regarding origin, clinical phenotype, and potential treatment of nosocomial infectious disease.
Although K. Davis et al. By combining traditional MLST and WGS, they observed that meat-source isolates were more likely to be multidrug resistant than clinical isolates, even though isolates from both sources shared MLST profiles and were phylogenetically intermingled. Their results suggested potential food-borne transmission routes that carry the risk of spreading multidrug resistance into the general population. For a review on K. The spread of multidrug resistance (MDR) has also been studied for old foes, including the causative agent of typhoid fever.
Wong et al.
Based on a dataset with more than 20 thousand SNPs, the authors showed that multiple transfers from Asia to Africa have occurred and are still occurring, and that MDR isolates are replacing drug-sensitive isolates. Interestingly, another study by the same group did not find this clade in Nigeria, where multiple introductions could better explain the Salmonella genotypes present Wong et al. Overall, these and other similar studies highlight the need for unbiased sampling in molecular epidemiology, because studies often select isolates for sequencing and typing based on pathogenicity and convenience, which tends to overlook much of the variation needed for informed public health policy decisions Holt et al.
The HMP DNA and sequence data resources have not only enabled comprehensive characterization of the human microbiota, e. In tandem with the rise of culture-independent profiling, culture-based techniques have been refined to capture a wider array of organisms from the human microbiome than previously possible, including anaerobes and nonbacterial members, under ever more accurately controlled conditions. By contrast, shotgun metatranscriptome studies provide biological information that complements metagenome studies, including detection of RNA viruses and quantification of rare but functional genes that might remain undetected in DNA-based metagenomic surveys [51] Fig. However, such controls need to be incorporated during the early stages of a study and cannot be added in retrospect.
Another member of the Enterobacteriaceae that has been studied with modern typing methods is Escherichia coli, as in the German outbreak of May to July Rohde et al. Within 24 hours, the DNA sequences of E. Later whole-genome comparisons showed that TY was nearly identical to an African strain that may or may not harbor the Shiga toxin gene Mossoro et al. While traditional typing approaches pointed to the outbreak strain as being enterohemorrhagic E. Along these same lines, linking outbreaks from different localities has been possible because of the increased resolution that WGS allows.
For example, a small E. Researchers could only separate the two variants by WGS, revealing that the German outbreak isolates were limited in genetic diversity (2 SNPs from four individuals) compared to the French isolates (19 SNPs from seven individuals) Grad et al. Therefore, slow-evolving pathogens or pathogen outbreaks over short periods of time are difficult to type and present challenges for traditional MLST approaches.
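SNP counts of the kind reported above are usually derived from an alignment of isolate genomes. The following sketch computes pairwise SNP distances for a toy alignment; the isolate names and sequences are invented, and the choice to ignore ambiguous bases and gaps is an assumption rather than the procedure used in the cited studies.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count differing positions, skipping ambiguous bases and gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_snp_distances(alignment):
    """Return {(name_x, name_y): distance} for every pair of isolates."""
    return {
        (x, y): snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical aligned sequences, for illustration only.
toy_alignment = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAT",
    "isolate_3": "ACCTACGNAT",
}
distances = pairwise_snp_distances(toy_alignment)
print(distances)
```

When an outbreak spans a short period or involves a slow-evolving pathogen, most of these pairwise distances will be zero or very small, which is why the handful of loci used in MLST rarely capture any of the differences.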
Similarly, Neisseria gonorrhoeae represents an extremely slow-evolving pathogen. De Silva et al. Other studies have contrasted traditional typing with WGS for populations that differ not only in locality but also in epidemiological properties. For instance, Didelot et al. Interestingly, all isolates resolved into a single sequence type per population (ST12 and ST, respectively) by the most widely used tool for N.
In contrast, WGS could resolve relationships among isolates at the intra- and inter-specific levels less than substitutions genome-wide Didelot et al. Molecular markers can be used for both genotyping and inferring evolutionary relationships. More comprehensive genotyping frameworks link genetic variation with phylogenetic placement to obtain more information regarding origin and pathogenicity. For pathogens with limited genetic diversity such as Salmonella Typhi, new genotyping frameworks have been developed where researchers have identified genome-wide SNPs that link isolates to geographic source populations Roumagnac et al.
Using this framework, the authors predicted geographic origin at the country level for a subset of novel isolates, paving the way for future developments aimed at increasing accuracy and empowering clinicians and public health officials. Another group of enteric pathogens, Shigella spp. Four species exist that cause dysentery: S.
Yang et al.
Their results supported the hypothesis of multiple independent origins (probably four) of Shigella members from diverse E. This would explain why Shigella spp. In particular, S. The researchers collected samples from four continents and sequenced genomes from S. Similar approaches have been applied to S. Njamkepo et al. While these studies cannot establish causal relationships, the major expansions from Europe to the rest of the world coincide with periods of intense European migration due to colonialism. It is important to note that while traditional typing techniques and WGS were congruent in S.
See The et al.