|
Recent advances in high-throughput sequencing have increased the availability of T. pallidum genomes and deepened our understanding of this bacterium. This PhD thesis comprises four studies of ancient and contemporary T. pallidum genomes that investigate the evolution and genomics of the species.
A new method, PIM, was developed to detect recombination and selection and was applied to 75 contemporary T. pallidum genomes. PIM outperformed other tools in recombination detection and revealed the crucial roles of recombination and positive selection in T. pallidum evolution, particularly in defense and virulence.
Until recently, obtaining ancient T. pallidum genomes was considered impossible. This thesis obtained two high-coverage ancient genomes: W86 (TPA), from 17th-century Poland, and ZH1540 (TEN), from Brazil and dated to 2,000 years ago, the first pre-Columbian T. pallidum genome from the Americas. Incorporating these two genomes into diverse datasets of T. pallidum genomes and applying the PIM method uncovered several additional recombinant genes. Identifying the strains involved in each recombination event pointed to potential recombination between TPE/TEN and TPA strains in the Old World, which suggests that these subspecies coexisted and circulated in the same region. Moreover, Bayesian molecular clock dating that included the ancient genomes yielded divergence dates older than estimates based on modern genomes alone, improving the accuracy of the evolutionary timeline.
A new mapping approach was developed to increase genome coverage, reduce reference bias, and improve the accuracy of phylogenetic inference and assignment. This approach eliminated the need for comparisons with multiple reference genomes, streamlining subsequent analyses. Although the choice of reference genome influenced the phylogenetic placement of ancient genomes, it did not affect the classification of strains within subspecies.
Furthermore, a novel multilocus sequence typing (MLST) scheme was devised from 121 T. pallidum genomes, incorporating seven variable genes and the 23S rRNA genes. This scheme discriminated between strains across all T. pallidum subspecies, revealing genetic diversity and highlighting the prevalence of macrolide resistance, particularly within the SS14 sublineage. Notably, samples were amplified with a single PCR instead of nested PCRs, saving considerable time and cost. Analysis of genetic diversity and population structure revealed localized transmission patterns and underscored the influence of regional factors on the spread of T. pallidum.
This doctoral thesis advances our understanding of T. pallidum evolution, genomics, and epidemiology. The inclusion of ancient genomes, the new mapping approach, and the novel MLST scheme together contribute to progress in this field.
|