Kizee Etienne, Bioinformatics Thesis Defense

In partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Bioinformatics
in the School of Biological Sciences

Kizee A. Etienne

Defends her thesis:
Computational Genomics to Characterize the Distribution, Diversity and Drug Resistance of Fungal Pathogens.

Thursday, June 9, 2022 
2:00pm Eastern Time 
Zoom Meeting

Thesis advisor: 
Dr. Lavanya Rishishwar, School of Biological Sciences, Georgia Institute of Technology. 

Committee Members: 
Dr. I. King Jordan, School of Biological Sciences, Georgia Institute of Technology.
Dr. Jung Choi, School of Biological Sciences, Georgia Institute of Technology.
Dr. Frank Stewart, School of Biological Sciences, Georgia Institute of Technology.
Dr. Anastasia Litvintseva, Mycotic Diseases Branch, Centers for Disease Control and Prevention. 

Molecular strain typing of pathogenic fungi has been the “gold standard” of epidemiology and laboratory-based surveillance in public health mycology, providing information on the geographic distribution of fungal pathogens, their genetic relatedness and diversity, and the genetic basis of antifungal drug resistance. Fungal pathogens are historically understudied and commonly overlooked as etiological agents of invasive fungal infections (IFIs) and of outbreaks in community and hospital environments. Two factors call for a shift to techniques that identify genome-wide variation that may contribute to the emergence and distribution of fungal pathogens: the limited availability of sensitive and robust traditional molecular typing techniques for emerging and existing pathogenic fungi, and a growing interest in whether the widespread use of current therapies contributes to the changing epidemiology of fungal pathogens. Fungal genomes tend to be large, complex, and difficult to manipulate under laboratory conditions; therefore, the choice of sequencing technology and computational approach depends on the goals of the study. Whole Genome Sequence Typing (WGST) has emerged as the main molecular typing technique for fungal pathogens in outbreak investigations and molecular surveillance. Computational analysis includes de novo assembly, mapping/alignment, and identification of single nucleotide polymorphisms (SNPs) that are shared among a group of fungal pathogens. I evaluate the application of a WGST workflow to outbreak investigations of four emerging fungal pathogens: Apophysomyces trapeziformis, Sarocladium (Acremonium) kiliense, Aspergillus fumigatus and Candida auris.
Whole genome sequencing methods were applied to pathogenic fungi that lack reference genomes and/or robust molecular typing techniques (Apophysomyces trapeziformis and Sarocladium (Acremonium) kiliense) to determine their applicability in understanding the genetic diversity of, and relationships among, clusters of fungal infections. Following a tornado in Joplin, Missouri (2011), three clusters comprising A. trapeziformis isolates recovered from six tornado victims were more closely related genetically to one another than to A. trapeziformis isolates that were not part of this outbreak. This study suggests the existence of a local population and provides an initial view of A. trapeziformis in its natural setting. Over the past decade, an increasing number of fungal pathogens normally known to infect plants have crossed over to infect humans. Sarocladium (Acremonium) kiliense, a fungus typically restricted to the environment, was implicated in a national outbreak of bloodstream infections in children undergoing chemotherapy. Whole genome sequence typing identified two disparate clusters: an outbreak cluster that genetically linked infections to contaminated anti-nausea medication produced by a pharmaceutical company, and a non-outbreak cluster comprising various strains of S. kiliense.
The global emergence of azole-resistant Aspergillus fumigatus (ARAf) in patients who have never been treated with triazoles is steadily increasing and has been attributed to environmental acquisition of A. fumigatus strains carrying environmentally derived mutations. The complete genome sequence of A. fumigatus has facilitated our understanding of the molecular mechanisms implicated in antifungal resistance, namely 34-bp tandem repeats (TR34) located in the promoter region of the cyp51A gene, the main target of triazoles. I investigated the recent emergence of ARAf in the U.S. to characterize the genomic basis of resistance and to define its distribution and spread. Whole genome SNP and clustering analyses demonstrated a single introduction of the TR34 resistance mechanism into the U.S., and admixture analysis suggests the possibility of its spread through recombination. I highlight the acquisition of ARAf resistance through a globally shared, environmentally driven resistance mechanism.
WGST has emerged as the reference technique for characterizing the epidemiology of emerging fungal pathogens and identifying reservoirs and sources of infection. In many cases, it may be necessary to iterate over the workflow because complete reference genomes are lacking, and the reference-based WGST framework can be time- and resource-intensive. Consequently, this computational workflow may be inefficient in an outbreak investigation and/or molecular-based surveillance. A reference-free, or alignment-free, approach may provide an alternative, scalable, and efficient analysis technique for comparative genomics of fungi in outbreak investigations. I evaluated the use of k-mers (subsequences of length k) as a genomic clustering technique in a retrospective analysis of an outbreak of Candida auris, which emerged simultaneously on three continents. Reference-based analysis using WGST and average nucleotide identity (ANI) demonstrated the presence of three geographically distinct clades of Candida auris. The reference-free technique also identified three clades of Candida auris, with the added advantage of rapid results. A reference-free approach may be suitable for fungal outbreak investigations and routine surveillance.
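To illustrate the general idea behind k-mer-based, alignment-free comparison, here is a minimal sketch in Python. It is not the workflow used in the thesis. It assumes that each genome is reduced to its set of canonical k-mers and that genomes are compared by Jaccard distance, which is one common choice among several. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations

# Translation table used to build reverse complements.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_kmers(seq, k):
    """Return the set of canonical k-mers in seq.

    A canonical k-mer is the lexicographically smaller of a k-mer and its
    reverse complement, so both DNA strands give the same profile.
    k-mers containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) - set("ACGT"):
            continue  # skip ambiguous bases such as N
        reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
        kmers.add(min(kmer, reverse_complement))
    return kmers


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two k-mer sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(genomes, k):
    """Return Jaccard distances for every pair of named sequences.

    genomes maps a sample name to its sequence (hypothetical toy input).
    Keys of the result are (name_a, name_b) tuples in sorted order.
    """
    profiles = {name: canonical_kmers(seq, k) for name, seq in genomes.items()}
    return {
        (x, y): jaccard_distance(profiles[x], profiles[y])
        for x, y in combinations(sorted(profiles), 2)
    }
```

The resulting distance matrix could then be passed to any standard clustering method. Real fungal genomes would call for a larger k and a memory-efficient k-mer counter rather than in-memory Python sets.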