Penghao Xu, Bioinformatics Thesis Defense

In partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Bioinformatics
in the School of Biological Sciences
 
Penghao Xu
 
Defends his thesis:
Revealing patterns of ribonucleotide incorporation in genomic DNA of eukaryotic cells
 
Wednesday, November 30, 2022
9:00 AM Eastern Time
EBB Krone CHOA Seminar Room (room #1005)
Zoom link: https://gatech.zoom.us/j/92565843976
 
Thesis Advisor:
Dr. Francesca Storici
School of Biological Sciences
Georgia Institute of Technology
 
 
Committee Members:
Dr. Mark Borodovsky
Department of Biomedical Engineering
Georgia Institute of Technology
 
Dr. Shamkant B. Navathe
School of Computer Science
Georgia Institute of Technology
 
Dr. Patrick T. McGrath
School of Biological Sciences
Georgia Institute of Technology
 
Dr. Fredrik Vannberg
College of Arts & Sciences
Georgia State University
  
Summary:
Ribonucleoside triphosphates (rNTPs) are incorporated into DNA by DNA polymerases in the form of ribonucleoside monophosphates (rNMPs), which are the most abundant modified nucleotides in genomic DNA in many species. The presence of rNMPs in DNA leads to DNA structural change and genome instability, affects protein-DNA interactions, and results in diseases including Aicardi-Goutières Syndrome. Incorporated rNMPs are mainly removed by the ribonucleotide excision repair (RER) pathway, which is initiated by ribonuclease (RNase) H2. Researchers have previously developed several rNMP-mapping techniques to capture rNMPs in DNA at the single-nucleotide level. However, analyses of rNTP incorporation are still limited to basic studies of rNMP counts and composition in the small genomes of RNase H2-deactivated cells. In this study, we first developed a series of key bioinformatics tools to maximize the efficiency of the rNMP-mapping procedure and to uncover unique features of rNMP presence in the genomic DNA of cells. We optimized the restriction enzyme usage of rNMP-mapping techniques to significantly increase rNMP capture efficiency, which allowed us to perform detailed rNMP-pattern analysis and reveal correlations between genomic rNMP sites and specific genetic elements. The development of heatmaps showing the composition and sequence context of rNMPs in genomic DNA allowed us to discover preferred rNTP-incorporation patterns and hotspot motifs in the full genomes of many eukaryotic cells, including budding and fission yeast, green algae, and human cells. By studying rNTP-incorporation patterns around the origins of replication in S. cerevisiae, we revealed rNMP-pattern changes associated with the DNA replication process, which validated the contribution of DNA Pol δ to the early stage of nascent leading-strand synthesis.
Moreover, we discovered a strong preference for rNMPs after dAMPs on the leading strand and after dCMPs on the lagging strand, which are the specific rNTP-incorporation patterns of the eukaryotic replicative DNA Pol ε and Pol δ, respectively. These signatures of rNMPs in DNA can be used to track DNA polymerase usage during DNA replication across the yeast genome and possibly in other eukaryotic cells. Finally, by studying rNTP-incorporation patterns in human mitochondrial DNA (mtDNA), we found a strong light-strand preference for rNTP incorporation, as well as rNMP-enriched zones. Three consistent rNMP-enriched zones, found within the replication control region of human mtDNA, may affect the activity of DNA Pol γ and that of TWINKLE, the main helicase of human mtDNA. We also showed that longer coding sequences in human mtDNA have a higher rNTP-incorporation frequency on their non-template strands at each base, revealing a direct association between rNMPs and genes in human mtDNA.