Integrated, scalable tools for small RNA genomics: novel algorithms and their application to characterize germline-associated sRNA pathways in diverse species

Date
2017
Publisher
University of Delaware
Abstract
Cells associated with the male germline, specifically in rice and maize (grasses), produce diverse and numerous “phased” 21-nt and 24-nt siRNAs. These phased siRNAs (phasiRNAs) show striking similarity to mammalian Piwi-interacting RNAs (piRNAs) in terms of their abundance, biogenesis and timing of accumulation. Both the plant phasiRNA and mammalian piRNA pathways are emerging as factors crucial for reproductive success. However, since the first report of germline-associated plant phasiRNAs, no systematic study of their evolutionary origins has been reported. In this context, the meiotic (24-nt) phasiRNAs are particularly interesting, as they have only been described in grasses, a group of monocots that speciated ~71 million years ago (MYA). Grasses include the most important staple crops: rice, maize and wheat. Given the importance of reproductive success to crop yield, a deeper understanding of the phasiRNA pathway is crucial.
This dissertation traces the prevalence and origins of phasiRNA pathways in monocot evolution, while also addressing a broad range of key computational gaps and algorithmic limitations in leveraging small RNA (sRNA) data for the study of sRNAs in plants. First, I present a new set of tools for identifying and validating miRNA targets, and a new suite for computational characterization of phasiRNAs, which together comprise important methods for the plant sRNA field. These next-generation tools scale efficiently to the increasing volume of high-throughput data, and are fast, sensitive and feature-rich compared to the existing options. Next, I deployed these tools to investigate phasiRNAs in a recently sequenced genome, that of Asparagus officinalis. Asparagus and the grasses diverged from their common ancestor approximately 109 MYA. My work then expanded to two other non-grass monocots, Lilium (Lilium maculatum) and daylily (Hemerocallis lilioasphodelus), which diverged from Asparagus ~111 MYA.
In this dissertation, I demonstrate that both pre-meiotic and meiotic phasiRNAs are prevalent across the monocots that I studied, establishing their origins well before the grasses. In addition to the male germline, I find evidence for their accumulation in female and somatic tissues, perhaps suggesting that the narrow accumulation of reproductive phasiRNAs in anthers is either not a general characteristic or is the product of evolutionary refinement in the grasses. I show that the miRNA trigger for pre-meiotic (21-nt) phasiRNAs likely shifted in evolutionary time from targeting pathogen-defense genes to targeting long, non-coding RNAs (as observed in grasses) via specialization and sub-functionalization rather than neo-functionalization. I also demonstrate that exceptions to the canonical mechanism of phasiRNA biogenesis exist in monocot evolution, whereby phasiRNAs are produced apparently without a miRNA trigger. I conclude that plants show substantial variation in the composition and biogenesis of reproductive phasiRNAs, which have broad roles in plant germline development.
Keywords
Biological sciences, Applied sciences, Asparagus, miRNAs, phasiRNAs, PHASIS, sPARTA, sRNAs