Advancing gene-centric approaches for microbial ecology

Author(s): Moore, Ryan M.
Date Accessioned: 2024-02-28T15:06:06Z
Date Available: 2024-02-28T15:06:06Z
Publication Date: 2024
SWORD Update: 2024-02-26T17:09:10Z
Abstract: Metagenomics is a powerful approach that has enhanced our understanding of microbial communities and the roles microbes play in various environments. A deep examination of single genes, particularly protein-coding genes, can add critical insight to metagenomic datasets by providing functional information and allowing for the prediction of observable traits and the formulation of "genome to phenome" hypotheses. However, gene-centric approaches to metagenomics face unique challenges, and the comparative lack of tools and approaches specifically designed to address these problems makes gene-centric analyses of microbial communities less accessible to many researchers. Though data quality issues arise at all stages of the sample-to-sequence-to-discovery pipeline, gene-centric studies are particularly sensitive to two kinds of problems: misannotation of the genes under study, which necessitates time-consuming manual curation, and the compositional nature of metagenomic data, which requires special statistical care. To address some of the barriers to effective gene-centric analysis in metagenomics, this dissertation introduces three tools, PASV, InteinFinder, and Iroki, as well as a novel framework for examining microbial community diversity. PASV (protein amino acid signature validator) automates the manual curation of homology search results to ensure accurate protein annotation. InteinFinder is a pipeline developed to automatically identify and remove inteins, the protein equivalent of introns, from protein sequences commonly used in gene-centric studies. Together, PASV and InteinFinder significantly reduce the amount of time and domain knowledge traditionally needed to manually curate single-gene datasets. Iroki is a user-friendly tool designed to automatically customize phylogenetic and other types of trees with user-supplied metadata, facilitating data interpretation.
The introduced diversity framework provides a more comprehensive and scalable view of microbial community diversity compared to current approaches, particularly for large metagenomic datasets. Overall, these advancements simplify the gene-centric study of microbial communities and enhance the metagenomic analysis pipeline.
Advisor: Wommack, K. Eric
Advisor: Polson, Shawn W.
Degree: Ph.D.
Department: University of Delaware, Center for Bioinformatics and Computational Biology
Unique Identifier: 1429145802
URL: https://udspace.udel.edu/handle/19716/34039
Language: en
Publisher: University of Delaware
URI: https://www.proquest.com/pqdtlocal1006271/dissertations-theses/advancing-gene-centric-approaches-microbial/docview/2931905862/sem-2?accountid=10457
Keywords: Data science
Keywords: Microbial ecology
Keywords: Viral ecology
Keywords: Metagenomics
Keywords: Microbiome
Title: Advancing gene-centric approaches for microbial ecology
Type: Thesis
Files
Original bundle
Name: Moore_udel_0060D_15867.pdf
Size: 4.59 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.22 KB
Format: Item-specific license agreed upon to submission