VIROME and metagenomes online: optimizing functionality by leveraging metadata

Date
2015
Publisher
University of Delaware
Abstract
Metagenomics has become a dominant tool for profiling the composition of microbial and viral communities, allowing inferences of taxonomic or functional composition through comparison of environmental sequences to reference databases. The power of this approach is limited when environmental proteins show no homology to reference sequences, or show homology only to proteins with no known function; such sequences may account for as much as 70% of sequences in viral samples. The Viral Informatics Resource for Metagenomic Exploration (VIROME, http://virome.dbi.udel.edu) was developed to provide functional, taxonomic, and environmental homology evidence for viral metagenomes, along with visualization capabilities and binning and comparison tools. Environmental context is provided through comparison against the Metagenomes Online (MgOl, http://metagenomesonline.org) database of predicted proteins identified from 258 microbial and viral metagenomes. MgOl libraries are manually curated with environmental metadata, providing a framework for the sequence homology results and increasing the proportion of a metagenome to which meaningful context can be ascribed. This project built significantly upon the utility of VIROME and MgOl by improving the quality and consistency of the associated metadata. Metadata associated with MgOl libraries has been extensively expanded in alignment with standards such as Minimum Information about any (x) Sequence (MIxS) and Environment Ontology (EnvO). An improved VIROME sample submission portal was also designed, allowing users to organize the metadata for their metagenome or viral genome in a MIxS-compliant format. Users have the option to export this metadata in an output format compatible with GenBank BioSample submissions.
Environmental metadata is further leveraged within each library through new visualizations that enhance the environmental context of a metagenome's sequences, and throughout VIROME through new search and comparison features that allow exploration of metagenomes with similar environmental profiles or protein homology. Through updates to the MgOl database, the VIROME library submission process, and subsequent library exploration, VIROME is able to leverage environmental annotation to provide flexible, user-driven grouping and comparison, and to facilitate relevant insights into sequence significance and viral community diversity.