UniProt: the universal protein knowledgebase

Author(s): The UniProt Consortium
Author(s): Wu, Cathy H.
Ordered Author: The UniProt Consortium
UD Author: Wu, Cathy H.
Date Accessioned: 2017-05-12T15:41:16Z
Date Available: 2017-05-12T15:41:16Z
Copyright Date: The Author(s) 2016.
Publication Date: 2016-11-28
Description: Publisher's PDF
Abstract: The UniProt knowledgebase is a large resource of protein sequences and associated detailed annotation. The database contains over 60 million sequences, of which over half a million sequences have been curated by experts who critically review experimental and predicted data for each protein. The remainder are automatically annotated based on rule systems that rely on the expert curated knowledge. Since our last update in 2014, we have more than doubled the number of reference proteomes to 5631, giving a greater coverage of taxonomic diversity. We implemented a pipeline to remove redundant highly similar proteomes that were causing excessive redundancy in UniProt. The initial run of this pipeline reduced the number of sequences in UniProt by 47 million. For our users interested in the accessory proteomes, we have made available sets of pan proteome sequences that cover the diversity of sequences for each species that is found in its strains and sub-strains. To help interpretation of genomic variants, we provide tracks of detailed protein information for the major genome browsers. We provide a SPARQL endpoint that allows complex queries of the more than 22 billion triples of data in UniProt (http://sparql.uniprot.org/). UniProt resources can be accessed via the website at http://www.uniprot.org/.
Department: University of Delaware. Department of Computer and Information Sciences.
Citation: UniProt Consortium. "UniProt: the universal protein knowledgebase." Nucleic Acids Research 45.D1 (2017): D158-D169.
DOI: 10.1093/nar/gkw1099
ISSN: 1362-4962
URL: http://udspace.udel.edu/handle/19716/21346
Language: English
Publisher: Oxford University Press on behalf of Nucleic Acids Research
dc.rights: CC BY 4.0
dc.source: Nucleic Acids Research
dc.source.uri: https://academic.oup.com/nar
Type: Article
Files
Original bundle
Name: gkw1099_1490211050T3643.pdf
Size: 5.63 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.22 KB
Format: Item-specific license agreed upon to submission