Progeny Clustering: A Method to Identify Biological Phenotypes

Author(s): Hu, Chenyue W.
Author(s): Kornblau, Steven M.
Author(s): Slater, John H.
Author(s): Qutub, Amina A.
Ordered Author: Chenyue W. Hu, Steven M. Kornblau, John H. Slater & Amina A. Qutub
UD Author: Slater, John H.
Date Accessioned: 2016-04-25T19:27:06Z
Date Available: 2016-04-25T19:27:06Z
Copyright Date: Copyright ©
Publication Date: 2015-08-12
Description: Publisher's PDF.
Abstract: Estimating the optimal number of clusters is a major challenge in applying cluster analysis to any type of dataset, especially to biomedical datasets, which are high-dimensional and complex. Here, we introduce an improved method, Progeny Clustering, which is stability-based and exceptionally efficient computationally, to find the ideal number of clusters. The algorithm employs a novel Progeny Sampling method to reconstruct cluster identity, a co-occurrence probability matrix to assess the clustering stability, and a set of reference datasets to overcome inherent biases in the algorithm and data space. Our method proved successful and robust when applied to two synthetic datasets (a two-dimensional dataset and a ten-dimensional dataset containing eight dimensions of pure noise), two standard biological datasets (the Iris dataset and Rat CNS dataset) and two biological datasets (a cell phenotype dataset and an acute myeloid leukemia (AML) reverse phase protein array (RPPA) dataset). Progeny Clustering outperformed some popular clustering evaluation methods on the ten-dimensional synthetic dataset as well as on the cell phenotype dataset, and it was the only method that successfully discovered clinically meaningful patient groupings in the AML RPPA dataset.
Department: University of Delaware. Department of Biomedical Engineering.
Citation: Hu, C. W. et al. Progeny Clustering: A Method to Identify Biological Phenotypes. Sci. Rep. 5, 12894; doi: 10.1038/srep12894 (2015).
DOI: 10.1038/srep12894
ISSN: 2045-2322
URL: http://udspace.udel.edu/handle/19716/17681
Language: en_US
Publisher: Nature Publishing Group
dc.rights: CC BY 4.0
dc.source: Scientific reports
dc.source.uri: http://www.nature.com/srep/
Title: Progeny Clustering: A Method to Identify Biological Phenotypes
Type: Article
Files
Original bundle
Name: Progeny Clustering A Method to Identify Biological Phenotypes_1454009491T7716.pdf
Size: 1.52 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.22 KB
Format: Item-specific license agreed upon to submission