PMID- 18449997
OWN - NLM
STAT- MEDLINE
DCOM- 20080605
LR  - 20080502
IS  - 1064-3745 (Print)
IS  - 1064-3745 (Linking)
VI  - 409
DP  - 2007
TI  - The classification of HLA supertypes by GRID/CPCA and hierarchical clustering methods.
PG  - 143-54
LID - 10.1007/978-1-60327-118-9_9 [doi]
AB  - Biological experiments often produce enormous amounts of data, which are usually analyzed by data clustering. Cluster analysis refers to statistical methods that are used to assign data with similar properties into several smaller, more meaningful groups. Two commonly used clustering techniques are introduced in the following section: principal component analysis (PCA) and hierarchical clustering. PCA calculates the variance between variables and groups them into a few uncorrelated groups or principal components (PCs) that are orthogonal to each other. Hierarchical clustering is carried out by separating data into many clusters and merging similar clusters together. Here, we use an example of human leukocyte antigen (HLA) supertype classification to demonstrate the usage of the two methods. Two programs, Generating Optimal Linear Partial Least Square Estimations (GOLPE) and Sybyl, are used for PCA and hierarchical clustering, respectively. However, the reader should bear in mind that the methods have been incorporated into other software as well, such as SIMCA, statistiXL, and R.
FAU - Guan, Pingping
AU  - Guan P
AD  - Computational Biology Group, John Innes Centre, Norwich, UK.
FAU - Doytchinova, Irini A
AU  - Doytchinova IA
FAU - Flower, Darren R
AU  - Flower DR
LA  - eng
PT  - Journal Article
PL  - United States
TA  - Methods Mol Biol
JT  - Methods in molecular biology (Clifton, N.J.)
JID - 9214969
RN  - 0 (HLA Antigens)
SB  - IM
MH  - Binding Sites
MH  - Cluster Analysis
MH  - *Computational Biology
MH  - Databases, Protein
MH  - HLA Antigens/chemistry/*classification/genetics
MH  - Humans
MH  - Immunogenetics/statistics & numerical data
MH  - Least-Squares Analysis
MH  - Principal Component Analysis
MH  - Software
EDAT- 2008/05/03 09:00
MHDA- 2008/06/06 09:00
CRDT- 2008/05/03 09:00
PHST- 2008/05/03 09:00 [pubmed]
PHST- 2008/06/06 09:00 [medline]
PHST- 2008/05/03 09:00 [entrez]
AID - 10.1007/978-1-60327-118-9_9 [doi]
PST - ppublish
SO  - Methods Mol Biol. 2007;409:143-54. doi: 10.1007/978-1-60327-118-9_9.