PMID- 18496789
OWN - NLM
STAT- MEDLINE
DCOM- 20081219
LR  - 20081127
IS  - 1096-987X (Electronic)
IS  - 0192-8651 (Linking)
VI  - 30
IP  - 1
DP  - 2009 Jan 15
TI  - HIV-1 protease cleavage site prediction based on amino acid property.
PG  - 33-9
LID - 10.1002/jcc.21024 [doi]
AB  - Knowledge of the polyprotein cleavage sites by HIV protease will refine our understanding of its specificity, and the information thus acquired is useful for designing specific and efficient HIV protease inhibitors. Recently, several works have approached the HIV-1 protease specificity problem by applying a number of classifier creation and combination methods. The search for proper inhibitors of HIV protease will be greatly expedited if one can find an accurate, robust, and rapid method for predicting the cleavage sites in proteins by HIV protease. In this article, we selected HIV-1 protease as the subject of the study. A total of 299 oligopeptides were chosen for the training set, while the other 63 oligopeptides were taken as a test set. The peptides are represented by features constructed from AAIndex (Kawashima et al., Nucleic Acids Res 1999, 27, 368; Kawashima and Kanehisa, Nucleic Acids Res 2000, 28, 374). The mRMR method (Maximum Relevance, Minimum Redundancy; Ding and Peng, Proc Second IEEE Comput Syst Bioinformatics Conf 2003, 523; Peng et al., IEEE Trans Pattern Anal Mach Intell 2005, 27, 1226), combined with incremental feature selection (IFS) and feature forward search (FFS), is applied to find the two important cleavage sites and to select 364 important biochemical features by jackknife test. Using KNN (K-nearest neighbors) to combine the selected features, the prediction model achieves a high accuracy of 91.3% on the jackknife cross-validation test and 87.3% on the independent-set test. We expect that our feature selection scheme can serve as a useful auxiliary technique for finding effective inhibitors of HIV protease, especially for scientists in this field.
CI  - Copyright 2008 Wiley Periodicals, Inc.
FAU - Niu, Bing
AU  - Niu B
AD  - School of Materials Science and Engineering, Shanghai University, 149 Yan-Chang Road, Shanghai 200072, People's Republic of China.
FAU - Lu, Lin
AU  - Lu L
FAU - Liu, Liang
AU  - Liu L
FAU - Gu, Tian Hong
AU  - Gu TH
FAU - Feng, Kai-Yan
AU  - Feng KY
FAU - Lu, Wen-Cong
AU  - Lu WC
FAU - Cai, Yu-Dong
AU  - Cai YD
LA  - eng
PT  - Journal Article
PT  - Research Support, Non-U.S. Gov't
PL  - United States
TA  - J Comput Chem
JT  - Journal of computational chemistry
JID - 9878362
RN  - 0 (Amino Acids)
RN  - 0 (Oligopeptides)
RN  - EC 3.4.23.- (HIV Protease)
RN  - EC 3.4.23.- (p16 protease, Human immunodeficiency virus 1)
SB  - IM
MH  - Algorithms
MH  - Amino Acids/*chemistry
MH  - Binding Sites
MH  - Computational Biology
MH  - HIV Protease/*chemistry/metabolism
MH  - HIV-1/*enzymology
MH  - Models, Chemical
MH  - Oligopeptides/*chemistry/metabolism
MH  - Structure-Activity Relationship
MH  - Substrate Specificity
EDAT- 2008/05/23 09:00
MHDA- 2008/12/20 09:00
CRDT- 2008/05/23 09:00
PHST- 2008/05/23 09:00 [pubmed]
PHST- 2008/12/20 09:00 [medline]
PHST- 2008/05/23 09:00 [entrez]
AID - 10.1002/jcc.21024 [doi]
PST - ppublish
SO  - J Comput Chem. 2009 Jan 15;30(1):33-9. doi: 10.1002/jcc.21024.
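
The abstract describes ranking features, growing the feature set incrementally, and scoring each candidate set with a KNN classifier under a jackknife (leave-one-out) test. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a feature ranking is already available (the mRMR step is not implemented here), uses Euclidean distance and k = 1 as illustrative choices, and runs on invented toy data; all function and variable names are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, x, k=1):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def jackknife_accuracy(X, y, k=1):
    """Leave-one-out test: each sample is predicted from all the others."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return hits / len(X)


def incremental_feature_selection(X, y, ranked, k=1):
    """Add features in ranked order; keep the shortest best-scoring prefix."""
    best_acc, best_n = -1.0, 0
    for n in range(1, len(ranked) + 1):
        cols = ranked[:n]
        sub = [[row[c] for c in cols] for row in X]
        acc = jackknife_accuracy(sub, y, k)
        if acc > best_acc:  # strict '>' prefers the smaller feature set on ties
            best_acc, best_n = acc, n
    return ranked[:best_n], best_acc


# Toy data (invented): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, -4.0], [0.2, 3.0],
     [1.0, -5.0], [1.1, 4.0], [1.2, -3.0]]
y = [0, 0, 0, 1, 1, 1]
ranked = [0, 1]  # assumed ranking, standing in for an mRMR ordering

selected, acc = incremental_feature_selection(X, y, ranked)
print(selected, acc)
```

In a real setting, `X` would hold AAIndex-derived property values for each oligopeptide and `ranked` would come from an mRMR implementation; both are placeholders here.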