PMID- 23815611
OWN - NLM
STAT- MEDLINE
DCOM- 20131113
LR  - 20211021
IS  - 1471-2105 (Electronic)
IS  - 1471-2105 (Linking)
VI  - 14 Suppl 8
IP  - Suppl 8
DP  - 2013
TI  - Integrating peptides' sequence and energy of contact residues information improves prediction of peptide and HLA-I binding with unknown alleles.
PG  - S1
LID - 10.1186/1471-2105-14-S8-S1 [doi]
AB  - BACKGROUND: The HLA (human leukocyte antigen) class I is a kind of molecule encoded by a large family of genes and is characteristic of high polymorphism. Now the number of the registered HLA-I molecules has exceeded 3000. Slight differences in the amino acid sequences of HLAs would make them bind to different sets of peptides. In the past decades, although many methods have been proposed to predict the binding between peptides and HLA-I molecules and achieved good performance, most experimental data used by them is limited to the HLAs with a small number of alleles. Thus they are inclined to obtain high prediction accuracy only for data with similar alleles. Because the peptides and HLAs together determine the binding, it's necessary to consider their contribution meanwhile.
      RESULTS: By taking into account the features of the peptides sequence and the energy of contact residues, in this paper a method based on the artificial neural network is proposed to predict the binding of peptides and HLA-I even when the HLAs' potential alleles are unknown. Two experiments in the allele-specific and super-type cases are performed respectively to validate our method. In the first case, we collect 14 HLA-A and 14 HLA-B molecules on Bjoern Peters dataset, and compare our method with the ARB, SMM, NetMHC and other 16 online methods. Our method gets the best average AUC (Area under the ROC) value as 0.909. In the second one, we use leave one out cross validation on MHC-peptide binding data that has different alleles but shares the common super-type. Compared to gold standard methods like NetMHC and NetMHCpan, our method again achieves the best average AUC value as 0.847.
      CONCLUSIONS: Our method achieves satisfactory results. Whenever it's tested on the HLA-I with single definite gene or with super-type gene locus, it gets better classification accuracy. Especially, when the training set is small, our method still works better than the other methods in the comparison. Therefore, we could make a conclusion that by combining the peptides' information, HLAs amino acid residues' interaction information and contact energy, our method really could improve prediction of the peptide HLA-I binding even when there aren't the prior experimental dataset for HLAs with various alleles.
FAU - Luo, Fei
AU  - Luo F
AD  - School of Computer, Wuhan University, Wuhan, Hubei, China.
FAU - Gao, Yangyang
AU  - Gao Y
FAU - Zhu, Yongqiong
AU  - Zhu Y
FAU - Liu, Juan
AU  - Liu J
LA  - eng
PT  - Journal Article
DEP - 20130509
PL  - England
TA  - BMC Bioinformatics
JT  - BMC bioinformatics
JID - 100965194
RN  - 0 (HLA Antigens)
RN  - 0 (Histocompatibility Antigens Class I)
RN  - 0 (MHC binding peptide)
RN  - 0 (Oligopeptides)
SB  - IM
MH  - Algorithms
MH  - Alleles
MH  - Area Under Curve
MH  - Energy Metabolism
MH  - HLA Antigens/chemistry/genetics/immunology
MH  - Histocompatibility Antigens Class I/chemistry/genetics/immunology/*metabolism
MH  - Humans
MH  - *Neural Networks, Computer
MH  - Oligopeptides/*metabolism
PMC - PMC3654895
EDAT- 2013/07/17 06:00
MHDA- 2013/11/14 06:00
PMCR- 2013/05/09
CRDT- 2013/07/03 06:00
PHST- 2013/07/03 06:00 [entrez]
PHST- 2013/07/17 06:00 [pubmed]
PHST- 2013/11/14 06:00 [medline]
PHST- 2013/05/09 00:00 [pmc-release]
AID - 1471-2105-14-S8-S1 [pii]
AID - 10.1186/1471-2105-14-S8-S1 [doi]
PST - ppublish
SO  - BMC Bioinformatics. 2013;14 Suppl 8(Suppl 8):S1. doi: 10.1186/1471-2105-14-S8-S1. Epub 2013 May 9.