PMID- 28444127
OWN - NLM
STAT- MEDLINE
DCOM- 20180601
LR  - 20220316
IS  - 1367-4811 (Electronic)
IS  - 1367-4803 (Linking)
VI  - 33
IP  - 17
DP  - 2017 Sep 1
TI  - HLA class I binding prediction via convolutional neural networks.
PG  - 2658-2665
LID - 10.1093/bioinformatics/btx264 [doi]
AB  - MOTIVATION: Many biological processes are governed by protein-ligand interactions. One such example is the recognition of self and non-self cells by the immune system. This immune response process is regulated by the major histocompatibility complex (MHC) protein which is encoded by the human leukocyte antigen (HLA) complex. Understanding the binding potential between MHC and peptides can lead to the design of more potent, peptide-based vaccines and immunotherapies for infectious autoimmune diseases. RESULTS: We apply machine learning techniques from the natural language processing (NLP) domain to address the task of MHC-peptide binding prediction. More specifically, we introduce a new distributed representation of amino acids, named HLA-Vec, that can be used for a variety of downstream proteomic machine learning tasks. We then propose a deep convolutional neural network architecture, named HLA-CNN, for the task of HLA class I-peptide binding prediction. Experimental results show combining the new distributed representation with our HLA-CNN architecture achieves state-of-the-art results in the majority of the latest two Immune Epitope Database (IEDB) weekly automated benchmark datasets. We further apply our model to predict binding on the human genome and identify 15 genes with potential for self binding. AVAILABILITY AND IMPLEMENTATION: Codes to generate the HLA-Vec and HLA-CNN are publicly available at: https://github.com/uci-cbcl/HLA-bind . CONTACT: xhx@ics.uci.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
CI  - (c) The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
FAU - Vang, Yeeleng S
AU  - Vang YS
AD  - Department of Computer Science, University of California, Irvine, CA 92697, USA.
FAU - Xie, Xiaohui
AU  - Xie X
AD  - Department of Computer Science, University of California, Irvine, CA 92697, USA.
AD  - Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, USA.
LA  - eng
PT  - Journal Article
PL  - England
TA  - Bioinformatics
JT  - Bioinformatics (Oxford, England)
JID - 9808944
RN  - 0 (Epitopes)
RN  - 0 (HLA Antigens)
RN  - 0 (Histocompatibility Antigens Class I)
RN  - 0 (Peptides)
SB  - IM
MH  - Epitopes
MH  - HLA Antigens/metabolism
MH  - Histocompatibility Antigens Class I/*metabolism
MH  - Humans
MH  - *Machine Learning
MH  - *Neural Networks, Computer
MH  - Peptides/*metabolism
MH  - Protein Binding
MH  - Proteomics/*methods
EDAT- 2017/04/27 06:00
MHDA- 2018/06/02 06:00
CRDT- 2017/04/27 06:00
PHST- 2016/12/21 00:00 [received]
PHST- 2017/04/18 00:00 [accepted]
PHST- 2017/04/27 06:00 [pubmed]
PHST- 2018/06/02 06:00 [medline]
PHST- 2017/04/27 06:00 [entrez]
AID - 3746909 [pii]
AID - 10.1093/bioinformatics/btx264 [doi]
PST - ppublish
SO  - Bioinformatics. 2017 Sep 1;33(17):2658-2665. doi: 10.1093/bioinformatics/btx264.