PMID- 24705205
OWN - NLM
STAT- MEDLINE
DCOM- 20140826
LR  - 20211021
IS  - 1758-0463 (Electronic)
IS  - 1758-0463 (Linking)
VI  - 2014
DP  - 2014
TI  - HPVdb: a data mining system for knowledge discovery in human papillomavirus
      with applications in T cell immunology and vaccinology.
PG  - bau031
LID - 10.1093/database/bau031 [doi]
LID - bau031
AB  - High-risk human papillomaviruses (HPVs) are the causes of many cancers,
      including cervical, anal, vulvar, vaginal, penile and oropharyngeal. To
      facilitate diagnosis, prognosis and characterization of these cancers, it is
      necessary to make full use of the immunological data on HPV available through
      publications, technical reports and databases. These data vary in granularity,
      quality and complexity. The extraction of knowledge from the vast amount of
      immunological data using data mining techniques remains a challenging task. To
      support integration of data and knowledge in virology and vaccinology, we
      developed a framework called KB-builder to streamline the development and
      deployment of web-accessible immunological knowledge systems. The framework
      consists of seven major functional modules, each facilitating a specific aspect
      of the knowledgebase construction process. Using KB-builder, we constructed the
      Human Papillomavirus T cell Antigen Database (HPVdb). It contains 2781 curated
      antigen entries of antigenic proteins derived from 18 genotypes of high-risk HPV
      and 18 genotypes of low-risk HPV. The HPVdb also catalogs 191 verified T cell
      epitopes and 45 verified human leukocyte antigen (HLA) ligands. Primary amino
      acid sequences of HPV antigens were collected and annotated from the UniProtKB.
      T cell epitopes and HLA ligands were collected from data mining of scientific
      literature and databases. The data were subject to extensive quality control
      (redundancy elimination, error detection and vocabulary consolidation). A set of
      computational tools for an in-depth analysis, such as sequence comparison using
      BLAST search, multiple alignments of antigens, classification of HPV types based
      on cancer risk, T cell epitope/HLA ligand visualization, T cell epitope/HLA
      ligand conservation analysis and sequence variability analysis, has been
      integrated within the HPVdb. Predicted Class I and Class II HLA binding peptides
      for 15 common HLA alleles are included in this database as putative targets.
      HPVdb is a knowledge-based system that integrates curated data and information
      with tailored analysis tools to facilitate data mining for HPV vaccinology and
      immunology. To our best knowledge, HPVdb is a unique data source providing a
      comprehensive list of HPV antigens and peptides. Database URL:
      http://cvc.dfci.harvard.edu/hpv/.
FAU - Zhang, Guang Lan
AU  - Zhang GL
AD  - Cancer Vaccine Center, Dana-Farber Cancer Institute, 77 Ave Louis Pasteur,
      Boston, MA 02115, USA, Department of Computer Science, Metropolitan College,
      Boston University, 808 Commonwealth Ave, Boston, MA 02215, USA, Department of
      Medicine, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA and
      German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg,
      Germany.
FAU - Riemer, Angelika B
AU  - Riemer AB
FAU - Keskin, Derin B
AU  - Keskin DB
FAU - Chitkushev, Lou
AU  - Chitkushev L
FAU - Reinherz, Ellis L
AU  - Reinherz EL
FAU - Brusic, Vladimir
AU  - Brusic V
LA  - eng
GR  - U01 AI090043/AI/NIAID NIH HHS/United States
PT  - Journal Article
PT  - Research Support, N.I.H., Extramural
PT  - Research Support, Non-U.S. Gov't
DEP - 20140404
PL  - England
TA  - Database (Oxford)
JT  - Database : the journal of biological databases and curation
JID - 101517697
RN  - 0 (Antigens, Viral)
RN  - 0 (Epitopes, T-Lymphocyte)
RN  - 0 (Papillomavirus Vaccines)
SB  - IM
MH  - Amino Acid Sequence
MH  - Antigens, Viral/immunology
MH  - Conserved Sequence
MH  - Data Mining/*methods
MH  - *Databases, Genetic
MH  - Databases, Protein
MH  - Epitopes, T-Lymphocyte/chemistry/immunology
MH  - Genetic Variation
MH  - Humans
MH  - Knowledge Bases
MH  - Molecular Sequence Data
MH  - Papillomaviridae/classification/*immunology
MH  - Papillomavirus Vaccines/*immunology
MH  - *Software
MH  - T-Lymphocytes/*immunology
MH  - *Vaccination
PMC - PMC3975992
EDAT- 2014/04/08 06:00
MHDA- 2014/08/27 06:00
PMCR- 2014/04/04
CRDT- 2014/04/08 06:00
PHST- 2014/04/08 06:00 [entrez]
PHST- 2014/04/08 06:00 [pubmed]
PHST- 2014/08/27 06:00 [medline]
PHST- 2014/04/04 00:00 [pmc-release]
AID - bau031 [pii]
AID - 10.1093/database/bau031 [doi]
PST - epublish
SO  - Database (Oxford). 2014 Apr 4;2014:bau031. doi: 10.1093/database/bau031. Print
      2014.