PMID- 38293199
OWN - NLM
STAT- PubMed-not-MEDLINE
LR  - 20240214
DP  - 2024 Jan 16
TI  - A rigorous benchmarking of alignment-based HLA typing algorithms for RNA-seq data.
LID - 2023.05.22.541750 [pii]
LID - 10.1101/2023.05.22.541750 [doi]
AB  - Accurate identification of human leukocyte antigen (HLA) alleles is essential for various clinical and research applications, such as transplant matching and assessing drug sensitivities. Recent advances in RNA-seq technology have made it possible to impute HLA types from sequencing data, spurring the development of a large number of computational HLA typing tools. However, the relative performance of these tools is unknown, limiting the ability of clinicians and biomedical researchers to make informed choices about which tools to use. Here we report the study design of a comprehensive benchmarking of the performance of 12 HLA callers across 682 RNA-seq samples from 8 datasets with a molecularly defined gold standard at 5 loci: HLA-A, -B, -C, -DRB1, and -DQB1. For each HLA typing tool, we will comprehensively assess its accuracy, compare default with optimized parameters, and examine discrepancies in accuracy at the allele and locus levels. We will also evaluate the computational expense of each HLA caller, measured in terms of CPU time and RAM, and the influence of read length over the HLA region on the accuracy of each tool. Most notably, we will examine the performance of HLA callers across European and African groups to determine discrepancies in accuracy associated with ancestry. We hypothesize that RNA-seq HLA callers are capable of returning high-quality results, but that tools offering a good balance between accuracy and computational expense for all ancestry groups are yet to be developed. We believe that our study will provide clinicians and researchers with clear guidance to inform their selection of an appropriate HLA caller.
FAU - Yu, Dottie
AU  - Yu D
AUID- ORCID: 0009-0004-5682-7362
AD  - Department of Quantitative and Computational Biology, Dornsife College of Letters, Arts and Sciences, University of Southern California, 1975 Zonal Ave, Los Angeles, CA 90033, USA.
FAU - Ayyala, Ram
AU  - Ayyala R
AUID- ORCID: 0000-0001-7275-271X
AD  - Department of Quantitative and Computational Biology, Dornsife College of Letters, Arts and Sciences, University of Southern California, 1975 Zonal Ave, Los Angeles, CA 90033, USA.
FAU - Sadek, Sarah Hany
AU  - Sadek SH
AD  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, University of Southern California, Los Angeles, CA, USA.
AD  - Department of Biology, and Department of Computer Science, California State University, Fullerton, Fullerton, CA 92831.
FAU - Chittampalli, Likhitha
AU  - Chittampalli L
AUID- ORCID: 0000-0002-3976-1750
AD  - Department of Computer Science, Viterbi School of Engineering University of Southern California, Los Angeles, CA, USA.
FAU - Farooq, Hafsa
AU  - Farooq H
AUID- ORCID: 0009-0002-6260-5016
AD  - Department of Computer Science, Georgia State University Atlanta, GA 30303 USA.
FAU - Jung, Junghyun
AU  - Jung J
AUID- ORCID: 0000-0001-6832-4368
AD  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, University of Southern California, Los Angeles, CA, USA.
AD  - Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
FAU - Nahid, Abdullah Al
AU  - Nahid AA
AUID- ORCID: 0000-0002-4390-0768
AD  - Department of Biochemistry and Molecular Biology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh.
FAU - Boldirev, Grigore
AU  - Boldirev G
AD  - Department of Computer Science, College of Arts and Sciences, Georgia State University, Atlanta, GA, 30303, USA.
FAU - Jung, Mina
AU  - Jung M
AD  - Department of Quantitative and Computational Biology, Dornsife College of Letters, Arts and Sciences, University of Southern California, 1975 Zonal Ave, Los Angeles, CA 90033, US.
FAU - Park, Sungmin
AU  - Park S
AD  - Department of Computer Science and Engineering, Dongguk University-Seoul, Seoul, 04620, South Korea.
FAU - Nguyen, Austin
AU  - Nguyen A
AUID- ORCID: 0000-0001-7940-4830
AD  - Computational Biologist, Immune Monitoring & Cancer Omics Oregon Health & Science University, Biomedical Engineering, 3181 S.W. Sam Jackson Park Road Portland, OR 97239-3098.
FAU - Zelikovsky, Alex
AU  - Zelikovsky A
AD  - Department of Computer Science, College of Arts and Sciences, Georgia State University, Atlanta, GA, 30303, USA.
FAU - Mancuso, Nicholas
AU  - Mancuso N
AUID- ORCID: 0000-0002-9352-5927
AD  - Assistant Professor of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, 1845 N. Soto Street, USA.
FAU - Joo, Jong Wha J
AU  - Joo JWJ
AUID- ORCID: 0000-0002-1863-4664
AD  - Department of Computer Science and Engineering, Dongguk University-Seoul, Seoul, 04620, South Korea.
AD  - Division of AI Software Convergence, Dongguk University-Seoul, Seoul, 04620, South Korea.
FAU - Thompson, Reid F
AU  - Thompson RF
AUID- ORCID: 0000-0003-3661-5296
AD  - Assistant Professor of Radiation Medicine, School of Medicine, OHSU, Portland, OR 97239.
AD  - Assistant Professor of Biomedical Engineering, School of Medicine, OHSU, Portland, OR 97239.
AD  - Staff Physician, VA Portland Healthcare System, Portland OR 97239.
FAU - Alachkar, Houda
AU  - Alachkar H
AUID- ORCID: 0000-0001-5567-5521
AD  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, University of Southern California, CA, USA.
FAU - Mangul, Serghei
AU  - Mangul S
AUID- ORCID: 0000-0003-4770-3443
AD  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, University of Southern California, 1540 Alcazar Street, Los Angeles, CA 90033, USA.
AD  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles.
LA  - eng
GR  - R01 AI173172/AI/NIAID NIH HHS/United States
PT  - Preprint
DEP - 20240116
PL  - United States
TA  - bioRxiv
JT  - bioRxiv : the preprint server for biology
JID - 101680187
PMC - PMC10827116
COIS- Competing interests The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
EDAT- 2024/01/31 06:42
MHDA- 2024/01/31 06:43
PMCR- 2024/01/30
CRDT- 2024/01/31 04:21
PHST- 2024/01/31 06:42 [pubmed]
PHST- 2024/01/31 06:43 [medline]
PHST- 2024/01/31 04:21 [entrez]
PHST- 2024/01/30 00:00 [pmc-release]
AID - 2023.05.22.541750 [pii]
AID - 10.1101/2023.05.22.541750 [doi]
PST - epublish
SO  - bioRxiv [Preprint]. 2024 Jan 16:2023.05.22.541750. doi: 10.1101/2023.05.22.541750.