PMID- 23849067
OWN - NLM
STAT- MEDLINE
DCOM- 20140127
LR  - 20191210
IS  - 1399-0039 (Electronic)
IS  - 0001-2815 (Linking)
VI  - 82
IP  - 2
DP  - 2013 Aug
TI  - Comparative validation of computer programs for haplotype frequency estimation from donor registry data.
PG  - 93-105
LID - 10.1111/tan.12160 [doi]
AB  - Estimation of human leukocyte antigen (HLA) haplotype frequencies from unrelated stem cell donor registries presents a challenge because of large sample sizes and heterogeneity of HLA typing data. For the 14th International HLA and Immunogenetics Workshop, five bioinformatics groups initiated the 'Registry Diversity Component' aiming to cross-validate and improve current haplotype estimation tools. Five datasets were derived from different donor registries and then used as input for five different computer programs for haplotype frequency estimation. Because of issues related to heterogeneity and complexity of HLA typing data identified in the initial phase, the same five implementations, and two new ones, were used on simulated datasets in a controlled experiment where the correct results were known a priori. These datasets contained various fractions of missing HLA-DR modeled after European haplotype frequencies. We measured the contribution of sampling fluctuation and estimation error to the deviation of the frequencies from their true values, finding equivalent contributions of each for the chosen samples. Because of patient-directed activities, selective prospective typing strategies and the variety and evolution of typing technology, some donors have more complete and better HLA data. In this setting, we show that restricting estimation to fully typed individuals introduces biases that could be overcome by including all donors in frequency estimation. Our study underlines the importance of critical review and validation of tools in registry-related activity and provides a sustainable framework for validating the computational tools used. Accurate frequencies are essential for match prediction to improve registry operations and to help more patients identify suitably matched donors.
CI  - (c) 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
FAU - Eberhard, H-P
AU  - Eberhard HP
AD  - Zentrales Knochenmarkspender-Register Deutschland (ZKRD), Ulm, Germany.
FAU - Madbouly, A S
AU  - Madbouly AS
FAU - Gourraud, P A
AU  - Gourraud PA
FAU - Balere, M L
AU  - Balere ML
FAU - Feldmann, U
AU  - Feldmann U
FAU - Gragert, L
AU  - Gragert L
FAU - Torres, H Maldonado
AU  - Torres HM
FAU - Pingel, J
AU  - Pingel J
FAU - Schmidt, A H
AU  - Schmidt AH
FAU - Steiner, D
AU  - Steiner D
FAU - van der Zanden, H G M
AU  - van der Zanden HG
FAU - Oudshoorn, M
AU  - Oudshoorn M
FAU - Marsh, S G E
AU  - Marsh SG
FAU - Maiers, M
AU  - Maiers M
FAU - Muller, C R
AU  - Muller CR
LA  - eng
PT  - Comparative Study
PT  - Journal Article
PT  - Validation Study
PL  - England
TA  - Tissue Antigens
JT  - Tissue antigens
JID - 0331072
RN  - 0 (HLA Antigens)
SB  - IM
MH  - Gene Frequency
MH  - HLA Antigens/genetics/*immunology
MH  - Haplotypes/*immunology
MH  - Histocompatibility Testing/methods/*standards/statistics & numerical data
MH  - Humans
MH  - *Models, Statistical
MH  - *Registries
MH  - Software/*standards
MH  - *Stem Cell Transplantation
MH  - Unrelated Donors/statistics & numerical data
OTO - NOTNLM
OT  - donor registry
OT  - expectation-maximization
OT  - haplotype frequency estimation
OT  - hematopoietic stem-cell transplant
OT  - human leukocyte antigen
OT  - typing ambiguity
EDAT- 2013/07/16 06:00
MHDA- 2014/01/28 06:00
CRDT- 2013/07/16 06:00
PHST- 2013/01/14 00:00 [received]
PHST- 2013/04/15 00:00 [revised]
PHST- 2013/05/31 00:00 [accepted]
PHST- 2013/07/16 06:00 [entrez]
PHST- 2013/07/16 06:00 [pubmed]
PHST- 2014/01/28 06:00 [medline]
AID - 10.1111/tan.12160 [doi]
PST - ppublish
SO  - Tissue Antigens. 2013 Aug;82(2):93-105. doi: 10.1111/tan.12160.