PMID- 26820188
OWN - NLM
STAT- MEDLINE
DCOM- 20170104
LR  - 20240325
IS  - 1532-0480 (Electronic)
IS  - 1532-0464 (Print)
IS  - 1532-0464 (Linking)
VI  - 60
DP  - 2016 Apr
TI  - Multivariate analysis of the population representativeness of related clinical studies.
PG  - 66-76
LID - S1532-0464(16)00008-3 [pii]
LID - 10.1016/j.jbi.2016.01.007 [doi]
AB  - OBJECTIVE: To develop a multivariate method for quantifying the population representativeness across related clinical studies and a computational method for identifying and characterizing underrepresented subgroups in clinical studies.
      METHODS: We extended a published metric named Generalizability Index for Study Traits (GIST) to include multiple study traits for quantifying the population representativeness of a set of related studies by assuming the independence and equal importance among all study traits. On this basis, we compared the effectiveness of GIST and multivariate GIST (mGIST) qualitatively. We further developed an algorithm called "Multivariate Underrepresented Subgroup Identification" (MAGIC) for constructing optimal combinations of distinct value intervals of multiple traits to define underrepresented subgroups in a set of related studies. Using Type 2 diabetes mellitus (T2DM) as an example, we identified and extracted frequently used quantitative eligibility criteria variables in a set of clinical studies. We profiled the T2DM target population using the National Health and Nutrition Examination Survey (NHANES) data.
      RESULTS: According to the mGIST scores for four example variables, i.e., age, HbA1c, BMI, and gender, the included observational T2DM studies had superior population representativeness than the interventional T2DM studies. For the interventional T2DM studies, Phase I trials had better population representativeness than Phase III trials. People at least 65 years old with HbA1c value between 5.7% and 7.2% were particularly underrepresented in the included T2DM trials.
      These results confirmed well-known knowledge and demonstrated the effectiveness of our methods in population representativeness assessment.
      CONCLUSIONS: mGIST is effective at quantifying population representativeness of related clinical studies using multiple numeric study traits. MAGIC identifies underrepresented subgroups in clinical studies. Both data-driven methods can be used to improve the transparency of design bias in participation selection at the research community level.
CI  - Copyright (c) 2016 Elsevier Inc. All rights reserved.
FAU - He, Zhe
AU  - He Z
AD  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA. Electronic address: zh2132@columbia.edu.
FAU - Ryan, Patrick
AU  - Ryan P
AD  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA; Janssen Research and Development, Titusville, NJ 08560, USA; Observational Health Data Sciences and Informatics, New York, NY 10032, USA.
FAU - Hoxha, Julia
AU  - Hoxha J
AD  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA.
FAU - Wang, Shuang
AU  - Wang S
AD  - Department of Biostatistics, Columbia University, New York, NY 10032, USA.
FAU - Carini, Simona
AU  - Carini S
AD  - Department of Medicine, University of California, San Francisco, CA 94143, USA.
FAU - Sim, Ida
AU  - Sim I
AD  - Department of Medicine, University of California, San Francisco, CA 94143, USA.
FAU - Weng, Chunhua
AU  - Weng C
AD  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA; Observational Health Data Sciences and Informatics, New York, NY 10032, USA.
LA  - eng
GR  - R01 LM009886/LM/NLM NIH HHS/United States
GR  - UL1 TR000040/TR/NCATS NIH HHS/United States
GR  - R01LM009886/LM/NLM NIH HHS/United States
GR  - UL1TR000040/TR/NCATS NIH HHS/United States
PT  - Journal Article
PT  - Research Support, N.I.H., Extramural
DEP - 20160125
PL  - United States
TA  - J Biomed Inform
JT  - Journal of biomedical informatics
JID - 100970413
SB  - IM
MH  - *Algorithms
MH  - Biomedical Research/*standards
MH  - Clinical Trials as Topic
MH  - Databases, Factual
MH  - Demography/*methods
MH  - Diabetes Mellitus, Type 2
MH  - Humans
MH  - Medical Informatics Computing
MH  - Multivariate Analysis
MH  - Nutrition Surveys
MH  - Observational Studies as Topic
MH  - Patient Selection
MH  - *Selection Bias
PMC - PMC4837055
MID - NIHMS754593
OTO - NOTNLM
OT  - Clinical trial
OT  - Knowledge representation
OT  - Selection bias
COIS- COMPETING INTERESTS None.
EDAT- 2016/01/29 06:00
MHDA- 2017/01/05 06:00
PMCR- 2017/04/01
CRDT- 2016/01/29 06:00
PHST- 2015/07/11 00:00 [received]
PHST- 2016/01/15 00:00 [revised]
PHST- 2016/01/19 00:00 [accepted]
PHST- 2016/01/29 06:00 [entrez]
PHST- 2016/01/29 06:00 [pubmed]
PHST- 2017/01/05 06:00 [medline]
PHST- 2017/04/01 00:00 [pmc-release]
AID - S1532-0464(16)00008-3 [pii]
AID - 10.1016/j.jbi.2016.01.007 [doi]
PST - ppublish
SO  - J Biomed Inform. 2016 Apr;60:66-76. doi: 10.1016/j.jbi.2016.01.007. Epub 2016 Jan 25.