PMID- 21672907
OWN - NLM
STAT- MEDLINE
DCOM- 20111102
LR  - 20211020
IS  - 1527-974X (Electronic)
IS  - 1067-5027 (Print)
IS  - 1067-5027 (Linking)
VI  - 18
IP  - 4
DP  - 2011 Jul-Aug
TI  - The application of naive Bayes model averaging to predict Alzheimer's disease from genome-wide data.
PG  - 370-5
LID - 10.1136/amiajnl-2011-000101 [doi]
AB  - OBJECTIVE: Predicting patient outcomes from genome-wide measurements holds significant promise for improving clinical care. The large number of measurements (eg, single nucleotide polymorphisms (SNPs)), however, makes this task computationally challenging. This paper evaluates the performance of an algorithm that predicts patient outcomes from genome-wide data by efficiently model averaging over an exponential number of naive Bayes (NB) models. DESIGN: This model-averaged naive Bayes (MANB) method was applied to predict late onset Alzheimer's disease in 1411 individuals who each had 312,318 SNP measurements available as genome-wide predictive features. Its performance was compared to that of a naive Bayes algorithm without feature selection (NB) and with feature selection (FSNB). MEASUREMENT: Performance of each algorithm was measured in terms of area under the ROC curve (AUC), calibration, and run time. RESULTS: The training time of MANB (16.1 s) was fast like NB (15.6 s), while FSNB (1684.2 s) was considerably slower. Each of the three algorithms required less than 0.1 s to predict the outcome of a test case. MANB had an AUC of 0.72, which is significantly better than the AUC of 0.59 by NB (p<0.00001), but not significantly different from the AUC of 0.71 by FSNB. MANB was better calibrated than NB, and FSNB was even better in calibration. A limitation was that only one dataset and two comparison algorithms were included in this study. CONCLUSION: MANB performed comparatively well in predicting a clinical outcome from a high-dimensional genome-wide dataset. These results provide support for including MANB in the methods used to predict outcomes from large, genome-wide datasets.
FAU - Wei, Wei
AU  - Wei W
AD  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA.
FAU - Visweswaran, Shyam
AU  - Visweswaran S
FAU - Cooper, Gregory F
AU  - Cooper GF
LA  - eng
GR  - HHSN276201000030C/LM/NLM NIH HHS/United States
GR  - R01 LM010020/LM/NLM NIH HHS/United States
GR  - R01-LM010020/LM/NLM NIH HHS/United States
PT  - Journal Article
PT  - Research Support, N.I.H., Extramural
PT  - Research Support, U.S. Gov't, Non-P.H.S.
PL  - England
TA  - J Am Med Inform Assoc
JT  - Journal of the American Medical Informatics Association : JAMIA
JID - 9430800
RN  - 0 (Apolipoproteins E)
SB  - IM
MH  - Aged
MH  - Aged, 80 and over
MH  - Algorithms
MH  - Alzheimer Disease/*diagnosis/*genetics
MH  - Apolipoproteins E/genetics
MH  - *Artificial Intelligence
MH  - Bayes Theorem
MH  - Case-Control Studies
MH  - *Genome-Wide Association Study
MH  - Humans
MH  - Models, Genetic
MH  - Polymorphism, Single Nucleotide
MH  - Prognosis
MH  - ROC Curve
PMC - PMC3128400
COIS- Competing interests: None.
EDAT- 2011/06/16 06:00
MHDA- 2011/11/04 06:00
PMCR- 2012/07/01
CRDT- 2011/06/16 06:00
PHST- 2011/06/16 06:00 [entrez]
PHST- 2011/06/16 06:00 [pubmed]
PHST- 2011/11/04 06:00 [medline]
PHST- 2012/07/01 00:00 [pmc-release]
AID - amiajnl-2011-000101 [pii]
AID - 10.1136/amiajnl-2011-000101 [doi]
PST - ppublish
SO  - J Am Med Inform Assoc. 2011 Jul-Aug;18(4):370-5. doi: 10.1136/amiajnl-2011-000101.