PMID- 21595983
OWN - NLM
STAT- MEDLINE
DCOM- 20110928
LR  - 20211020
IS  - 1745-6150 (Electronic)
IS  - 1745-6150 (Linking)
VI  - 6
DP  - 2011 May 20
TI  - ConReg-R: Extrapolative recalibration of the empirical distribution of p-values to improve false discovery rate estimates.
PG  - 27
LID - 10.1186/1745-6150-6-27 [doi]
AB  - BACKGROUND: False discovery rate (FDR) control is commonly accepted as the most appropriate error control in multiple hypothesis testing problems. The accuracy of FDR estimation depends on the accuracy of the estimation of p-values from each test and validity of the underlying assumptions of the distribution. However, in many practical testing problems such as in genomics, the p-values could be under-estimated or over-estimated for many known or unknown reasons. Consequently, FDR estimation would then be influenced and lose its veracity. RESULTS: We propose a new extrapolative method called Constrained Regression Recalibration (ConReg-R) to recalibrate the empirical p-values by modeling their distribution to improve the FDR estimates. Our ConReg-R method is based on the observation that accurately estimated p-values from true null hypotheses follow uniform distribution and the observed distribution of p-values is indeed a mixture of distributions of p-values from true null hypotheses and true alternative hypotheses. Hence, ConReg-R recalibrates the observed p-values so that they exhibit the properties of an ideal empirical p-value distribution. The proportion of true null hypotheses (pi0) and FDR are estimated after the recalibration. CONCLUSIONS: ConReg-R provides an efficient way to improve the FDR estimates. It only requires the p-values from the tests and avoids permutation of the original test data. We demonstrate that the proposed method significantly improves FDR estimation on several gene expression datasets obtained from microarray and RNA-seq experiments.
FAU - Li, Juntao
AU  - Li J
AD  - Computational & Mathematical Biology, Genome Institute of Singapore, 60 Biopolis Street, Singapore 138672, Singapore.
FAU - Paramita, Puteri
AU  - Paramita P
FAU - Choi, Kwok Pui
AU  - Choi KP
FAU - Karuturi, R Krishna Murthy
AU  - Karuturi RK
LA  - eng
PT  - Evaluation Study
PT  - Journal Article
PT  - Meta-Analysis
PT  - Research Support, Non-U.S. Gov't
DEP - 20110520
PL  - England
TA  - Biol Direct
JT  - Biology direct
JID - 101258412
SB  - IM
MH  - Algorithms
MH  - *Data Interpretation, Statistical
MH  - Gene Expression Profiling
MH  - Gene Expression Regulation, Fungal
MH  - Humans
MH  - Models, Statistical
MH  - *Regression Analysis
MH  - Reproducibility of Results
MH  - Saccharomyces cerevisiae/genetics/physiology
MH  - Sample Size
MH  - Sequence Analysis, RNA
PMC - PMC3130718
EDAT- 2011/05/21 06:00
MHDA- 2011/09/29 06:00
PMCR- 2011/05/20
CRDT- 2011/05/21 06:00
PHST- 2010/12/03 00:00 [received]
PHST- 2011/05/20 00:00 [accepted]
PHST- 2011/05/21 06:00 [entrez]
PHST- 2011/05/21 06:00 [pubmed]
PHST- 2011/09/29 06:00 [medline]
PHST- 2011/05/20 00:00 [pmc-release]
AID - 1745-6150-6-27 [pii]
AID - 10.1186/1745-6150-6-27 [doi]
PST - epublish
SO  - Biol Direct. 2011 May 20;6:27. doi: 10.1186/1745-6150-6-27.
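
The abstract rests on one observation: well-calibrated p-values from true null hypotheses are uniform on [0, 1], so the observed p-values form a mixture of a uniform component and an alternative component concentrated near zero. The sketch below illustrates only that observation, using a generic Storey-type estimate of pi0 and a plug-in FDR estimate. It is not the ConReg-R recalibration itself, whose constrained-regression details are not given in this record. The function names, the tuning value `lam=0.5`, and the simulated alternative distribution are all assumptions made for illustration.

```python
import random


def estimate_pi0(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of true nulls (pi0).

    Assumes null p-values are uniform, so p-values above `lam` are
    mostly nulls and their count is about pi0 * m * (1 - lam).
    """
    m = len(pvalues)
    tail = sum(1 for p in pvalues if p > lam)
    return min(1.0, tail / (m * (1.0 - lam)))


def estimate_fdr(pvalues, threshold, pi0):
    """Plug-in FDR estimate when rejecting all p-values <= threshold.

    Expected false rejections are about pi0 * m * threshold under
    uniform nulls; divide by the observed number of rejections.
    """
    m = len(pvalues)
    rejected = sum(1 for p in pvalues if p <= threshold)
    if rejected == 0:
        return 0.0
    return min(1.0, pi0 * m * threshold / rejected)


def simulate_pvalues(m=10000, pi0=0.8, seed=0):
    """Hypothetical mixture: uniform nulls plus alternatives near zero."""
    rng = random.Random(seed)
    n_null = int(round(m * pi0))
    nulls = [rng.random() for _ in range(n_null)]
    # Raising a uniform draw to the 10th power pushes it toward zero;
    # this is an arbitrary stand-in for an alternative distribution.
    alts = [rng.random() ** 10 for _ in range(m - n_null)]
    return nulls + alts


if __name__ == "__main__":
    pvals = simulate_pvalues()
    pi0_hat = estimate_pi0(pvals)
    print("estimated pi0:", round(pi0_hat, 3))
    print("estimated FDR at p <= 0.01:", round(estimate_fdr(pvals, 0.01, pi0_hat), 3))
```

If the input p-values are systematically under- or over-estimated, the uniform-null assumption behind both functions fails and the estimates drift, which is the situation the abstract says recalibration is meant to address.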