PMID- 15014169
OWN - NLM
STAT- MEDLINE
DCOM- 20041210
LR  - 20240109
IS  - 0737-4038 (Print)
IS  - 0737-4038 (Linking)
VI  - 21
IP  - 5
DP  - 2004 May
TI  - False-positive selection identified by ML-based methods: examples from the Sig1 gene of the diatom Thalassiosira weissflogii and the tax gene of a human T-cell lymphotropic virus.
PG  - 914-21
AB  - Sexually induced gene 1 (Sig1) in the centric diatom Thalassiosira weissflogii is considered to encode a gamete recognition protein. Sorhannus (2003) analyzed nucleotide sequences of Sig1 using parsimony analysis and the maximum-likelihood (ML)-based Bayesian method for inferring positive selection at single amino acid sites and reported that positively selected sites were detected by the latter method but not by the former. He then concluded that for this type of study, the ML-based method is more reliable than parsimony analysis. Here we show that his results apparently represent false-positive cases of the ML-based method and that there is no solid evidence that this gene contains positively selected sites. We further demonstrate that in the tax gene of human T-cell lymphotropic virus type I (HTLV-I), all codon sites, including invariable sites, can be inferred as positively selected sites by the ML-based method. These observations indicate that the ML-based method may produce many false-positive sites. One of the main reasons for the occurrence of false positives is that in the ML-based method, codon sites are grouped into several categories, with different nonsynonymous/synonymous rate ratios (omegas), on a purely statistical basis, and positive selection is inferred indirectly by examining whether the average omega for each category is greater than 1. In parsimony analysis, however, the evolutionary change of nucleotides at each codon site is examined. For this reason, parsimony-based methods rarely produce false positives and are safer than ML-based methods for detecting positive selection at individual codon sites, although a large number of sequences are necessary.
FAU - Suzuki, Yoshiyuki
AU  - Suzuki Y
AD  - Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Japan. yossuzuki@lab.nig.ac.jp
FAU - Nei, Masatoshi
AU  - Nei M
LA  - eng
GR  - GM 29332/GM/NIGMS NIH HHS/United States
PT  - Journal Article
PT  - Research Support, U.S. Gov't, P.H.S.
DEP - 20040310
PL  - United States
TA  - Mol Biol Evol
JT  - Molecular biology and evolution
JID - 8501455
RN  - 0 (Codon)
RN  - 0 (Proteins)
RN  - 0 (Protozoan Proteins)
RN  - 0 (Sexually induced protein 1, Thalassiosira weissflogii)
SB  - IM
MH  - Bayes Theorem
MH  - Binding Sites
MH  - Codon
MH  - Databases as Topic
MH  - Diatoms/*genetics
MH  - *Evolution, Molecular
MH  - False Positive Reactions
MH  - Genes, pX/*genetics
MH  - Genetic Techniques
MH  - *Likelihood Functions
MH  - Models, Genetic
MH  - Phylogeny
MH  - Proteins/*genetics
MH  - Protozoan Proteins
MH  - Sequence Analysis, DNA
EDAT- 2004/03/12 05:00
MHDA- 2004/12/16 09:00
CRDT- 2004/03/12 05:00
PHST- 2004/03/12 05:00 [pubmed]
PHST- 2004/12/16 09:00 [medline]
PHST- 2004/03/12 05:00 [entrez]
AID - msh098 [pii]
AID - 10.1093/molbev/msh098 [doi]
PST - ppublish
SO  - Mol Biol Evol. 2004 May;21(5):914-21. doi: 10.1093/molbev/msh098. Epub 2004 Mar 10.