PMID- 27103098
OWN - NLM
STAT- MEDLINE
DCOM- 20171211
LR  - 20200306
IS  - 1477-4054 (Electronic)
IS  - 1467-5463 (Print)
IS  - 1467-5463 (Linking)
VI  - 18
IP  - 3
DP  - 2017 May 1
TI  - Heterogeneous molecular processes among the causes of how sequence similarity scores can fail to recapitulate phylogeny.
PG  - 451-457
LID - 10.1093/bib/bbw034 [doi]
AB  - Sequence similarity tools like Basic Local Alignment Search Tool (BLAST) are essential components of many functional genetic, genomic, phylogenetic and bioinformatic studies. Many modern analysis pipelines use significant sequence similarity scores (p- or E-values) and the ranked order of BLAST matches to test a wide range of hypotheses concerning homology, orthology, the timing of de novo gene birth/death and gene family expansion/contraction. Despite significant contrary findings, many of these tests still implicitly assume that stronger or higher-ranked E-value scores imply closer phylogenetic relationships between sequences. Here, we demonstrate that even though a general relationship does exist between the phylogenetic distance of two sequences and their E-value, significant and misleading errors occur in both the completeness and the order of results under realistic evolutionary scenarios. These results provide additional details to past evidence showing that studies should avoid drawing direct inferences of evolutionary relatedness from measures of sequence similarity alone, and should instead, where possible, use more rigorous phylogeny-based methods.
CI  - (c) The Author 2016. Published by Oxford University Press.
FAU - Smith, Stephen A
AU  - Smith SA
FAU - Pease, James B
AU  - Pease JB
LA  - eng
PT  - Journal Article
PL  - England
TA  - Brief Bioinform
JT  - Briefings in bioinformatics
JID - 100912837
SB  - IM
MH  - Computational Biology
MH  - *Phylogeny
MH  - Sequence Alignment
MH  - Software
PMC - PMC5429007
OTO - NOTNLM
OT  - BLAST
OT  - compositional bias
OT  - phylogenetics
OT  - phylostratigraphy
OT  - rate heterogeneity
OT  - sequence similarity
EDAT- 2016/04/23 06:00
MHDA- 2017/12/12 06:00
PMCR- 2016/04/21
CRDT- 2016/04/23 06:00
PHST- 2016/01/19 00:00 [received]
PHST- 2016/04/23 06:00 [pubmed]
PHST- 2017/12/12 06:00 [medline]
PHST- 2016/04/23 06:00 [entrez]
PHST- 2016/04/21 00:00 [pmc-release]
AID - bbw034 [pii]
AID - 10.1093/bib/bbw034 [doi]
PST - ppublish
SO  - Brief Bioinform. 2017 May 1;18(3):451-457. doi: 10.1093/bib/bbw034.