PMID- 26829802
OWN - NLM
STAT- MEDLINE
DCOM- 20180523
LR  - 20240327
IS  - 1557-9964 (Electronic)
IS  - 1545-5963 (Print)
IS  - 1545-5963 (Linking)
VI  - 14
IP  - 5
DP  - 2017 Sep-Oct
TI  - An Annotation Agnostic Algorithm for Detecting Nascent RNA Transcripts in GRO-Seq.
PG  - 1070-1081
LID - 10.1109/TCBB.2016.2520919 [doi]
AB  - We present a fast and simple algorithm to detect nascent RNA transcription in global nuclear run-on sequencing (GRO-seq). GRO-seq is a relatively new protocol that captures nascent transcripts from actively engaged polymerase, providing a direct read-out on bona fide transcription. Most traditional assays, such as RNA-seq, measure steady state RNA levels which are affected by transcription, post-transcriptional processing, and RNA stability. GRO-seq data, however, presents unique analysis challenges that are only beginning to be addressed. Here, we describe a new algorithm, Fast Read Stitcher (FStitch), that takes advantage of two popular machine-learning techniques, hidden Markov models and logistic regression, to classify which regions of the genome are transcribed. Given a small user-defined training set, our algorithm is accurate, robust to varying read depth, annotation agnostic, and fast. Analysis of GRO-seq data without a priori need for annotation uncovers surprising new insights into several aspects of the transcription process.
FAU - Azofeifa, Joseph G
AU  - Azofeifa JG
FAU - Allen, Mary A
AU  - Allen MA
FAU - Lladser, Manuel E
AU  - Lladser ME
FAU - Dowell, Robin D
AU  - Dowell RD
LA  - eng
GR  - S10 OD012300/OD/NIH HHS/United States
GR  - T15 LM009451/LM/NLM NIH HHS/United States
PT  - Journal Article
PT  - Research Support, N.I.H., Extramural
PT  - Research Support, Non-U.S. Gov't
PT  - Research Support, U.S. Gov't, Non-P.H.S.
DEP - 20160126
PL  - United States
TA  - IEEE/ACM Trans Comput Biol Bioinform
JT  - IEEE/ACM transactions on computational biology and bioinformatics
JID - 101196755
RN  - 63231-63-0 (RNA)
SB  - IM
MH  - *Algorithms
MH  - Computational Biology/*methods
MH  - Databases, Genetic
MH  - Humans
MH  - Markov Chains
MH  - Molecular Sequence Annotation/*methods
MH  - RNA/analysis/*genetics
MH  - Sequence Analysis, RNA/*methods
PMC - PMC5667649
MID - NIHMS912031
EDAT- 2016/02/02 06:00
MHDA- 2018/05/24 06:00
PMCR- 2018/09/01
CRDT- 2016/02/02 06:00
PHST- 2016/02/02 06:00 [pubmed]
PHST- 2018/05/24 06:00 [medline]
PHST- 2016/02/02 06:00 [entrez]
PHST- 2018/09/01 00:00 [pmc-release]
AID - 10.1109/TCBB.2016.2520919 [doi]
PST - ppublish
SO  - IEEE/ACM Trans Comput Biol Bioinform. 2017 Sep-Oct;14(5):1070-1081. doi: 10.1109/TCBB.2016.2520919. Epub 2016 Jan 26.