PMID- 31074373
OWN - NLM
STAT- MEDLINE
DCOM- 20190705
LR  - 20231011
IS  - 1471-2105 (Electronic)
IS  - 1471-2105 (Linking)
VI  - 20
IP  - Suppl 7
DP  - 2019 May 1
TI  - MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription
      factor binding sites.
PG  - 200
LID - 10.1186/s12859-019-2735-3 [doi]
LID - 200
AB  - BACKGROUND: Transcription factors (TFs) play important roles in the regulation
      of gene expression. They can activate or block transcription of downstream
      genes by binding to specific genomic sequences. Therefore, motif discovery of
      these binding preference patterns is of central significance in understanding
      molecular regulatory mechanisms. Many algorithms have been proposed for the
      identification of transcription factor binding sites. However, it remains a
      challenging problem. RESULTS: Here, we propose a novel motif discovery
      algorithm based on support vector machine (MD-SVM) to learn a discriminative
      model for TF binding sites. MD-SVM first obtains a position weight matrix
      (PWM) from a set of training datasets. Then it translates the motif discovery
      (MD) problem into a computational framework of multiple instance learning
      (MIL). It was applied to several real biological datasets. Results show that
      our algorithm outperforms MI-SVM in terms of both accuracy and specificity.
      CONCLUSIONS: In this paper, we modeled the TF motif discovery problem as a MIL
      optimization problem. The SVM algorithm was adapted to discriminate positive
      and negative bags of instances. Compared to other SVM-based algorithms, MD-SVM
      shows superiority over its competitors in terms of ROC AUC. Hopefully, it
      could benefit the research community in understanding the molecular functions
      of DNA functional elements and transcription factors.
FAU - Hu, Jialu
AU  - Hu J
AD  - School of Computer Science, Northwestern Polytechnical University, West Youyi
      Road 127, Xi'an, 710072, China. jhu@nwpu.edu.cn.
AD  - Centre of Multidisciplinary Convergence Computing, School of Computer Science,
      Northwestern Polytechnical University, 1 Dong Xiang Road, Xi'an, 710129,
      China. jhu@nwpu.edu.cn.
FAU - Wang, Jingru
AU  - Wang J
AD  - School of Computer Science, Northwestern Polytechnical University, West Youyi
      Road 127, Xi'an, 710072, China.
FAU - Lin, Jianan
AU  - Lin J
AD  - School of Computer Science, Northwestern Polytechnical University, West Youyi
      Road 127, Xi'an, 710072, China.
FAU - Liu, Tianwei
AU  - Liu T
AD  - School of Computer Science, Northwestern Polytechnical University, West Youyi
      Road 127, Xi'an, 710072, China.
FAU - Zhong, Yuanke
AU  - Zhong Y
AD  - School of Computer Science, Northwestern Polytechnical University, West Youyi
      Road 127, Xi'an, 710072, China.
FAU - Liu, Jie
AU  - Liu J
AD  - School of Computer Science, Northwestern Polytechnical University, West Youyi
      Road 127, Xi'an, 710072, China.
FAU - Zheng, Yan
AU  - Zheng Y
AD  - School of Computer Science, Northwestern Polytechnical University, West Youyi
      Road 127, Xi'an, 710072, China.
FAU - Gao, Yiqun
AU  - Gao Y
AD  - School of Computer Science, Northwestern Polytechnical University, West Youyi
      Road 127, Xi'an, 710072, China.
FAU - He, Junhao
AU  - He J
AD  - School of Computer Science, Northwestern Polytechnical University, West Youyi
      Road 127, Xi'an, 710072, China.
FAU - Shang, Xuequn
AU  - Shang X
AD  - School of Computer Science, Northwestern Polytechnical University, West Youyi
      Road 127, Xi'an, 710072, China.
LA  - eng
PT  - Journal Article
DEP - 20190501
PL  - England
TA  - BMC Bioinformatics
JT  - BMC bioinformatics
JID - 100965194
RN  - 0 (Transcription Factors)
SB  - IM
MH  - *Algorithms
MH  - Binding Sites
MH  - Humans
MH  - *Nucleotide Motifs
MH  - Protein Binding
MH  - *Support Vector Machine
MH  - Transcription Factors/*metabolism
PMC - PMC6509868
OTO - NOTNLM
OT  - Binding site preference
OT  - Multiple instance learning
OT  - Support vector machine
OT  - Transcription factor
COIS- ETHICS APPROVAL AND CONSENT TO PARTICIPATE: Not applicable. CONSENT FOR
      PUBLICATION: Not applicable. COMPETING INTERESTS: The authors declare that
      they have no competing interests. PUBLISHER'S NOTE: Springer Nature remains
      neutral with regard to jurisdictional claims in published maps and
      institutional affiliations.
EDAT- 2019/05/11 06:00
MHDA- 2019/07/06 06:00
PMCR- 2019/05/01
CRDT- 2019/05/11 06:00
PHST- 2019/05/11 06:00 [entrez]
PHST- 2019/05/11 06:00 [pubmed]
PHST- 2019/07/06 06:00 [medline]
PHST- 2019/05/01 00:00 [pmc-release]
AID - 10.1186/s12859-019-2735-3 [pii]
AID - 2735 [pii]
AID - 10.1186/s12859-019-2735-3 [doi]
PST - epublish
SO  - BMC Bioinformatics. 2019 May 1;20(Suppl 7):200. doi:
      10.1186/s12859-019-2735-3.