PMID- 16241270
OWN - NLM
STAT- MEDLINE
DCOM- 20060104
LR  - 20051024
IS  - 1539-3755 (Print)
IS  - 1539-3755 (Linking)
VI  - 67
IP  - 6 Pt 1
DP  - 2003 Jun
TI  - Classification of short human exons and introns based on statistical features.
PG  - 061916
AB  - The classification of human gene sequences into exons and introns is a difficult problem in DNA sequence analysis. In this paper, we define a set of features, called the simple Z (SZ) features, which is derived from the Z-curve features for the recognition of human exons and introns. The classification results show that SZ features, while fewer in numbers (three in total), can preserve the high recognition rate of the original nine Z-curve features. Since the size of SZ features is one-third of the Z-curve features, the dimensionality of the feature space is much smaller, and better recognition efficiency is achieved. If the stop codon feature is used together with the three SZ features, a recognition rate of up to 92% for short sequences of length <140 bp can be obtained.
FAU - Wu, Yonghui
AU  - Wu Y
AD  - Department of Computer Engineering and Information Technology, City University of Hong Kong, Kowloon, Hong Kong. itwyh@cityu.edu.hk
FAU - Liew, Alan Wee-Chung
AU  - Liew AW
FAU - Yan, Hong
AU  - Yan H
FAU - Yang, Mengsu
AU  - Yang M
LA  - eng
PT  - Journal Article
DEP - 20030627
PL  - United States
TA  - Phys Rev E Stat Nonlin Soft Matter Phys
JT  - Physical review. E, Statistical, nonlinear, and soft matter physics
JID - 101136452
RN  - 0 (Codon)
RN  - 0 (Codon, Terminator)
SB  - IM
MH  - Algorithms
MH  - Biophysics/methods
MH  - Codon
MH  - Codon, Terminator
MH  - Computational Biology/*methods
MH  - *Exons
MH  - Humans
MH  - *Introns
MH  - *Models, Genetic
MH  - Models, Statistical
MH  - Sequence Analysis, DNA/*methods
EDAT- 2005/10/26 09:00
MHDA- 2006/01/05 09:00
CRDT- 2005/10/26 09:00
PHST- 2002/10/07 00:00 [received]
PHST- 2005/10/26 09:00 [pubmed]
PHST- 2006/01/05 09:00 [medline]
PHST- 2005/10/26 09:00 [entrez]
AID - 10.1103/PhysRevE.67.061916 [doi]
PST - ppublish
SO  - Phys Rev E Stat Nonlin Soft Matter Phys. 2003 Jun;67(6 Pt 1):061916. doi: 10.1103/PhysRevE.67.061916. Epub 2003 Jun 27.