PMID- 12767132
OWN - NLM
STAT- MEDLINE
DCOM- 20031009
LR  - 20030527
IS  - 0095-2338 (Print)
IS  - 0095-2338 (Linking)
VI  - 43
IP  - 3
DP  - 2003 May-Jun
TI  - Text Influenced Molecular Indexing (TIMI): a literature database mining approach
      that handles text and chemistry.
PG  - 743-52
AB  - We present an application of a novel methodology called Text Influenced Molecular
      Indexing (TIMI) to mine the information in the scientific literature. TIMI is an
      extension of two existing methodologies: (1) Latent Semantic Structure Indexing
      (LaSSI), a method for calculating chemical similarity using two-dimensional
      topological descriptors, and (2) Latent Semantic Indexing (LSI), a method for
      generating correlations between textual terms. The singular value decomposition
      (SVD) of a feature/object matrix is the fundamental mathematical operation
      underlying LSI, LaSSI, and TIMI and is used in the identification of associations
      between textual and chemical descriptors. We present the results of our studies
      with a database containing 11,571 PubMed/MEDLINE abstracts which show the
      advantages of merging textual and chemical descriptors over using either text or
      chemistry alone. Our work demonstrates that searching text-only databases limits
      retrieved documents to those that explicitly mention compounds by name in the
      text. Similarly, searching chemistry-only databases can only retrieve those
      documents that have chemical structures in them. TIMI, however, enables search and
      retrieval of documents with textual, chemical, and/or text- and chemistry-based
      queries. Thus, the TIMI system offers a powerful new approach to uncovering the
      contextual scientific knowledge sought by the medical research community.
FAU - Singh, Suresh B
AU  - Singh SB
AD  - Department of Molecular Systems, Merck Research Laboratories, 126 East Lincoln
      Avenue, RY50SW-100, Rahway, New Jersey 07065-0900, USA. singhsu@yahoo.com
FAU - Hull, Richard D
AU  - Hull RD
FAU - Fluder, Eugene M
AU  - Fluder EM
LA  - eng
PT  - Journal Article
PL  - United States
TA  - J Chem Inf Comput Sci
JT  - Journal of chemical information and computer sciences
JID - 7505012
RN  - 0 (Organic Chemicals)
RN  - 0 (Pharmaceutical Preparations)
SB  - IM
MH  - Algorithms
MH  - Chemistry, Pharmaceutical/methods
MH  - *Databases, Factual
MH  - MEDLINE
MH  - *Organic Chemicals
MH  - *Pharmaceutical Preparations
MH  - *Subject Headings
EDAT- 2003/05/28 05:00
MHDA- 2003/10/10 05:00
CRDT- 2003/05/28 05:00
PHST- 2003/05/28 05:00 [pubmed]
PHST- 2003/10/10 05:00 [medline]
PHST- 2003/05/28 05:00 [entrez]
AID - 10.1021/ci025587a [doi]
PST - ppublish
SO  - J Chem Inf Comput Sci. 2003 May-Jun;43(3):743-52. doi: 10.1021/ci025587a.
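
The abstract names the SVD of a feature/object matrix as the operation shared by LSI, LaSSI, and TIMI. The sketch below illustrates that general idea only, using standard LSI query folding with NumPy; it is not the authors' implementation. The feature names, the toy matrix, the choice of k = 2, and the helper functions (`build_latent_space`, `fold_in`, `rank_documents`) are all assumptions made up for illustration.

```python
import numpy as np

# Hypothetical toy data (not from the paper): rows are features, columns are
# documents. The first three rows stand in for text terms, the last two for
# chemical descriptors.
features = ["kinase", "inhibitor", "statin", "chem_desc_A", "chem_desc_B"]
X = np.array([
    [2.0, 1.0, 0.0, 0.0],  # kinase
    [1.0, 1.0, 0.0, 0.0],  # inhibitor
    [0.0, 0.0, 2.0, 1.0],  # statin
    [1.0, 0.0, 0.0, 0.0],  # chem_desc_A (present only in document 0)
    [0.0, 0.0, 1.0, 1.0],  # chem_desc_B
])


def build_latent_space(matrix, k):
    """Truncated SVD: keep the k largest singular triplets."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    # Rows of docs_k are the documents' coordinates in the latent space.
    return U[:, :k], s[:k], Vt[:k, :].T


def fold_in(query_vec, U_k, s_k):
    """Project a feature-space query into the latent space: q^T U_k S_k^{-1}."""
    return (query_vec @ U_k) / s_k


def rank_documents(query_vec, U_k, s_k, docs_k):
    """Rank documents by cosine similarity to the folded-in query."""
    q = fold_in(query_vec, U_k, s_k)
    norms = np.linalg.norm(docs_k, axis=1) * np.linalg.norm(q) + 1e-12
    sims = (docs_k @ q) / norms
    return np.argsort(-sims), sims


U_k, s_k, docs_k = build_latent_space(X, k=2)

# A chemistry-only query: just the hypothetical descriptor chem_desc_A.
query = np.zeros(len(features))
query[features.index("chem_desc_A")] = 1.0

order, sims = rank_documents(query, U_k, s_k, docs_k)
for doc in order:
    print(f"document {doc}: similarity {sims[doc]:.3f}")
```

In this toy matrix, document 1 does not contain `chem_desc_A`, yet it scores close to document 0 because it shares text terms with the one document that does. That is the kind of text/chemistry association the abstract attributes to merging both descriptor types in one matrix.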