PMID- 35204032
OWN - NLM
STAT- PubMed-not-MEDLINE
LR  - 20220301
IS  - 2076-3425 (Print)
IS  - 2076-3425 (Electronic)
IS  - 2076-3425 (Linking)
VI  - 12
IP  - 2
DP  - 2022 Feb 15
TI  - Semantic Feature Extraction Using SBERT for Dementia Detection.
LID - 10.3390/brainsci12020270 [doi]
LID - 270
AB  - Dementia is a neurodegenerative disease that leads to the development of
      cognitive deficits, such as aphasia, apraxia, and agnosia. It is currently
      considered one of the most significant major medical problems worldwide,
      primarily affecting the elderly. This condition gradually impairs the patient's
      cognition, eventually leading to the inability to perform everyday tasks without
      assistance. Since dementia is an incurable disease, early detection plays an
      important role in delaying its progression. Because of this, tools and methods
      have been developed to help accurately diagnose patients in their early stages.
      State-of-the-art methods have shown that the use of syntactic-type linguistic
      features provides a sensitive and noninvasive tool for detecting dementia in its
      early stages. However, these methods lack relevant semantic information. In this
      work, we propose a novel methodology, based on the semantic features approach, by
      using sentence embeddings computed by Siamese BERT networks (SBERT), along with
      support vector machine (SVM), K-nearest neighbors (KNN), random forest, and an
      artificial neural network (ANN) as classifiers. Our methodology extracted 17
      features that provide demographic, lexical, syntactic, and semantic information
      from 550 oral production samples of elderly controls and people with Alzheimer's
      disease, provided by the DementiaBank Pitt Corpus database. To quantify the
      relevance of the extracted features for the dementia classification task, we
      calculated the mutual information score, which demonstrates a dependence between
      our features and the MMSE score. The experimental classification performance
      metrics, such as the accuracy, precision, recall, and F1 score (77, 80, 80, and
      80%, respectively), validate that our methodology performs better than
      syntax-based methods and the BERT approach when only the linguistic features are
      used.
FAU - Santander-Cruz, Yamanki
AU  - Santander-Cruz Y
AD  - Facultad de Ingenieria, Universidad Autonoma de Queretaro, Queretaro C.P. 76010,
      Mexico.
FAU - Salazar-Colores, Sebastian
AU  - Salazar-Colores S
AUID- ORCID: 0000-0002-6353-0864
AD  - Centro de Investigaciones en Optica, Leon C.P. 37150, Mexico.
FAU - Paredes-Garcia, Wilfrido Jacobo
AU  - Paredes-Garcia WJ
AUID- ORCID: 0000-0001-9740-0358
AD  - Facultad de Ingenieria, Universidad Autonoma de Queretaro, Queretaro C.P. 76010,
      Mexico.
FAU - Guendulain-Arenas, Humberto
AU  - Guendulain-Arenas H
AD  - Departamento de Geriatria, Instituto Mexicano del Seguro Social, San Juan del Rio
      C.P. 76800, Mexico.
FAU - Tovar-Arriaga, Saul
AU  - Tovar-Arriaga S
AUID- ORCID: 0000-0002-2695-1934
AD  - Facultad de Ingenieria, Universidad Autonoma de Queretaro, Queretaro C.P. 76010,
      Mexico.
LA  - eng
PT  - Journal Article
DEP - 20220215
PL  - Switzerland
TA  - Brain Sci
JT  - Brain sciences
JID - 101598646
PMC - PMC8870383
OTO - NOTNLM
OT  - NLP feature extraction
OT  - SBERT
OT  - dementia
OT  - semantic analysis
OT  - syntax analysis
COIS- The authors declare no conflict of interest.
EDAT- 2022/02/26 06:00
MHDA- 2022/02/26 06:01
PMCR- 2022/02/15
CRDT- 2022/02/25 01:03
PHST- 2021/12/04 00:00 [received]
PHST- 2022/01/28 00:00 [revised]
PHST- 2022/01/29 00:00 [accepted]
PHST- 2022/02/25 01:03 [entrez]
PHST- 2022/02/26 06:00 [pubmed]
PHST- 2022/02/26 06:01 [medline]
PHST- 2022/02/15 00:00 [pmc-release]
AID - brainsci12020270 [pii]
AID - brainsci-12-00270 [pii]
AID - 10.3390/brainsci12020270 [doi]
PST - epublish
SO  - Brain Sci. 2022 Feb 15;12(2):270. doi: 10.3390/brainsci12020270.