PMID- 37246225
OWN - NLM
STAT- MEDLINE
DCOM- 20230530
LR - 20230531
IS - 1479-5876 (Electronic)
IS - 1479-5876 (Linking)
VI - 21
IP - 1
DP - 2023 May 29
TI - Development and validation of explainable machine-learning models for carotid atherosclerosis early screening.
PG - 353
LID - 10.1186/s12967-023-04093-8 [doi]
LID - 353
AB - BACKGROUND: Carotid atherosclerosis (CAS), an important factor in the development of stroke, is a major public health concern. The aim of this study was to establish and validate machine learning (ML) models for early screening of CAS using routine health check-up indicators in northeast China. METHODS: A total of 69,601 health check-up records from the health examination center of the First Hospital of China Medical University (Shenyang, China) were collected between 2018 and 2019. For the 2019 records, 80% were assigned to the training set and 20% to the testing set. The 2018 records were used as the external validation dataset. Ten ML algorithms, including decision tree (DT), K-nearest neighbors (KNN), logistic regression (LR), naive Bayes (NB), random forest (RF), multilayer perceptron (MLP), extreme gradient boosting machine (XGB), gradient boosting decision tree (GBDT), linear support vector machine (SVM-linear), and non-linear support vector machine (SVM-nonlinear), were used to construct CAS screening models. The area under the receiver operating characteristic curve (auROC) and precision-recall curve (auPR) were used as measures of model performance. The SHapley Additive exPlanations (SHAP) method was used to demonstrate the interpretability of the optimal model. RESULTS: A total of 6315 records of patients undergoing carotid ultrasonography were collected; of these, 1632, 407, and 1141 patients were diagnosed with CAS in the training, internal validation, and external validation datasets, respectively. The GBDT model achieved the highest performance metrics with auROC of 0.860 (95% CI 0.839-0.880) in the internal validation dataset and 0.851 (95% CI 0.837-0.863) in the external validation dataset. Individuals with diabetes or those over 65 years of age showed low negative predictive value. In the interpretability analysis, age was the most important factor influencing the performance of the GBDT model, followed by sex and non-high-density lipoprotein cholesterol. CONCLUSIONS: The ML models developed could provide good performance for CAS identification using routine health check-up indicators and could hopefully be applied in scenarios without ethnic and geographic heterogeneity for CAS prevention.
CI - (c) 2023. The Author(s).
FAU - Yun, Ke
AU - Yun K
AUID- ORCID: 0000-0003-4030-4490
AD - National Clinical Research Center for Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China.
AD - Department of Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China.
FAU - He, Tao
AU - He T
AD - Neusoft Research Institute, Neusoft Corporation, Shenyang, Liaoning Province, China.
FAU - Zhen, Shi
AU - Zhen S
AD - Department of Software Engineering, Northeastern University, Shenyang, Liaoning Province, China.
FAU - Quan, Meihui
AU - Quan M
AD - National Clinical Research Center for Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China.
AD - Department of Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China.
FAU - Yang, Xiaotao
AU - Yang X
AD - National Clinical Research Center for Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China.
AD - Department of Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China.
FAU - Man, Dongliang
AU - Man D
AD - National Clinical Research Center for Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China.
AD - Department of Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China.
FAU - Zhang, Shuang
AU - Zhang S
AD - National Clinical Research Center for Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China.
AD - Department of Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China.
FAU - Wang, Wei
AU - Wang W
AD - Department of Physical Examination Center, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China. 6899wangwei@163.com.
FAU - Han, Xiaoxu
AU - Han X
AUID- ORCID: 0000-0003-1427-8428
AD - National Clinical Research Center for Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China. hanxiaoxu@cmu.edu.cn.
AD - Department of Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China. hanxiaoxu@cmu.edu.cn.
AD - Laboratory Medicine Innovation Unit, Chinese Academy of Medical Sciences, Shenyang, Liaoning Province, China. hanxiaoxu@cmu.edu.cn.
AD - NHC Key Laboratory of AIDS Immunology (China Medical University), The First Affiliated Hospital of China Medical University, Shenyang, Liaoning Province, China. hanxiaoxu@cmu.edu.cn.
LA - eng
PT - Journal Article
PT - Research Support, Non-U.S. Gov't
DEP - 20230529
PL - England
TA - J Transl Med
JT - Journal of translational medicine
JID - 101190741
SB - IM
MH - Humans
MH - Bayes Theorem
MH - *Carotid Artery Diseases/diagnostic imaging
MH - *Stroke
MH - Algorithms
MH - Machine Learning
PMC - PMC10225282
OTO - NOTNLM
OT - Carotid atherosclerosis
OT - Explainable model
OT - Machine learning
COIS- The authors declare that they have no competing interests.
EDAT- 2023/05/29 00:42
MHDA- 2023/05/30 06:42
PMCR- 2023/05/29
CRDT- 2023/05/28 23:14
PHST- 2022/11/25 00:00 [received]
PHST- 2023/03/28 00:00 [accepted]
PHST- 2023/05/30 06:42 [medline]
PHST- 2023/05/29 00:42 [pubmed]
PHST- 2023/05/28 23:14 [entrez]
PHST- 2023/05/29 00:00 [pmc-release]
AID - 10.1186/s12967-023-04093-8 [pii]
AID - 4093 [pii]
AID - 10.1186/s12967-023-04093-8 [doi]
PST - epublish
SO - J Transl Med. 2023 May 29;21(1):353. doi: 10.1186/s12967-023-04093-8.
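
The abstract in this record reports auROC, auPR, and negative predictive value, but does not say how they were computed. The following is a minimal sketch, not the authors' code, of how these three metrics are commonly defined, written in plain Python. The function names, the toy labels and scores, and the 0.5 decision threshold used for the NPV example are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def au_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive case receives
    the higher score. Tied pairs count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve: the mean of the
    precision values observed at the rank of each positive case.
    Assumes no tied scores (ties are taken in input order)."""
    n_pos = sum(1 for y in y_true if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos


def negative_predictive_value(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """TN / (TN + FN): the share of negative screening results that are
    truly negative."""
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    if tn + fn == 0:
        raise ValueError("no negative predictions")
    return tn / (tn + fn)


# Hypothetical toy data: 1 = CAS present, 0 = CAS absent.
labels = [0, 0, 1, 1]
model_scores = [0.1, 0.4, 0.35, 0.8]
# Assumed 0.5 threshold, for illustration only.
predictions = [1 if s >= 0.5 else 0 for s in model_scores]

print(au_roc(labels, model_scores))                    # 0.75
print(average_precision(labels, model_scores))         # 5/6, about 0.833
print(negative_predictive_value(labels, predictions))  # 2/3, about 0.667
```

A low NPV in a subgroup, as the abstract describes for individuals with diabetes or over 65 years of age, means that a comparatively large share of negative screening results in that subgroup are false negatives.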