PMID- 35673548
OWN - NLM
STAT- PubMed-not-MEDLINE
LR  - 20230207
IS  - 1959-0318 (Print)
IS  - 1876-0988 (Electronic)
IS  - 1876-0988 (Linking)
VI  - 44
IP  - 1
DP  - 2023 Feb
TI  - Automatic Detection of Severely and Mildly Infected COVID-19 Patients with Supervised Machine Learning Models.
PG  - 100725
LID - 10.1016/j.irbm.2022.05.006 [doi]
AB  - OBJECTIVES: When the prognosis of COVID-19 can be detected early, the intense pressure on health services and the loss of workforce can be partially reduced. The primary purpose of this article is to determine the feature dataset, consisting of routine blood values (RBV) and demographic data, that affects the prognosis of COVID-19. The second is to identify severely and mildly infected COVID-19 patients at the time of admission by applying the feature dataset to supervised machine learning (ML) models. MATERIAL AND METHODS: The sample of this study consists of severely (n = 192) and mildly (n = 4010) infected patients hospitalized with a diagnosis of COVID-19 between March and September 2021. The RBV data measured at the time of admission and the age and gender characteristics of these patients were analyzed retrospectively. For feature selection, the minimum redundancy maximum relevance (MRMR) method, principal components analysis and forward multiple logistic regression analyses were used. The feature sets were statistically compared between mildly and severely infected patients. Then, the performances of various supervised ML models in identifying severely and mildly infected patients were compared using the feature set. RESULTS: In this study, 28 RBV parameters and the age variable were selected as the feature dataset. The effect of these features on the prognosis of the disease has been clinically proven. The ML models with the highest overall accuracy in identifying patient groups were, in order: locally weighted learning (LWL, 97.86%), K-star (K*, 96.31%), Naive Bayes (NB, 95.36%) and k-nearest neighbor (KNN, 94.05%). The models with the highest area under the receiver operating characteristic curve (AUC) values in identifying patient groups were, in order: LWL (0.95), K* (0.91), NB (0.85) and KNN (0.75). CONCLUSION: The findings in this article provide significant motivation for healthcare professionals to detect severely and mildly infected COVID-19 patients at admission.
CI  - (c) 2022 AGBM. Published by Elsevier Masson SAS. All rights reserved.
FAU - Huyut, M T
AU  - Huyut MT
AD  - Department of Biostatistics and Medical Informatics, Medical Faculty, Erzincan Binali Yildirim University, 24100, Erzincan, Turkey.
LA  - eng
PT  - Journal Article
DEP - 20220601
PL  - Netherlands
TA  - Ing Rech Biomed
JT  - Ingenierie et recherche biomedicale : IRBM = Biomedical engineering and research
JID - 101475162
PMC - PMC9158375
OTO - NOTNLM
OT  - Biochemical and hematological biomarkers
OT  - COVID-19
OT  - Classification
OT  - Feature selection methods
OT  - Routine blood values
OT  - Supervised machine learning models
COIS- The authors declare that they have no known competing financial or personal relationships that could be viewed as influencing the work reported in this paper.
EDAT- 2022/06/09 06:00
MHDA- 2022/06/09 06:01
PMCR- 2022/06/01
CRDT- 2022/06/08 01:59
PHST- 2021/10/22 00:00 [received]
PHST- 2022/04/24 00:00 [revised]
PHST- 2022/05/29 00:00 [accepted]
PHST- 2022/06/09 06:00 [pubmed]
PHST- 2022/06/09 06:01 [medline]
PHST- 2022/06/08 01:59 [entrez]
PHST- 2022/06/01 00:00 [pmc-release]
AID - S1959-0318(22)00059-8 [pii]
AID - 100725 [pii]
AID - 10.1016/j.irbm.2022.05.006 [doi]
PST - ppublish
SO  - Ing Rech Biomed. 2023 Feb;44(1):100725. doi: 10.1016/j.irbm.2022.05.006. Epub 2022 Jun 1.