PMID- 36765642
OWN - NLM
STAT- PubMed-not-MEDLINE
LR  - 20230213
IS  - 2072-6694 (Print)
IS  - 2072-6694 (Electronic)
IS  - 2072-6694 (Linking)
VI  - 15
IP  - 3
DP  - 2023 Jan 22
TI  - Breast Cancer Prediction Using Fine Needle Aspiration Features and Upsampling
      with Supervised Machine Learning.
LID - 10.3390/cancers15030681 [doi]
LID - 681
AB  - Breast cancer is one of the most common invasive cancers in women, and it
      continues to be a worldwide medical problem since the number of cases has
      significantly increased over the past decade. Breast cancer is the second
      leading cause of death from cancer in women. The early detection of breast
      cancer can save human lives, but the traditional approach to detecting breast
      cancer requires various laboratory tests involving medical experts. To reduce
      human error and speed up breast cancer detection, an automatic system is
      required that would perform the diagnosis accurately and in a timely manner.
      Despite research efforts on automated systems for cancer detection, a wide gap
      exists between the desired accuracy and that provided by current approaches.
      To overcome this issue, this research proposes an approach for breast cancer
      prediction by selecting the best fine needle aspiration features. To enhance
      the prediction accuracy, several feature selection techniques are applied to
      analyze their efficacy, such as principal component analysis (PCA), singular
      value decomposition (SVD), and chi-square (Chi2). Extensive experiments are
      performed with different features and different feature-set sizes to
      investigate the optimal feature set. Additionally, the influence of imbalanced
      data and of data balanced using the SMOTE approach is investigated. Six
      classifiers, namely random forest, support vector machine, gradient boosting
      machine, logistic regression, multilayer perceptron, and K-nearest neighbors
      (KNN), are tuned to achieve increased classification accuracy.
      Results indicate that KNN outperforms all other classifiers on the dataset
      used, both with 20 features using SVD and with the 15 most important features
      using PCA, with a 100% accuracy score.
FAU - Shafique, Rahman
AU  - Shafique R
AUID- ORCID: 0000-0001-7641-2835
AD  - Department of Information and Communication Engineering, Yeungnam University,
      Gyeongsan 38541, Republic of Korea.
FAU - Rustam, Furqan
AU  - Rustam F
AUID- ORCID: 0000-0001-8403-1047
AD  - School of Computer Science, University College Dublin, D04 V1W8 Dublin,
      Ireland.
FAU - Choi, Gyu Sang
AU  - Choi GS
AUID- ORCID: 0000-0002-0854-768X
AD  - Department of Information and Communication Engineering, Yeungnam University,
      Gyeongsan 38541, Republic of Korea.
FAU - Diez, Isabel de la Torre
AU  - Diez IT
AUID- ORCID: 0000-0003-3134-7720
AD  - Department of Signal Theory and Communications and Telematic Engineering,
      University of Valladolid, Paseo de Belen 15, 47011 Valladolid, Spain.
FAU - Mahmood, Arif
AU  - Mahmood A
AD  - Department of Computer Science & Information Technology, The Islamia
      University of Bahawalpur, Bahawalpur 63100, Punjab, Pakistan.
FAU - Lipari, Vivian
AU  - Lipari V
AD  - Research Group on Foods, Nutritional Biochemistry and Health, Universidad
      Europea del Atlantico, Isabel Torres 21, 39011 Santander, Spain.
AD  - Department of Project Management, Universidad Internacional Iberoamericana,
      Campeche 24560, Mexico.
AD  - Fundacion Universitaria Internacional de Colombia Bogota, Bogota 11001,
      Colombia.
FAU - Velasco, Carmen Lili Rodriguez
AU  - Velasco CLR
AD  - Research Group on Foods, Nutritional Biochemistry and Health, Universidad
      Europea del Atlantico, Isabel Torres 21, 39011 Santander, Spain.
AD  - Department of Project Management, Universidad Internacional Iberoamericana
      Arecibo, Arecibo, PR 00613, USA.
AD  - Project Management, Universidade Internacional do Cuanza, Cuito EN250, Bie,
      Angola.
FAU - Ashraf, Imran
AU  - Ashraf I
AUID- ORCID: 0000-0002-8271-6496
AD  - Department of Information and Communication Engineering, Yeungnam University,
      Gyeongsan 38541, Republic of Korea.
LA  - eng
GR  - N/A/European University of the Atlantic/
PT  - Journal Article
DEP - 20230122
PL  - Switzerland
TA  - Cancers (Basel)
JT  - Cancers
JID - 101526829
PMC - PMC9913345
OTO - NOTNLM
OT  - breast cancer prediction
OT  - deep learning
OT  - feature selection
OT  - fine-needle aspiration features
OT  - principal component analysis
OT  - singular value decomposition
COIS- The authors declare no conflict of interest.
EDAT- 2023/02/12 06:00
MHDA- 2023/02/12 06:01
PMCR- 2023/01/22
CRDT- 2023/02/11 01:03
PHST- 2022/12/20 00:00 [received]
PHST- 2023/01/13 00:00 [revised]
PHST- 2023/01/17 00:00 [accepted]
PHST- 2023/02/11 01:03 [entrez]
PHST- 2023/02/12 06:00 [pubmed]
PHST- 2023/02/12 06:01 [medline]
PHST- 2023/01/22 00:00 [pmc-release]
AID - cancers15030681 [pii]
AID - cancers-15-00681 [pii]
AID - 10.3390/cancers15030681 [doi]
PST - epublish
SO  - Cancers (Basel). 2023 Jan 22;15(3):681. doi: 10.3390/cancers15030681.