PMID- 37346563
OWN - NLM
STAT- PubMed-not-MEDLINE
LR  - 20230626
IS  - 2376-5992 (Electronic)
IS  - 2376-5992 (Linking)
VI  - 9
DP  - 2023
TI  - A semi supervised approach to Arabic aspect category detection using Bert and teacher-student model.
PG  - e1425
LID - 10.7717/peerj-cs.1425 [doi]
LID - e1425
AB  - Aspect-based sentiment analysis tasks are well researched in English. However, we find such research lacking for the Arabic language, especially with reference to aspect category detection. Most of the existing research focuses on supervised machine learning methods that require large labeled datasets. Therefore, the aim of this research is to implement a semi-supervised self-training approach that uses a noisy student framework to enhance the capability of a deep learning model, AraBERT v02. The objective is to perform aspect category detection on both the SemEval 2016 hotel review dataset and the Hotel Arabic-Reviews Dataset (HARD) 2016. The four-step framework first develops a teacher model that is trained on the aspect categories of the labeled SemEval 2016 dataset. Second, it generates pseudo labels for the unlabeled HARD dataset using the teacher model. Third, it trains a noisy student model on the combined datasets (approximately 1 million sentences) to minimize the combined cross entropy loss. Fourth, it ensembles the teacher and student models to enhance the performance of AraBERT. Findings indicate that, in predicting the aspect categories in the combined datasets, the ensembled teacher-student model improves micro F1 by 0.3% over the initial noisy student implementation and by 1% over the teacher model. These results outperform both baselines and other deep learning models discussed in the related literature.
CI  - (c)2023 Almasri et al.
FAU - Almasri, Miada
AU  - Almasri M
AD  - Information Technology Department/Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia.
FAU - Al-Malki, Norah
AU  - Al-Malki N
AD  - European Languages Department/Faculty of Arts and Humanities, King Abdulaziz University, Jeddah, Saudi Arabia.
FAU - Alotaibi, Reem
AU  - Alotaibi R
AD  - Information Technology Department/Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia.
LA  - eng
PT  - Journal Article
DEP - 20230608
PL  - United States
TA  - PeerJ Comput Sci
JT  - PeerJ. Computer science
JID - 101660598
PMC - PMC10280399
OTO - NOTNLM
OT  - AraBERT
OT  - Aspect Category Detection
OT  - BERT
OT  - Noisy Student model
OT  - Sentiment Analysis
OT  - Teacher model
OT  - Transformer
COIS- The authors declare there are no competing interests.
EDAT- 2023/06/22 13:09
MHDA- 2023/06/22 13:10
PMCR- 2023/06/08
CRDT- 2023/06/22 09:53
PHST- 2022/12/23 00:00 [received]
PHST- 2023/05/12 00:00 [accepted]
PHST- 2023/06/22 13:10 [medline]
PHST- 2023/06/22 13:09 [pubmed]
PHST- 2023/06/22 09:53 [entrez]
PHST- 2023/06/08 00:00 [pmc-release]
AID - cs-1425 [pii]
AID - 10.7717/peerj-cs.1425 [doi]
PST - epublish
SO  - PeerJ Comput Sci. 2023 Jun 8;9:e1425. doi: 10.7717/peerj-cs.1425. eCollection 2023.
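
The four-step framework described in the abstract (teacher, pseudo labels, noisy student, ensemble) can be loosely illustrated with the bookkeeping that sits around the models. The sketch below is not the authors' implementation: it leaves out AraBERT training entirely and stands in made-up toy probabilities for model outputs. It also assumes a multi-label setup with a 0.5 decision threshold, a combined loss formed as the plain sum of binary cross entropy on the labeled and pseudo-labeled data, and an ensemble that averages probabilities. None of these choices is stated in the record, and all function names are hypothetical.

```python
import math

def pseudo_label(probs, threshold=0.5):
    """Turn per-category probabilities into hard 0/1 labels (threshold is an assumption)."""
    return [[1 if p >= threshold else 0 for p in row] for row in probs]

def binary_cross_entropy(targets, probs, eps=1e-12):
    """Mean binary cross entropy over every (sentence, category) cell."""
    total, count = 0.0, 0
    for t_row, p_row in zip(targets, probs):
        for t, p in zip(t_row, p_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            count += 1
    return total / count

def combined_loss(gold, probs_labeled, pseudo, probs_unlabeled):
    """Assumed form: loss on gold-labeled data plus loss on pseudo-labeled data."""
    return (binary_cross_entropy(gold, probs_labeled)
            + binary_cross_entropy(pseudo, probs_unlabeled))

def ensemble(probs_a, probs_b):
    """Assumed ensembling rule: average the two models' probabilities cell by cell."""
    return [[(a + b) / 2 for a, b in zip(ra, rb)]
            for ra, rb in zip(probs_a, probs_b)]

def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all categories, then compute F1."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy values, invented for illustration only: 2 sentences x 3 aspect categories.
gold = [[1, 0, 0], [0, 1, 1]]
teacher_probs = [[0.9, 0.2, 0.1], [0.3, 0.8, 0.4]]
student_probs = [[0.8, 0.1, 0.3], [0.1, 0.7, 0.7]]

teacher_f1 = micro_f1(gold, pseudo_label(teacher_probs))
ensemble_f1 = micro_f1(gold, pseudo_label(ensemble(teacher_probs, student_probs)))
print(teacher_f1, ensemble_f1)  # prints "0.8 1.0" for these toy values
```

In this toy case the teacher misses one category that the averaged probabilities recover, which is the kind of effect an ensemble is meant to capture. The numbers carry no relation to the results reported in the abstract.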