PMID- 33647938
OWN - NLM
STAT- MEDLINE
DCOM- 20211124
LR  - 20220228
IS  - 1527-974X (Electronic)
IS  - 1067-5027 (Print)
IS  - 1067-5027 (Linking)
VI  - 28
IP  - 7
DP  - 2021 Jul 14
TI  - Extracting postmarketing adverse events from safety reports in the vaccine adverse event reporting system (VAERS) using deep learning.
PG  - 1393-1400
LID - 10.1093/jamia/ocab014 [doi]
AB  - OBJECTIVE: Automated analysis of vaccine postmarketing surveillance narrative reports is important to understand the progression of rare but severe vaccine adverse events (AEs). This study implemented and evaluated state-of-the-art deep learning algorithms for named entity recognition to extract nervous system disorder-related events from vaccine safety reports. MATERIALS AND METHODS: We collected Guillain-Barre syndrome (GBS) related influenza vaccine safety reports from the Vaccine Adverse Event Reporting System (VAERS) from 1990 to 2016. VAERS reports were selected and manually annotated with major entities related to nervous system disorders, including, investigation, nervous_AE, other_AE, procedure, social_circumstance, and temporal_expression. A variety of conventional machine learning and deep learning algorithms were then evaluated for the extraction of the above entities. We further pretrained domain-specific BERT (Bidirectional Encoder Representations from Transformers) using VAERS reports (VAERS BERT) and compared its performance with existing models. RESULTS AND CONCLUSIONS: Ninety-one VAERS reports were annotated, resulting in 2512 entities. The corpus was made publicly available to promote community efforts on vaccine AEs identification. Deep learning-based methods (eg, bi-long short-term memory and BERT models) outperformed conventional machine learning-based methods (ie, conditional random fields with extensive features). The BioBERT large model achieved the highest exact match F-1 scores on nervous_AE, procedure, social_circumstance, and temporal_expression; while VAERS BERT large models achieved the highest exact match F-1 scores on investigation and other_AE. An ensemble of these 2 models achieved the highest exact match microaveraged F-1 score at 0.6802 and the second highest lenient match microaveraged F-1 score at 0.8078 among peer models.
CI  - (c) The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com.
FAU - Du, Jingcheng
AU  - Du J
AUID- ORCID: 0000-0002-0322-4566
AD  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA.
FAU - Xiang, Yang
AU  - Xiang Y
AUID- ORCID: 0000-0003-1395-6805
AD  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA.
FAU - Sankaranarayanapillai, Madhuri
AU  - Sankaranarayanapillai M
AD  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA.
FAU - Zhang, Meng
AU  - Zhang M
AD  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA.
FAU - Wang, Jingqi
AU  - Wang J
AD  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA.
FAU - Si, Yuqi
AU  - Si Y
AD  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA.
FAU - Pham, Huy Anh
AU  - Pham HA
AD  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA.
FAU - Xu, Hua
AU  - Xu H
AUID- ORCID: 0000-0002-5274-4672
AD  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA.
FAU - Chen, Yong
AU  - Chen Y
AD  - Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
FAU - Tao, Cui
AU  - Tao C
AD  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA.
LA  - eng
GR  - R01 LM011829/LM/NLM NIH HHS/United States
GR  - R01 AI130460/AI/NIAID NIH HHS/United States
PT  - Journal Article
PT  - Research Support, N.I.H., Extramural
PL  - England
TA  - J Am Med Inform Assoc
JT  - Journal of the American Medical Informatics Association : JAMIA
JID - 9430800
RN  - 0 (Influenza Vaccines)
SB  - IM
MH  - Adverse Drug Reaction Reporting Systems
MH  - Computer Systems
MH  - *Deep Learning
MH  - *Guillain-Barre Syndrome
MH  - Humans
MH  - *Influenza Vaccines/adverse effects
MH  - United States
PMC - PMC8279785
OTO - NOTNLM
OT  - VAERS
OT  - deep learning
OT  - named entity recognition
OT  - vaccine adverse events
EDAT- 2021/03/02 06:00
MHDA- 2021/11/25 06:00
PMCR- 2022/02/27
CRDT- 2021/03/01 20:21
PHST- 2020/08/26 00:00 [received]
PHST- 2021/01/14 00:00 [revised]
PHST- 2021/01/20 00:00 [accepted]
PHST- 2021/03/02 06:00 [pubmed]
PHST- 2021/11/25 06:00 [medline]
PHST- 2021/03/01 20:21 [entrez]
PHST- 2022/02/27 00:00 [pmc-release]
AID - 6153955 [pii]
AID - ocab014 [pii]
AID - 10.1093/jamia/ocab014 [doi]
PST - ppublish
SO  - J Am Med Inform Assoc. 2021 Jul 14;28(7):1393-1400. doi: 10.1093/jamia/ocab014.
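
The abstract reports both exact match and lenient match microaveraged F-1 scores for entity extraction, but does not spell out how either is computed. The sketch below shows one common reading, not necessarily the authors' evaluation script: an exact match requires identical span boundaries and entity type, while a lenient match is assumed here to require only an overlapping span of the same type. The function name `micro_f1`, the `(start, end, type)` span representation, and the example offsets are all hypothetical.

```python
from typing import List, Tuple

# A span is (start_offset, end_offset_exclusive, entity_type).
# This representation is an assumption made for illustration.
Span = Tuple[int, int, str]


def _overlaps(a: Span, b: Span) -> bool:
    """True when two spans share an entity type and at least one character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def micro_f1(gold_docs: List[List[Span]],
             pred_docs: List[List[Span]],
             lenient: bool = False) -> float:
    """Microaveraged F-1: counts are pooled over all documents and types."""
    matched_pred = matched_gold = n_pred = n_gold = 0
    for gold, pred in zip(gold_docs, pred_docs):
        gold_set, pred_set = set(gold), set(pred)
        n_gold += len(gold_set)
        n_pred += len(pred_set)
        if lenient:
            # A predicted span counts toward precision if it overlaps any gold
            # span of the same type; recall is counted symmetrically.
            matched_pred += sum(any(_overlaps(p, g) for g in gold_set) for p in pred_set)
            matched_gold += sum(any(_overlaps(g, p) for p in pred_set) for g in gold_set)
        else:
            # Exact match: boundaries and type must be identical.
            hits = len(gold_set & pred_set)
            matched_pred += hits
            matched_gold += hits
    precision = matched_pred / n_pred if n_pred else 0.0
    recall = matched_gold / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up offsets: the second prediction starts two characters early.
gold = [[(0, 24, "nervous_AE"), (30, 41, "temporal_expression")]]
pred = [[(0, 24, "nervous_AE"), (28, 41, "temporal_expression")]]

print(micro_f1(gold, pred))                # exact match
print(micro_f1(gold, pred, lenient=True))  # lenient match
```

On this toy input the boundary error costs one of the two predictions under exact matching but not under lenient matching, which is the same direction as the gap between the two scores in the abstract (0.6802 exact vs 0.8078 lenient).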