PMID- 29925405
OWN - NLM
STAT- MEDLINE
DCOM- 20190731
LR - 20190731
IS - 2041-1480 (Electronic)
VI - 9
IP - 1
DP - 2018 Jun 20
TI - Adverse event detection by integrating twitter data and VAERS.
PG - 19
LID - 10.1186/s13326-018-0184-y [doi]
LID - 19
AB - BACKGROUND: Vaccine has been one of the most successful public health interventions to date. However, vaccines are pharmaceutical products that carry risks so that many adverse events (AEs) are reported after receiving vaccines. Traditional adverse event reporting systems suffer from several crucial challenges including poor timeliness. This motivates increasing social media-based detection systems, which demonstrate successful capability to capture timely and prevalent disease information. Despite these advantages, social media-based AE detection suffers from serious challenges such as labor-intensive labeling and class imbalance of the training data. RESULTS: To tackle both challenges from traditional reporting systems and social media, we exploit their complementary strength and develop a combinatorial classification approach by integrating Twitter data and the Vaccine Adverse Event Reporting System (VAERS) information aiming to identify potential AEs after influenza vaccine. Specifically, we combine formal reports which have accurately predefined labels with social media data to reduce the cost of manual labeling; in order to combat the class imbalance problem, a max-rule based multi-instance learning method is proposed to bias positive users. Various experiments were conducted to validate our model compared with other baselines. We observed that (1) multi-instance learning methods outperformed baselines when only Twitter data were used; (2) formal reports helped improve the performance metrics of our multi-instance learning methods consistently while affecting the performance of other baselines negatively; (3) the effect of formal reports was more obvious when the training size was smaller. Case studies show that our model labeled users and tweets accurately. CONCLUSIONS: We have developed a framework to detect vaccine AEs by combining formal reports with social media data. We demonstrate the power of formal reports on the performance improvement of AE detection when the amount of social media data was small. Various experiments and case studies show the effectiveness of our model.
FAU - Wang, Junxiang
AU - Wang J
AD - Department of Information Science and Technology, George Mason University, Fairfax, VA, USA.
FAU - Zhao, Liang
AU - Zhao L
AD - Department of Information Science and Technology, George Mason University, Fairfax, VA, USA.
FAU - Ye, Yanfang
AU - Ye Y
AD - Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV, USA.
AD - Benjamin M. Statler College of Engineering and Mineral Resources, West Virginia University, Morgantown, WV, USA.
FAU - Zhang, Yuji
AU - Zhang Y
AUID- ORCID: 0000-0002-5429-6762
AD - Department of Epidemiology & Public Health, University of Maryland School of Medicine, Baltimore, MD, USA. Yuzhang@som.umaryland.edu.
AD - Division of Biostatistics and Bioinformatics, University of Maryland Marlene and Stewart Greenebaum Comprehensive Cancer Center, Baltimore, MD, USA. Yuzhang@som.umaryland.edu.
LA - eng
GR - P30 CA134274/CA/NCI NIH HHS/United States
PT - Journal Article
PT - Research Support, N.I.H., Extramural
DEP - 20180620
PL - England
TA - J Biomed Semantics
JT - Journal of biomedical semantics
JID - 101531992
RN - 0 (Vaccines)
MH - *Adverse Drug Reaction Reporting Systems
MH - Data Mining
MH - *Social Media
MH - Vaccines/*adverse effects
PMC - PMC6011255
OTO - NOTNLM
OT - Formal reports
OT - Multi-instance learning
OT - Social media
OT - Vaccine adverse event detection
COIS- ETHICS APPROVAL AND CONSENT TO PARTICIPATE: Not applicable. COMPETING INTERESTS: The authors declare that they have no competing interests. PUBLISHER'S NOTE: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
EDAT- 2018/06/22 06:00
MHDA- 2019/08/01 06:00
PMCR- 2018/06/20
CRDT- 2018/06/22 06:00
PHST- 2018/02/02 00:00 [received]
PHST- 2018/05/10 00:00 [accepted]
PHST- 2018/06/22 06:00 [entrez]
PHST- 2018/06/22 06:00 [pubmed]
PHST- 2019/08/01 06:00 [medline]
PHST- 2018/06/20 00:00 [pmc-release]
AID - 10.1186/s13326-018-0184-y [pii]
AID - 184 [pii]
AID - 10.1186/s13326-018-0184-y [doi]
PST - epublish
SO - J Biomed Semantics. 2018 Jun 20;9(1):19. doi: 10.1186/s13326-018-0184-y.