Abstract

Social media provides information about patients' health issues, including medication side effects and unsuccessful treatments. Patient reports of adverse drug events (ADEs) on social media have the potential to significantly enhance current pharmacovigilance procedures. However, obtaining these reports remains a difficult problem in health informatics. In this study, we develop a research framework that uses advanced natural language processing techniques for integrated, high-performance ADE extraction. The framework consists of medical entity extraction, ADE detection using shortest dependency path kernel-based statistical learning, semantic filtering using medical knowledge bases, and report source classification to reduce noise. Experiments were conducted using posts from major U.S.-based diabetes and heart disease forums. The results show that each component significantly improves overall effectiveness and that the framework significantly outperforms previous methods.

Keywords

Medical Knowledge Base, Semantic Filtering, Medical Entity Extraction

Introduction

In recent years, more patients have shared their healthcare experiences online, creating a "cloud of patient experience." Social platforms such as blogs and forums allow patients to share diagnoses, treatments, medications, and side effects, especially for chronic conditions like hypertension, diabetes, and heart disease. Patient self-reports frequently highlight medical issues and drug side effects that clinicians overlook. If unrecorded, these issues can lead to non-compliance and preventable adverse events. Mining such data from social media is a novel way to capture evidence about drug effectiveness, compliance, and safety, providing insights that are often missed in clinical settings. An adverse drug reaction (ADR) is defined as “a harmful or unpleasant reaction resulting from the use of a medicinal product,” warranting intervention such as treatment modification or drug withdrawal.

Related Work

  • Pharmacovigilance in Health Social Media

Social media has become a crucial source for pharmacovigilance, where users often report adverse drug reactions that may not be officially documented.

  • Biomedical Relation Extraction

Extracting biomedical relations (e.g., gene-disease or protein interactions) has been extensively studied. Methods include co-occurrence analysis, rule-based systems, statistical learning, and hybrid approaches.

  • Research Gaps and Questions

Key issues identified include:

  • Limited use of advanced statistical learning in social media ADE research.
  • Over-reliance on co-occurrence analysis, which misses syntactic/semantic context.
  • Difficulty in distinguishing true patient experiences from noise or third-party narratives.

Research Questions:

  1. How can we create a scalable framework for mining patient-reported ADEs?
  2. Can semantic filtering and statistical learning improve ADE extraction?
  3. How can we isolate true patient-reported ADEs in noisy data?

Fig. 1

Research Method

  • Data Collection

An automated crawler and extractor were built to collect forum data including post IDs, URLs, authors, dates, and content.
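The collected fields can be pictured as a simple record type. The following is a minimal sketch, not the crawler used in the study: the class name `ForumPost`, the helper `parse_post`, the dictionary keys, and the example URL are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ForumPost:
    """One crawled forum post; the field names are illustrative assumptions."""
    post_id: str
    url: str
    author: str
    posted_on: date
    content: str


def parse_post(raw: dict) -> ForumPost:
    """Build a ForumPost from a raw dict produced by a hypothetical extractor."""
    return ForumPost(
        post_id=str(raw["id"]),
        url=raw["url"],
        author=raw["author"],
        posted_on=date.fromisoformat(raw["date"]),  # expects YYYY-MM-DD
        content=raw["content"].strip(),
    )


# Made-up example record, not real forum data.
example = parse_post({
    "id": 101,
    "url": "https://forum.example.org/thread/7#101",
    "author": "user_a",
    "date": "2012-03-14",
    "content": "  Started metformin last week and I feel nauseous.  ",
})
```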

  • Data Preprocessing

Text cleaning with regular expressions removed URLs, personal data, and excess punctuation. Sentence segmentation was performed using OpenNLP.
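A cleaning step of this kind might look like the sketch below. The patterns are assumptions made for illustration, not the expressions used in the study, and the regex-based sentence splitter is only a stand-in for the OpenNLP sentence detector.

```python
import re

# Illustrative patterns; the actual expressions used in the study are not given.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
REPEAT_PUNCT_RE = re.compile(r"([!?.,])\1+")   # "!!!" -> "!"
SENT_END_RE = re.compile(r"(?<=[.!?])\s+")     # naive sentence boundary


def clean_text(text: str) -> str:
    """Remove URLs and e-mail addresses, collapse repeated punctuation and spaces."""
    text = URL_RE.sub(" ", text)
    text = EMAIL_RE.sub(" ", text)
    text = REPEAT_PUNCT_RE.sub(r"\1", text)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text: str) -> list[str]:
    """Split on whitespace that follows sentence-final punctuation."""
    return [s for s in SENT_END_RE.split(text) if s]


raw = "Lipitor gave me muscle pain!!!   Anyone else??  http://example.com/thread"
cleaned = clean_text(raw)
sentences = split_sentences(cleaned)
```

A rule-based splitter like this will break on abbreviations such as "Dr." or "mg.", which is one reason to prefer a trained sentence detector in practice.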

Fig. 2

  • Adverse Drug Event Extraction

Forum discussions use informal language, requiring a hybrid of machine learning and rule-based methods. We use statistical learning for relation detection and semantic rules to filter drug indications and negated ADEs.
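The abstract names the shortest dependency path between a drug and a medical event as the basis of the kernel-based relation detector. The sketch below shows only how such a path can be found with a breadth-first search over an undirected dependency graph. The dependency edges are written by hand for one made-up sentence (a real pipeline would take them from a parser), and the kernel and classifier themselves are not shown.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the shortest token path between source and target, or None.

    `edges` is a list of (head, dependent) pairs, treated as undirected.
    """
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, set()).add(dependent)
        graph.setdefault(dependent, set()).add(head)
    if source not in graph or target not in graph:
        return None
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbor in sorted(graph[node]):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hand-written parse (an assumption) of: "Metformin gave me severe nausea"
edges = [
    ("gave", "Metformin"),   # subject
    ("gave", "me"),          # indirect object
    ("gave", "nausea"),      # direct object
    ("nausea", "severe"),    # modifier
]
path = shortest_dependency_path(edges, "Metformin", "nausea")
```

Here the path keeps the drug, the governing verb, and the event while dropping "me" and "severe", which is the kind of syntactic context that plain co-occurrence ignores.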

Semantic Filtering

Our algorithm filters out false positives using drug safety databases and negation detection tools, improving precision.
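The filtering idea can be sketched as two checks on each candidate drug-event pair: drop it if the event is a known indication of the drug, and drop it if the event mention is negated. Everything in the sketch is an assumption made for illustration: the tiny `KNOWN_INDICATIONS` set stands in for a drug safety database, and the window-based cue matching stands in for a dedicated negation detection tool.

```python
import re

# Toy stand-in for a drug safety knowledge base (hypothetical entries).
KNOWN_INDICATIONS = {("metformin", "diabetes"), ("lisinopril", "hypertension")}
# Toy negation cue list; real negation detectors use richer rules.
NEGATION_CUES = {"no", "not", "never", "without", "didn't", "don't"}


def is_negated(sentence: str, event: str, window: int = 3) -> bool:
    """True if a negation cue occurs within `window` tokens before the event."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    first = event.lower().split()[0]
    if first not in tokens:
        return False
    idx = tokens.index(first)
    return any(tok in NEGATION_CUES for tok in tokens[max(0, idx - window):idx])


def keep_candidate(drug: str, event: str, sentence: str) -> bool:
    """Keep a candidate ADE only if it is neither an indication nor negated."""
    if (drug.lower(), event.lower()) in KNOWN_INDICATIONS:
        return False
    return not is_negated(sentence, event)


kept = keep_candidate("Metformin", "nausea", "Metformin gave me severe nausea")
indication = keep_candidate("Metformin", "diabetes", "I take metformin for my diabetes")
negated = keep_candidate("Lisinopril", "cough", "I did not get any cough on lisinopril")
```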

Research Hypotheses

  • H1a: Statistical learning outperforms co-occurrence methods.
  • H1b: Semantic filtering enhances extraction accuracy.
  • H2: Report source classification improves identification of true ADE reports.

Experiments and Results

  • Research Test Bed

We gathered data from leading U.S. forums such as:

  • American Diabetes Association
  • Diabetes Forums
  • MedHelp’s Heart Disease boards

These platforms support patients managing chronic conditions through community interaction.

  • Evaluation Metrics

Standard metrics (Precision, Recall, F1-Score) were used to evaluate the system's performance.
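For reference, the three metrics can be computed from a set of predicted relations and a set of gold-standard relations as sketched below. The example pairs are made up and do not come from the study's data.

```python
def precision_recall_f1(predicted: set, gold: set):
    """Compute precision, recall, and F1 for predicted vs. gold relations."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up (drug, event) pairs for illustration only.
predicted = {("metformin", "nausea"), ("lipitor", "muscle pain"),
             ("metformin", "diabetes"), ("lipitor", "headache")}
gold = {("metformin", "nausea"), ("lipitor", "muscle pain"),
        ("lisinopril", "cough")}

precision, recall, f1 = precision_recall_f1(predicted, gold)
```

With two of four predictions correct and two of three gold relations found, precision is 0.5, recall is 2/3, and F1 is 4/7.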

Experiments

Three main tasks:

  1. Medical Entity Extraction
  2. ADE Extraction
  3. Report Source Classification

Five-fold cross-validation was used: in each fold, the model was trained on 80% of the labeled data and tested on the remaining 20%.
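The 80/20 split per fold follows directly from partitioning the labeled items into five equal parts. A minimal sketch, assuming 400 labeled items and a fixed shuffle seed (both chosen here for illustration):

```python
import random


def k_fold_indices(n_items: int, k: int = 5, seed: int = 0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal partitions
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(400, k=5))
```

Each of the five splits holds out a different 80 items for testing and trains on the other 320, so every item is tested exactly once.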

  • ADE Extraction

For each forum, 400 annotated sentences were used to evaluate the extraction of drug-medical event relations within single sentences.

RESULTS AND DISCUSSION

Fig. 3

ADE Extraction

We compared:

  • Co-occurrence method (CO)
  • Statistical learning (SL)
  • SL + Semantic Filtering (SL+SF)

SL+SF performed best across metrics.
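To see why the co-occurrence baseline tends to over-generate, consider a sketch that pairs every drug with every medical event mentioned in the same sentence. The lexicons and the substring matching are simplifying assumptions for illustration; they are not the entity extraction used in the study.

```python
from itertools import product

# Tiny hypothetical lexicons.
DRUGS = {"metformin", "lipitor"}
EVENTS = {"nausea", "muscle pain", "diabetes"}


def cooccurrence_pairs(sentence: str):
    """Return every (drug, event) pair whose terms appear in the sentence."""
    text = sentence.lower()
    drugs = sorted(d for d in DRUGS if d in text)
    events = sorted(e for e in EVENTS if e in text)
    return list(product(drugs, events))


pairs = cooccurrence_pairs("I take metformin for diabetes and never had nausea")
```

Both returned pairs are false positives: "diabetes" is the reason for taking the drug, and "nausea" is negated. These are the error types that semantic filtering is meant to remove.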

Fig. 4

Hypothesis Testing

We conducted one-tailed t-tests on 50 bootstrapped samples (n = 50). The results supported all hypotheses, showing statistically significant improvements from both SL and SL+SF.
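One way such a test could be set up is sketched below. This is an assumption about the procedure, not the authors' exact protocol: per-item correctness scores (0 or 1) stand in for the real metric, 50 bootstrap resamples are drawn, and a paired one-tailed t-statistic is computed on the per-resample differences. The score lists are made up, and the critical value of roughly 1.68 for 49 degrees of freedom at the 0.05 level is quoted approximately.

```python
import math
import random
import statistics


def bootstrap_paired_t(scores_a, scores_b, n_samples=50, seed=0):
    """Paired t-statistic for mean(b) - mean(a) over bootstrap resamples.

    Returns (t_statistic, degrees_of_freedom).
    """
    rng = random.Random(seed)
    n = len(scores_a)
    diffs = []
    for _ in range(n_samples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        mean_a = sum(scores_a[i] for i in idx) / n
        mean_b = sum(scores_b[i] for i in idx) / n
        diffs.append(mean_b - mean_a)
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("no variation across bootstrap samples")
    t_stat = statistics.mean(diffs) / (sd / math.sqrt(n_samples))
    return t_stat, n_samples - 1


# Made-up per-item correctness for a baseline (a) and an improved system (b).
scores_a = [1] * 60 + [0] * 40
scores_b = [1] * 80 + [0] * 20

t_stat, df = bootstrap_paired_t(scores_a, scores_b)
significant = t_stat > 1.68  # approximate one-tailed critical value, df = 49
```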

Fig. 5

CONCLUSIONS AND CONTRIBUTIONS

Social media offers unfiltered, real-time insights into patient healthcare experiences. Our framework:

  • Outperforms traditional pharmacovigilance methods
  • Leverages machine learning and semantic filtering
  • Identifies true patient reports in noisy data

This contributes to safer, patient-informed healthcare by enhancing adverse drug event detection in informal digital environments.

CONFLICT OF INTEREST

The authors declare that there are no conflicts of interest.


Sagar Saini (corresponding author), Hari College Of Pharmacy
Harsh Kumar (co-author), Hari College Of Pharmacy
Mohit Rana (co-author), Hari College Of Pharmacy

Sagar Saini, Harsh Kumar, Mohit Rana, A Review on Identification and Evaluation of Patient Adverse Drug Event Report, Int. J. of Pharm. Sci., 2025, Vol 3, Issue 6, 5432-5438. https://doi.org/10.5281/zenodo.15758051
