Quantifying Self-Reported Adverse Drug Events on Twitter: Signal and Topic Analysis

Proceedings of the 7th 2016 International Conference on Social Media & Society, Pub Date: 2016-07-11, DOI: 10.1145/2930971.2930977

Vassilis Plachouras, Jochen L. Leidner, Andrew G. Garrow


Citations: 22

Abstract


Quantifying Self-Reported Adverse Drug Events on Twitter: Signal and Topic Analysis

When a drug that is sold exhibits side effects, a well-functioning ecosystem of pharmaceutical drug suppliers includes responsive regulators and pharmaceutical companies. Existing systems for monitoring adverse drug events, such as the FDA Adverse Event Reporting System (FAERS) in the US, have shown limited effectiveness due to the lack of incentives for healthcare professionals and patients. While social media present opportunities to mine information about adverse events in near real-time, there are still important questions to be answered in order to understand their impact on pharmacovigilance. First, it is not known how many relevant social media posts occur per day on platforms like Twitter, i.e., whether there is "enough signal" for a post-market pharmacovigilance program based on Twitter mining. Second, it is not known what other topics are discussed by users in posts mentioning pharmaceutical drugs. In this paper, we outline how social media can be used as a human sensor for drug use monitoring. We introduce a large-scale, near real-time system for computational pharmacovigilance, and use our system to estimate the order of magnitude of the volume of daily self-reported pharmaceutical drug side-effect tweets. The processing pipeline comprises a set of cascaded filters, followed by a supervised machine learning classifier. The cascaded filters quickly reduce the volume to a manageable sub-stream, from which a Support Vector Machine (SVM) based classifier identifies adverse events based on a rich set of features taking into account surface-textual properties, as well as domain knowledge about drugs, side effects and the Twitter medium. Using a dataset of 10,000 manually annotated tweets, an SVM classifier achieves F1 = 60.4% and AUC = 0.894. The yield of the classifier for a drug universe comprising 2,600 keywords is 721 tweets per day. We also investigate what other topics are discussed in the posts mentioning pharmaceutical drugs. We conclude by suggesting an ecosystem where regulators and pharmaceutical companies utilize social media to obtain feedback about consequences of pharmaceutical drug use.
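The two-stage design described in the abstract (cheap cascaded filters first, a classifier on the surviving sub-stream second) might be sketched roughly as follows. This is only an illustration under stated assumptions: the keyword sets, the three filter functions, and the threshold are hypothetical, and the scoring stage is a hand-weighted linear stand-in, not the trained SVM or the feature set used by the authors.

```python
import re
from typing import Callable, Iterable, Iterator, List

# Hypothetical keyword lists for illustration only; the paper's drug
# universe comprises about 2,600 keywords, which are not reproduced here.
DRUG_KEYWORDS = {"ibuprofen", "prozac", "metformin"}
SIDE_EFFECT_TERMS = {"headache", "nausea", "dizzy", "rash"}
FIRST_PERSON = {"i", "me", "my"}

Filter = Callable[[str], bool]


def tokens(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


# Stage 1: cheap filters (assumed examples, not the paper's actual filters).
def mentions_drug(tweet: str) -> bool:
    return any(tok in DRUG_KEYWORDS for tok in tokens(tweet))


def is_not_retweet(tweet: str) -> bool:
    return not tweet.startswith("RT ")


def has_no_link(tweet: str) -> bool:
    return "http://" not in tweet and "https://" not in tweet


CASCADE: List[Filter] = [mentions_drug, is_not_retweet, has_no_link]


def cascade(stream: Iterable[str], filters: List[Filter]) -> Iterator[str]:
    """Yield only tweets that pass every filter; all() stops at the first failure."""
    for tweet in stream:
        if all(f(tweet) for f in filters):
            yield tweet


# Stage 2: stand-in for the supervised classifier. A real system would
# train an SVM on annotated tweets instead of using fixed weights.
def score(tweet: str) -> float:
    toks = set(tokens(tweet))
    value = 0.0
    if toks & SIDE_EFFECT_TERMS:
        value += 1.0  # side-effect vocabulary present
    if toks & FIRST_PERSON:
        value += 0.5  # first-person wording suggests a self-report
    return value


def run_pipeline(stream: Iterable[str], threshold: float = 1.5) -> List[str]:
    """Apply the filter cascade, then keep tweets scoring at or above the threshold."""
    return [t for t in cascade(stream, CASCADE) if score(t) >= threshold]


SAMPLE = [
    "RT @someone: Prozac gave me a headache",
    "Buy ibuprofen cheap https://example.com",
    "Took ibuprofen and now my head hurts, total headache",
    "I love coffee",
    "Metformin prices went up again",
]

if __name__ == "__main__":
    for tweet in run_pipeline(SAMPLE):
        print(tweet)
```

Ordering the filters from cheapest to most expensive keeps per-tweet cost low on the full stream, so the more costly classification step only runs on the reduced sub-stream.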


Source journal

Proceedings of the 7th 2016 International Conference on Social Media & Society

Self-citation rate

0.00%

Number of publications