A robust and scalable framework for detecting self-reported illness from Twitter

Muhammad Asif Hossain Khan, M. Iwai, K. Sezaki
Published in: 2012 IEEE 14th International Conference on e-Health Networking, Applications and Services (Healthcom)
Publication date: 2012-12-13
DOI: https://doi.org/10.1109/HealthCom.2012.6379425
Citations: 8

Abstract

Early detection of the onset and outbreak of infectious diseases is of paramount importance for containing them before they turn into epidemics. The rapid growth in the popularity of micro-blogging sites like Twitter, and in the spatial resolution of their coverage, has made them a promising source of information for assessing how the intensity of such diseases evolves within a locality. However, distinguishing tweets that self-report illness from other "disease-related" tweets is important for avoiding false alarms. In this research, our aim is to segregate tweets that all fall under the general category of "disease-related". Using a relatively small training set and a modified version of the conventional n-gram feature selection method, we could isolate tweets reporting an individual's illness with around 88.7% precision.
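For readers unfamiliar with the terms used in the abstract, the following is a minimal sketch of conventional word n-gram extraction and of how precision is computed. It is not the authors' modified feature selection method, which the abstract does not describe. The function names, the whitespace tokenizer, and the example tweets are all assumptions made for illustration.

```python
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return all word n-grams (n = 1..n_max) of a lowercased text.

    Tokenization is plain whitespace splitting, an assumption made
    for this sketch only.
    """
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def precision(predicted, actual):
    """Precision = true positives / all predicted positives."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    predicted_pos = sum(1 for p in predicted if p)
    return true_pos / predicted_pos if predicted_pos else 0.0


# Hypothetical tweets, invented for illustration (not data from the paper).
tweets = ["I have the flu", "flu season is here"]

# Count how often each n-gram occurs across the tweets; a conventional
# pipeline would select features from counts like these.
counts = Counter(g for t in tweets for g in word_ngrams(t))
```

Under this definition, a precision of around 88.7% means that roughly 887 of every 1,000 tweets flagged as self-reported illness actually are self-reports.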