A robust and scalable framework for detecting self-reported illness from Twitter
Authors: Muhammad Asif Hossain Khan, M. Iwai, K. Sezaki
Published in: 2012 IEEE 14th International Conference on e-Health Networking, Applications and Services (Healthcom), 2012-12-13
DOI: 10.1109/HealthCom.2012.6379425 (https://doi.org/10.1109/HealthCom.2012.6379425)
Early detection of the onset and outbreak of infectious diseases is of paramount importance in containing such diseases before they turn into epidemics. Their rapid growth in popularity and the fine spatial resolution of their coverage have made micro-blogging sites like Twitter a promising source of information for assessing how the intensity of such diseases evolves within a locality. However, distinguishing tweets that self-report illness from other 'disease-related' tweets is important for avoiding false alarms. In this research, we aim to separate tweets that all fall under the general category of 'disease-related'. By using a relatively small training set and modifying the conventional n-gram feature selection method, we could isolate tweets reporting an individual's illness with around 88.7% precision.
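The abstract does not describe how the n-gram feature selection was modified, so the following is only a minimal sketch of the conventional starting point: extracting word n-grams, ranking them by how much more often they occur in self-reported tweets than in other tweets, and computing precision. The function names, the document-frequency scoring rule, and the sample tweets are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max (lowercased, whitespace-split)."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def select_features(pos_texts, neg_texts, k=5, n_max=2):
    """Pick the k n-grams whose document frequency is highest in the positive
    class relative to the negative class (an illustrative scoring rule,
    not the paper's method)."""
    pos = Counter(g for t in pos_texts for g in set(word_ngrams(t, n_max)))
    neg = Counter(g for t in neg_texts for g in set(word_ngrams(t, n_max)))
    scored = {
        g: pos[g] / len(pos_texts) - neg[g] / len(neg_texts)
        for g in pos
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scored, key=lambda g: (-scored[g], g))[:k]


def precision(predicted, actual):
    """Precision = true positives / all predicted positives."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    return tp / (tp + fp) if (tp + fp) else 0.0


if __name__ == "__main__":
    # Hypothetical sample tweets, invented for this sketch.
    self_reported = ["i have the flu", "i have a fever"]
    other_disease_related = ["flu season is here"]
    print(select_features(self_reported, other_disease_related, k=3))
    print(precision([True, True, False, True], [True, False, False, True]))
```

Precision is the relevant metric here because a false positive (a disease-related tweet wrongly counted as a self-report) is exactly the kind of false alarm the abstract is concerned with.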