Ashwin Ashok, M. Guruprasad, C. Prakash, S. Shylaja
A Machine Learning Approach for Disease Surveillance and Visualization using Twitter Data
2019 International Conference on Computational Intelligence in Data Science (ICCIDS)
DOI: 10.1109/ICCIDS.2019.8862087
Publication date: 2019-02-01
Citations: 2
Abstract
Insights from real-time disease surveillance systems help the public take preventive measures against diseases. They also benefit pharmaceutical manufacturers, both by improving sales of medicines for a particular disease and by ensuring that medicines are adequately available when they are needed. A disease outbreak is an event in which the number of positive cases of a disease rises within a short span of time. An outbreak can be limited to a particular region or time of the year. Diseases can be detected by several approaches, with social media being a preferred method because of the availability of real-time data. Hence, data from social media, especially Twitter, can be used to detect live events and monitor them efficiently. To detect diseases precisely, this paper proposes an approach in which tweets are collected, pre-processed, vectorized, and clustered into the appropriate diseases using the Agglomerative Clustering technique. The tweets can also be visualized using their geo information to generate zones that have a high density of diseases. Such a surveillance system can be of use for early prediction of disease outbreaks, in turn facilitating faster and better handling of the situation.
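The vectorize-then-cluster step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which vectorizer, distance measure, or linkage the authors used, so the TF-IDF weighting, cosine distance, and average linkage below are assumptions, and the sample tweets and all function names are hypothetical. The code uses only the Python standard library and is meant to show the idea, not to reproduce the paper's implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one L2-normalised sparse TF-IDF vector (a dict) per token list."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Smoothed IDF (an assumption): common terms keep a nonzero weight.
        vec = {t: c * (math.log((1 + n) / (1 + doc_freq[t])) + 1.0)
               for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine_distance(a, b):
    """Cosine distance between two unit-length sparse vectors."""
    return 1.0 - sum(w * b.get(t, 0.0) for t, w in a.items())


def agglomerative(vectors, n_clusters):
    """Bottom-up clustering with average linkage; returns one label per vector."""
    clusters = [[i] for i in range(len(vectors))]  # start with singletons

    def linkage(c1, c2):
        total = sum(cosine_distance(vectors[i], vectors[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > max(n_clusters, 1):
        # Find the closest pair of clusters and merge them.
        _, a, b = min((linkage(clusters[a], clusters[b]), a, b)
                      for a in range(len(clusters))
                      for b in range(a + 1, len(clusters)))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical sample tweets, invented for illustration only.
tweets = [
    "Down with flu and a bad cough",
    "Flu again, cough all night",
    "Dengue alert: mosquito breeding near lake",
    "Mosquito bites, then dengue diagnosis",
]
vectors = tfidf_vectors([tokenize(t) for t in tweets])
labels = agglomerative(vectors, n_clusters=2)
for tweet, label in zip(tweets, labels):
    print(label, tweet)
```

In this toy input the two flu tweets share one label and the two dengue tweets share the other. On real data, a library implementation such as scikit-learn's `AgglomerativeClustering` would scale better than this quadratic pairwise search.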