LZ Albances, Beatrice Anne Bungar, Jannah Patrize Patio, Rio Jan Marty Sevilla, Donata D. Acula
Application of C5.0 Algorithm to Flu Prediction Using Twitter Data
2018 International Conference on Platform Technology and Service (PlatCon)
DOI: 10.1109/PLATCON.2018.8472737
Citations: 3
Abstract
Because people's health is reflected in what they post, data from Twitter, one of the most popular social media platforms with millions of users, can be useful for predicting certain diseases. The researchers built a system intended to improve on the precision of the existing system by Santos and Matos by using the C5.0 algorithm instead of the Naive Bayes algorithm to classify tweets as indicating flu or not. For testing, a total of 1000 tweets, limited to tweets from the Philippines, were gathered to evaluate the system. The dataset includes both English and Tagalog tweets. On evaluation, the proposed system achieved a precision of 62.40% and an accuracy of 66%. It was concluded that the C5.0 algorithm is less precise but more accurate than the Naive Bayes algorithm.
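The abstract reports precision and accuracy separately, and the conclusion depends on the two moving in opposite directions. The following is a minimal sketch of how the two metrics differ for a binary flu / no-flu classifier. The function names, the label strings "flu" and "no_flu", and the ten toy labels are assumptions made for illustration; they are not the paper's data or code.

```python
# Illustrative only: toy labels, not data from the paper.

def confusion_counts(y_true, y_pred, positive="flu"):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def precision(tp, fp):
    """Share of tweets predicted as flu that really are flu."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def accuracy(tp, fp, tn, fn):
    """Share of all tweets that were classified correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


# Hypothetical ground truth and classifier output for ten tweets.
y_true = ["flu", "flu", "flu", "no_flu", "no_flu",
          "no_flu", "no_flu", "no_flu", "flu", "no_flu"]
y_pred = ["flu", "flu", "no_flu", "flu", "no_flu",
          "no_flu", "no_flu", "no_flu", "flu", "flu"]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(f"precision = {precision(tp, fp):.2f}")
print(f"accuracy  = {accuracy(tp, fp, tn, fn):.2f}")
```

Precision only looks at the tweets flagged as flu, while accuracy also credits correctly rejected tweets, so a classifier can gain on one metric while losing on the other.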