Context Prediction in the Social Web Using Applied Machine Learning: A Study of Canadian Tweeters

Hamman W. Samuel, Benyamin Noori, Sara Farazi, Osmar R Zaiane
DOI: 10.1109/WI.2018.00-85
Published in: 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI)
Publication date: 2018-12-01
URL: https://doi.org/10.1109/WI.2018.00-85
Citations: 5

Abstract

In this ongoing work, we present the Grebe social data aggregation framework for extracting geo-fenced Twitter data for analysis of user engagement in health and wellness topics. Grebe also provides various visualization tools for analyzing temporal and geographical health trends. Grebe currently has over 18 million indexed public tweets, and is the first of its kind for Canadian researchers. The large dataset is used for analyzing three types of contexts: geographical context via prediction of user location using supervised learning, topical context via determining health-related tweets using various learning approaches, and affective context via sentiment analysis of tweets using rule-based methods. For the first, we define user location as the position from which users are posting a tweet and use standard precision metrics for evaluation with promising results for predicting provinces and cities from tweet text. For the second, we use a broader definition of health using the six dimensions of wellness model and evaluate using manually annotated documents with good results using supervised and semi-supervised machine learning. For the third, we use the indexed tweets to show current trends in emotions and opinions and demonstrate trends in polarity and emotions across various Canadian provinces. The combination of these contexts provides useful insights for digital epidemiology. Ultimately, the vision of Grebe is to provide researchers with Canada-specific social web datasets through an open source platform with an accessible RESTful API, and this paper showcases Grebe's potential and presents our progress towards achieving these goals.
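
As a rough illustration of two of the ideas the abstract mentions (geo-fencing tweets and rule-based sentiment analysis), the following is a minimal sketch under stated assumptions, not Grebe's actual implementation. The bounding box, the tiny word lists, the `Tweet` type, and the function names `in_geofence` and `polarity` are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass

# Coarse, illustrative bounding box around Canada (an assumption, not from the paper).
# A box this simple also covers parts of the northern United States; a real
# geo-fence would more likely use province or city polygons.
CANADA_BBOX = (-141.0, 41.7, -52.6, 83.1)  # (min_lon, min_lat, max_lon, max_lat)

# Tiny hypothetical lexicon; a practical rule-based system would use a far larger one.
POSITIVE = {"good", "great", "happy", "healthy"}
NEGATIVE = {"bad", "sick", "sad", "tired"}
NEGATIONS = {"not", "no", "never"}


@dataclass
class Tweet:
    text: str
    lon: float
    lat: float


def in_geofence(tweet, bbox=CANADA_BBOX):
    """Return True if the tweet's coordinates fall inside the bounding box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= tweet.lon <= max_lon and min_lat <= tweet.lat <= max_lat


def polarity(text):
    """Sum word polarities; a negation flips only the immediately following word."""
    score = 0
    negate = False
    for token in text.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATIONS:
            negate = True
            continue
        value = (word in POSITIVE) - (word in NEGATIVE)  # +1, -1, or 0
        if value:
            score += -value if negate else value
        negate = False
    return score


if __name__ == "__main__":
    tweets = [
        Tweet("I feel great today!", -113.49, 53.54),   # roughly Edmonton
        Tweet("Not good, feeling sick", -0.13, 51.50),  # roughly London, UK
    ]
    for t in tweets:
        print(in_geofence(t), polarity(t.text))
```

A positive score suggests positive polarity and a negative score suggests negative polarity; aggregating such scores by province would be one simple way to chart the kind of regional trends the abstract describes.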