Evaluating disaster-related tweet credibility using content-based and user-based features

Impact factor 2.6 · JCR Q2 · Information Science & Library Science

Information Discovery and Delivery · Pub date: 2021-04-05 · DOI: 10.1108/IDD-04-2020-0044

Nasser Assery, Y. Xiaohong, Qu Xiuli, Roy Kaushik, S. Almalki


Citations: 0

Abstract



Purpose: This study proposes an unsupervised learning model for evaluating the credibility of disaster-related Twitter data and presents a performance comparison with commonly used supervised machine learning models.

Design/methodology/approach: First, historical tweets about two recent hurricane events are collected via the Twitter API. A credibility scoring system is then implemented that analyzes each tweet's features and assigns the tweet a credibility score and a credibility label. Finally, supervised machine learning classification is carried out with several classification algorithms, and their performance is compared.

Findings: The proposed unsupervised learning model could enhance emergency response by providing a fast way to determine the credibility of disaster-related tweets. In addition, the comparison of the supervised classification models shows that the Random Forest classifier performs significantly better than the SVM and Logistic Regression classifiers at classifying the credibility of disaster-related tweets.

Originality/value: This paper proposes an unsupervised 10-point scoring model that evaluates tweet credibility from user-based and content-based features. The technique could be used to evaluate the credibility of disaster-related tweets during future hurricanes and has the potential to enhance emergency response during critical events. The comparison of supervised learning methods identifies which of them are effective for evaluating the credibility of Twitter data.
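
The abstract does not list the features or weights behind the 10-point scoring model, so the following is only a minimal sketch of how such a rule-based scorer might look. Every field name, rule, point value and the label threshold of 6 is a hypothetical placeholder chosen for illustration, not the authors' method. The sketch shows the general idea of summing points over user-based and content-based checks and mapping the total to a credibility label.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    # All fields are hypothetical stand-ins for user-based and
    # content-based features; the paper's real feature set is not
    # given in the abstract.
    text: str
    user_verified: bool
    followers: int
    account_age_days: int
    has_url: bool
    has_media: bool


# Hypothetical rules as (description, points, predicate).
# The points are chosen so that the maximum total is 10.
RULES = [
    ("verified account", 2, lambda t: t.user_verified),
    ("at least 1,000 followers", 2, lambda t: t.followers >= 1000),
    ("account older than one year", 2, lambda t: t.account_age_days > 365),
    ("links to a source", 2, lambda t: t.has_url),
    ("attaches media", 1, lambda t: t.has_media),
    ("not written in all caps", 1, lambda t: not t.text.isupper()),
]


def credibility_score(tweet: Tweet) -> int:
    """Sum the points of every rule the tweet satisfies (0 to 10)."""
    return sum(points for _, points, rule in RULES if rule(tweet))


def credibility_label(score: int, threshold: int = 6) -> str:
    """Map a score to a label; the threshold of 6 is an assumption."""
    return "credible" if score >= threshold else "not credible"


# Two made-up tweets for illustration only.
informative = Tweet(
    text="Shelter open at the Main St school, details: https://example.org",
    user_verified=True,
    followers=5400,
    account_age_days=900,
    has_url=True,
    has_media=False,
)
shouting = Tweet(
    text="HELP",
    user_verified=False,
    followers=10,
    account_age_days=5,
    has_url=False,
    has_media=False,
)

for tweet in (informative, shouting):
    score = credibility_score(tweet)
    print(score, credibility_label(score))
```

Labels produced this way need no manual annotation, which is presumably why the abstract calls the scoring model unsupervised. In the study, the resulting labels then serve as targets for the supervised classifiers (Random Forest, SVM, Logistic Regression) whose performance is compared.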


Source journal

Information Discovery and Delivery (Information Science & Library Science)

CiteScore

5.40

Self-citation rate

4.80%

Articles published

About the journal: Information Discovery and Delivery covers information discovery and access for digital information researchers. This includes educators, knowledge professionals in education and cultural organisations, knowledge managers in media, health care and government, as well as librarians. The journal publishes research and practice that explores the digital information supply chain, i.e. transport, flows, tracking, exchange and sharing, including within and between libraries. It is also interested in digital information capture, packaging and storage by "collectors" of all kinds. Information is widely defined, including but not limited to: records, documents, learning objects, visual and sound files, data and metadata, and user-generated content.