Artificial Intelligence Model to Predict the Virality of Press Articles

Yesid L. Lopez, D. Grimaldi, Sebastian Garcia, Jonatan Ordoñez, Carlos Carrasco-Farré, Andres A. Aristizabal
Published in: 2022 14th International Conference on Machine Learning and Computing (ICMLC)
Publication date: 2022-02-18
DOI: 10.1145/3529836.3529953 (https://doi.org/10.1145/3529836.3529953)
Citations: 0

Abstract

Many people currently share news, links, or videos without being aware of the impact these can have on other people's decisions or behavior. A clear example, recently experienced in Colombia, is the national strike that took place at the time of this research. Through these unexpected circumstances, Colombians experienced the influence that news has on decision making that can affect the country economically, politically, and socially. The strike showed how news can generate fear or even misinform, as in the case of fake news. For these reasons, it is key to determine the relevance a story can have. Predicting that impact will allow us to pay more attention to the news stories that can affect people most, helping to avoid misinformation and fake news. The problem, however, is that there is no way of predicting the impact that a press article can have. Therefore, the aim of this work is to implement a machine learning model that predicts, with the best possible accuracy, the virality of online press articles (defining virality as the number of clicks that an article receives when it is opened). To achieve this goal, we followed the CRISP-DM methodology, which focuses on machine learning projects. The best result was obtained by a model whose core architecture was based on BERT, a pre-trained model, which, given a pair of press article headlines, predicted whether the first headline would be more viral than the second. The evaluation was carried out by comparing the number of clicks for each pair of articles. From a practitioner's point of view, digital marketers can use our results to select the best words for their online marketing campaigns. From a theoretical point of view, our results present an innovative natural language processing approach based on one of the best-of-breed neural network models (BERT).
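
The pairwise setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`make_pairs`, `pairwise_accuracy`, `longer_headline_wins`), the decision to skip tied pairs, and the toy headlines and click counts are all assumptions made for illustration. The BERT-based classifier is replaced by a trivial placeholder so that the sketch runs on its own. It shows only how headline pairs can be labeled from click counts and how a pairwise predictor can be scored against them.

```python
from itertools import combinations
from typing import Callable, List, Tuple

Article = Tuple[str, int]        # (headline, click count)
Pair = Tuple[str, str, int]      # (headline_a, headline_b, label)


def make_pairs(articles: List[Article]) -> List[Pair]:
    """Label each headline pair 1 if the first got more clicks, else 0."""
    pairs = []
    for (head_a, clicks_a), (head_b, clicks_b) in combinations(articles, 2):
        if clicks_a == clicks_b:
            continue  # assumption: ties carry no ordering signal, so skip them
        pairs.append((head_a, head_b, 1 if clicks_a > clicks_b else 0))
    return pairs


def pairwise_accuracy(pairs: List[Pair],
                      predict_first_wins: Callable[[str, str], bool]) -> float:
    """Fraction of pairs where the predictor agrees with the click-based label."""
    if not pairs:
        return 0.0
    correct = sum(1 for a, b, label in pairs
                  if int(predict_first_wins(a, b)) == label)
    return correct / len(pairs)


def longer_headline_wins(head_a: str, head_b: str) -> bool:
    """Hypothetical placeholder predictor; a fine-tuned BERT model would go here."""
    return len(head_a) > len(head_b)


# Made-up toy data, for illustration only.
articles = [
    ("National strike enters second week", 900),
    ("Local team wins", 150),
    ("Markets react to strike news", 400),
]
pairs = make_pairs(articles)
accuracy = pairwise_accuracy(pairs, longer_headline_wins)
print(f"{len(pairs)} pairs, pairwise accuracy = {accuracy:.2f}")
```

Framing the task as a comparison between two headlines turns click prediction into a binary classification problem, which is why the evaluation reduces to checking each predicted ordering against the ordering of the observed click counts.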