Bias Discovery in News Articles Using Word Vectors

A. Patankar, Joy Bose
Published in: 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 785-788
Publication date: 2017-12-01
DOI: 10.1109/ICMLA.2017.00-62 (https://doi.org/10.1109/ICMLA.2017.00-62)
Citations: 5

Abstract

Given the ongoing controversy over biased news, a system that detects the extent of bias in online news articles and indicates it to the user in real time would be useful. We measure bias in a sentence or article as its word vector similarity to a corpus of biased words. Specifically, we compute the word vector similarity between each sentence and words taken from a Wikipedia Neutral Point of View (NPOV) corpus, using a word2vec model trained on Wikipedia articles. From these similarities we compute a bias score, which indicates how heavily the article uses biased words. The system is implemented as a web browser extension that queries an online server running our bias detection algorithm. Finally, we validate the bias detection by comparing the bias rankings of articles from a variety of sources: Wikipedia articles receive lower bias scores than news articles, which in turn receive lower scores than opinion articles.
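
The scoring idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the toy 3-dimensional vectors stand in for a word2vec model trained on Wikipedia, the two-word `BIAS_LEXICON` stands in for the NPOV-derived word list, and the aggregation (maximum cosine similarity per word, averaged per sentence, then averaged per article) is one plausible choice that the abstract does not specify.

```python
import math
import re

# Toy 3-dimensional "word vectors" (hypothetical values). In a real system
# these would come from a word2vec model trained on Wikipedia articles.
TOY_VECTORS = {
    "outrageous": (0.9, 0.1, 0.0),
    "shocking": (0.8, 0.2, 0.1),
    "disastrous": (0.85, 0.05, 0.1),
    "reported": (0.1, 0.9, 0.1),
    "said": (0.05, 0.95, 0.0),
    "policy": (0.1, 0.2, 0.9),
    "minister": (0.0, 0.3, 0.9),
}

# Hypothetical stand-in for words drawn from an NPOV-style bias corpus.
BIAS_LEXICON = ["outrageous", "shocking"]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sentence_bias(sentence, vectors=TOY_VECTORS, lexicon=BIAS_LEXICON):
    """Mean, over in-vocabulary words, of the best similarity to any lexicon word."""
    words = [w for w in re.findall(r"[a-z']+", sentence.lower()) if w in vectors]
    lexicon_vectors = [vectors[w] for w in lexicon if w in vectors]
    if not words or not lexicon_vectors:
        return 0.0
    per_word = [max(cosine(vectors[w], lv) for lv in lexicon_vectors) for w in words]
    return sum(per_word) / len(per_word)


def article_bias(text, vectors=TOY_VECTORS, lexicon=BIAS_LEXICON):
    """Mean sentence-level bias score over all non-empty sentences in the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(sentence_bias(s, vectors, lexicon) for s in sentences) / len(sentences)


if __name__ == "__main__":
    neutral = "The minister said the policy was reported."
    opinion = "The disastrous policy is shocking."
    print("neutral:", round(article_bias(neutral), 3))
    print("opinion:", round(article_bias(opinion), 3))
```

With these toy vectors the opinion-style sentence scores higher than the neutral one, which mirrors the ordering the abstract reports (Wikipedia, then news, then opinion). Swapping `TOY_VECTORS` for vectors from a trained model and `BIAS_LEXICON` for a real bias word list would be the natural next step.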