Tobin South , Bridget Smart , Matthew Roughan , Lewis Mitchell
{"title":"信息流估计:推特新闻研究","authors":"Tobin South , Bridget Smart , Matthew Roughan , Lewis Mitchell","doi":"10.1016/j.osnem.2022.100231","DOIUrl":null,"url":null,"abstract":"<div><p>News media has long been an ecosystem of creation, reproduction, and critique, where news outlets report on current events and add commentary to ongoing stories. Understanding the dynamics of news information creation and dispersion is important to accurately ascribe credit to influential work and understand how societal narratives develop. These dynamics can be modelled through a combination of information-theoretic natural language processing and networks; and can be parameterised using large quantities of textual data. However, it is challenging to see “the wood for the trees”, <em>i.e.,</em> to detect small but important flows of information in a sea of noise. Here we develop new comparative techniques to estimate temporal information flow between pairs of text producers. Using both simulated and real text data we compare the reliability and sensitivity of methods for estimating textual information flow, showing that a metric that normalises by local neighbourhood structure provides a robust estimate of information flow in large networks. We apply this metric to a large corpus of news organisations on Twitter and demonstrate its usefulness in identifying influence within an information ecosystem, finding that average information contribution to the network is not correlated with the number of followers or the number of tweets. This suggests that small local organisations and right-wing organisations which have lower average follower counts still contribute significant information to the ecosystem. Further, the methods are applied to smaller full-text datasets of specific news events across news sites and Russian troll accounts on Twitter. The information flow estimation reveals and quantifies features of how these events develop and the role of groups of trolls in setting disinformation narratives. In summary, this work provides a new methodology for examining the information transmitted between content producers in any connected system of natural language, a toolkit with applications to the many networked discourses of our online world.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2468696422000337/pdfft?md5=4198fd7ff22efcd01f7d1b4875be626a&pid=1-s2.0-S2468696422000337-main.pdf","citationCount":"4","resultStr":"{\"title\":\"Information flow estimation: A study of news on Twitter\",\"authors\":\"Tobin South , Bridget Smart , Matthew Roughan , Lewis Mitchell\",\"doi\":\"10.1016/j.osnem.2022.100231\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>News media has long been an ecosystem of creation, reproduction, and critique, where news outlets report on current events and add commentary to ongoing stories. Understanding the dynamics of news information creation and dispersion is important to accurately ascribe credit to influential work and understand how societal narratives develop. These dynamics can be modelled through a combination of information-theoretic natural language processing and networks; and can be parameterised using large quantities of textual data. However, it is challenging to see “the wood for the trees”, <em>i.e.,</em> to detect small but important flows of information in a sea of noise. Here we develop new comparative techniques to estimate temporal information flow between pairs of text producers. Using both simulated and real text data we compare the reliability and sensitivity of methods for estimating textual information flow, showing that a metric that normalises by local neighbourhood structure provides a robust estimate of information flow in large networks. We apply this metric to a large corpus of news organisations on Twitter and demonstrate its usefulness in identifying influence within an information ecosystem, finding that average information contribution to the network is not correlated with the number of followers or the number of tweets. This suggests that small local organisations and right-wing organisations which have lower average follower counts still contribute significant information to the ecosystem. Further, the methods are applied to smaller full-text datasets of specific news events across news sites and Russian troll accounts on Twitter. The information flow estimation reveals and quantifies features of how these events develop and the role of groups of trolls in setting disinformation narratives. In summary, this work provides a new methodology for examining the information transmitted between content producers in any connected system of natural language, a toolkit with applications to the many networked discourses of our online world.</p></div>\",\"PeriodicalId\":52228,\"journal\":{\"name\":\"Online Social Networks and Media\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S2468696422000337/pdfft?md5=4198fd7ff22efcd01f7d1b4875be626a&pid=1-s2.0-S2468696422000337-main.pdf\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Online Social Networks and Media\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2468696422000337\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"Social Sciences\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Online Social Networks and Media","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2468696422000337","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"Social Sciences","Score":null,"Total":0}
Information flow estimation: A study of news on Twitter
News media has long been an ecosystem of creation, reproduction, and critique, where news outlets report on current events and add commentary to ongoing stories. Understanding the dynamics of news information creation and dispersion is important to accurately ascribe credit to influential work and understand how societal narratives develop. These dynamics can be modelled through a combination of information-theoretic natural language processing and networks; and can be parameterised using large quantities of textual data. However, it is challenging to see “the wood for the trees”, i.e., to detect small but important flows of information in a sea of noise. Here we develop new comparative techniques to estimate temporal information flow between pairs of text producers. Using both simulated and real text data we compare the reliability and sensitivity of methods for estimating textual information flow, showing that a metric that normalises by local neighbourhood structure provides a robust estimate of information flow in large networks. We apply this metric to a large corpus of news organisations on Twitter and demonstrate its usefulness in identifying influence within an information ecosystem, finding that average information contribution to the network is not correlated with the number of followers or the number of tweets. This suggests that small local organisations and right-wing organisations which have lower average follower counts still contribute significant information to the ecosystem. Further, the methods are applied to smaller full-text datasets of specific news events across news sites and Russian troll accounts on Twitter. The information flow estimation reveals and quantifies features of how these events develop and the role of groups of trolls in setting disinformation narratives. In summary, this work provides a new methodology for examining the information transmitted between content producers in any connected system of natural language, a toolkit with applications to the many networked discourses of our online world.