Predicting and Correcting Missing Data on Diffusion Processes in Multiplex Networks.

IF 0.6 | Zone 4, Psychology | JCR Q4, PSYCHOLOGY, MATHEMATICAL
Alireza Khosravani, Mostafa Salehi, Vahid Ranjbar, Rajesh Sharma, Shaghayegh Najari
Nonlinear Dynamics Psychology and Life Sciences, vol. 25, no. 2, pp. 127-155. Journal Article. Published 2021-04-01.
Citations: 0

Abstract

Diffusion processes in networks are studied to identify their dynamics and to predict the behavior of network entities. Social media plays an important role in people's lives, and diffusion analysis, one of the most important branches of social media analysis, appears in domains such as information spreading, diffusion of innovation, idea dissemination, and product acceptance, where it is used to identify users' patterns and behavior in social media networks. Users are not limited to one social network; they engage with multiple platforms such as Twitter, Instagram, Telegram, and Facebook. This has given rise to a new area of social network analysis called multiplex network analysis, and the scope of diffusion process analysis has accordingly shifted from single-layer networks to multiplex networks. Diffusion processes can be studied at two levels: at the infrastructure level, structural properties of the network such as clustering coefficient and degree centrality are studied; at the diffusion level, properties of the diffusion network such as diffusion depth and seed nodes are studied. A reliable analysis requires complete information on both the infrastructure and diffusion networks. However, complete data is not always accessible, owing to limitations such as the difficulty of crawling big data, social media policies on data gathering, and user privacy. Incomplete data can lead to poor analysis. In this work we first investigate the impact of missing data in both infrastructure and diffusion networks, specifically the impact of random and non-random missing infrastructure data on nine diffusion network properties, such as the number of infected nodes, number of infected edges, diffusion length, and number of seed nodes. Second, based on the multiplex diffusion tree, we introduce a new model, named MLC-tree, for incomplete diffusion networks. Finally, we evaluate our model on both synthetic and real social networks; the results show that MLC-tree can decrease the relative error by more than 50 percent when 20 to 80 percent of the complete data is missing.
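
To make the evaluation setup more concrete, the following is a minimal sketch of how a few diffusion-level properties and their relative error under randomly missing data could be computed. It is not the MLC-tree model, which the abstract does not specify. The toy edge list, the layer labels, and the function names `diffusion_properties`, `drop_random_edges`, and `relative_error` are all assumptions made for illustration, and the sketch assumes each node is infected by at most one parent, so the diffusion network is a forest.

```python
import random

# Hypothetical toy diffusion network: (source, target, layer) infection edges.
# The layer labels are illustrative only.
EDGES = [
    ("a", "b", "twitter"),
    ("a", "c", "twitter"),
    ("b", "d", "telegram"),
    ("d", "e", "telegram"),
    ("f", "g", "twitter"),
]


def diffusion_properties(edges):
    """Compute a few diffusion-level properties of a forest of infection edges."""
    parents = {dst: src for src, dst, _ in edges}
    nodes = {n for src, dst, _ in edges for n in (src, dst)}
    seeds = nodes - set(parents)  # nodes with no observed infector

    def depth(node):
        # Number of hops from this node back to its seed.
        hops = 0
        while node in parents:
            node = parents[node]
            hops += 1
        return hops

    return {
        "infected_nodes": len(nodes),
        "infected_edges": len(edges),
        "seed_nodes": len(seeds),
        "diffusion_depth": max((depth(n) for n in nodes), default=0),
    }


def drop_random_edges(edges, fraction, rng):
    """Simulate random missing data by discarding a fraction of the edges."""
    n_missing = int(fraction * len(edges))
    return rng.sample(edges, k=len(edges) - n_missing)


def relative_error(true_props, observed_props):
    """Relative error |observed - true| / true for each property."""
    return {
        key: abs(observed_props[key] - true_value) / true_value
        for key, true_value in true_props.items()
        if true_value != 0
    }


if __name__ == "__main__":
    rng = random.Random(0)
    true_props = diffusion_properties(EDGES)
    observed = drop_random_edges(EDGES, 0.4, rng)
    print(relative_error(true_props, diffusion_properties(observed)))
```

In this toy setting, dropping an edge can turn its target into an apparent seed node, which is one way missing data can distort diffusion-level properties. A correction model would aim to reduce the relative errors reported by the last function.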

Source journal: CiteScore 1.40; self-citation rate 11.10%; articles published 26.