An approach for analysing the impact of data integration on complex network diffusion models
J. Nevin, Paul Groth, M. Lees
DOI: 10.1093/comnet/cnad025 (https://doi.org/10.1093/comnet/cnad025)
Published: 2023-06-23
Complex networks are a powerful way to reason about systems with non-trivial patterns of interaction. Attention to this research area has grown with the increasing availability of complex network data sets, which are often reused as secondary data sources. Typically, multiple data sources are combined to create a larger, fuller picture of these complex networks, and in doing so scientists must make sometimes subjective decisions about how the sources should be integrated. These seemingly trivial decisions can sometimes have a significant impact on both the resulting integrated networks and any downstream network models executed on them. We highlight the importance of this impact in online social networks and dark networks, two use cases where data are regularly combined from multiple sources because of challenges in measurement or overlap of networks. We present a method for systematically testing how different, realistic data integration approaches can alter both the networks themselves and the network models run on them, together with an associated Python package (NIDMod) that implements this method. A number of experiments show the effectiveness of our method in identifying the impact of different data integration setups on network diffusion models.
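To make the idea concrete, the following is a minimal sketch of how an integration decision can change a diffusion outcome. It does not use the NIDMod API. Every function name here (`integrate`, `independent_cascade`, `mean_spread`) is hypothetical. The "keep an edge if at least k sources report it" rule and the independent cascade model are illustrative assumptions, not details taken from the paper.

```python
import random
from collections import defaultdict


def integrate(sources, min_sources=1):
    """Merge undirected edge lists, keeping an edge if it is reported
    by at least `min_sources` of the sources (hypothetical rule)."""
    counts = defaultdict(int)
    for edges in sources:
        # Normalise direction and de-duplicate within a single source.
        for edge in {tuple(sorted(e)) for e in edges}:
            counts[edge] += 1
    return {edge for edge, n in counts.items() if n >= min_sources}


def independent_cascade(edges, seeds, p, rng):
    """Run one independent cascade and return the number of active nodes."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        newly_active = []
        for u in frontier:
            for v in sorted(adj[u]):
                # Each newly active node gets one chance per inactive neighbour.
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return len(active)


def mean_spread(edges, seeds, p=0.5, runs=200, seed=0):
    """Average cascade size over `runs` simulations with a fixed RNG seed."""
    rng = random.Random(seed)
    total = sum(independent_cascade(edges, seeds, p, rng) for _ in range(runs))
    return total / runs


# Two toy sources that only partly agree (made-up data for illustration).
source_a = [("a", "b"), ("b", "c"), ("c", "d")]
source_b = [("b", "a"), ("c", "d"), ("d", "e"), ("e", "f")]

union = integrate([source_a, source_b], min_sources=1)
intersection = integrate([source_a, source_b], min_sources=2)

for name, edges in [("union", union), ("intersection", intersection)]:
    print(name, len(edges), "edges, mean spread:", mean_spread(edges, ["a"]))
```

The same two sources give a five-edge path under the union rule and two disconnected edges under the intersection rule, so a cascade seeded at node "a" can reach at most six nodes in one case and two in the other. Comparing such outcomes across integration setups is the kind of analysis the abstract describes, though the package itself may organise it differently.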