{"title":"在软件存储库之间建立合理影响的综合方法","authors":"Md Omar Faruk Rokon, Risul Islam, Md Rayhanul Masud, M. Faloutsos","doi":"10.1109/ASONAM55673.2022.10068629","DOIUrl":null,"url":null,"abstract":"How can we quantify the influence among repos-itories in online archives like GitHub? Determining repository influence is an essential building block for understanding the dynamics of GitHub-like software archives. The key challenge is to define the appropriate representation model of influence that captures the nuances of the concept and considers its diverse manifestations. We propose PIMan, a systematic approach to quantify the influence among the repositories in a software archive by focusing on the social level interactions. As our key novelty, we introduce the concept of Plausible Influence which considers three types of information: (a) repository level interactions, (b) author level interactions, and (c) temporal considerations. We evaluate and apply our method using 2089 malware repositories from GitHub spanning approximately 12 years. First, we show how our approach provides a powerful and flexible way to generate a plausible influence graph whose density is determined by the Plausible Influence Threshold (PIT), which is modifiable to meet the needs of a study. Second, we find that there is a significant collaboration and influence among the repositories in our dataset. We identify 28 connected components in the plausible influence graph (PIT = 0.25) with 7% of the components containing at least 15 repositories. Furthermore, we find 19 repositories that influenced at least 10 other repositories directly and spawned at least two “families” of repositories. In addition, the results show that our influence metrics capture the manifold aspects of the interactions that are not captured by the typical repository popularity metrics (e.g. number of stars). Overall, our work is a fundamental building block for identifying the influence and lineage of the repositories in online software platforms.","PeriodicalId":423113,"journal":{"name":"2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"PIMan: A Comprehensive Approach for Establishing Plausible Influence among Software Repositories\",\"authors\":\"Md Omar Faruk Rokon, Risul Islam, Md Rayhanul Masud, M. Faloutsos\",\"doi\":\"10.1109/ASONAM55673.2022.10068629\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"How can we quantify the influence among repos-itories in online archives like GitHub? Determining repository influence is an essential building block for understanding the dynamics of GitHub-like software archives. The key challenge is to define the appropriate representation model of influence that captures the nuances of the concept and considers its diverse manifestations. We propose PIMan, a systematic approach to quantify the influence among the repositories in a software archive by focusing on the social level interactions. As our key novelty, we introduce the concept of Plausible Influence which considers three types of information: (a) repository level interactions, (b) author level interactions, and (c) temporal considerations. We evaluate and apply our method using 2089 malware repositories from GitHub spanning approximately 12 years. First, we show how our approach provides a powerful and flexible way to generate a plausible influence graph whose density is determined by the Plausible Influence Threshold (PIT), which is modifiable to meet the needs of a study. Second, we find that there is a significant collaboration and influence among the repositories in our dataset. We identify 28 connected components in the plausible influence graph (PIT = 0.25) with 7% of the components containing at least 15 repositories. Furthermore, we find 19 repositories that influenced at least 10 other repositories directly and spawned at least two “families” of repositories. In addition, the results show that our influence metrics capture the manifold aspects of the interactions that are not captured by the typical repository popularity metrics (e.g. number of stars). Overall, our work is a fundamental building block for identifying the influence and lineage of the repositories in online software platforms.\",\"PeriodicalId\":423113,\"journal\":{\"name\":\"2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)\",\"volume\":\"22 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-11-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASONAM55673.2022.10068629\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASONAM55673.2022.10068629","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
PIMan: A Comprehensive Approach for Establishing Plausible Influence among Software Repositories
How can we quantify the influence among repos-itories in online archives like GitHub? Determining repository influence is an essential building block for understanding the dynamics of GitHub-like software archives. The key challenge is to define the appropriate representation model of influence that captures the nuances of the concept and considers its diverse manifestations. We propose PIMan, a systematic approach to quantify the influence among the repositories in a software archive by focusing on the social level interactions. As our key novelty, we introduce the concept of Plausible Influence which considers three types of information: (a) repository level interactions, (b) author level interactions, and (c) temporal considerations. We evaluate and apply our method using 2089 malware repositories from GitHub spanning approximately 12 years. First, we show how our approach provides a powerful and flexible way to generate a plausible influence graph whose density is determined by the Plausible Influence Threshold (PIT), which is modifiable to meet the needs of a study. Second, we find that there is a significant collaboration and influence among the repositories in our dataset. We identify 28 connected components in the plausible influence graph (PIT = 0.25) with 7% of the components containing at least 15 repositories. Furthermore, we find 19 repositories that influenced at least 10 other repositories directly and spawned at least two “families” of repositories. In addition, the results show that our influence metrics capture the manifold aspects of the interactions that are not captured by the typical repository popularity metrics (e.g. number of stars). Overall, our work is a fundamental building block for identifying the influence and lineage of the repositories in online software platforms.