Beyond Dependencies: The Role of Copy-Based Reuse in Open Source Software Development
Mahmoud Jahanshahi, David Reid, Audris Mockus
arXiv - CS - Software Engineering, 2024-09-07
DOI: arxiv-2409.04830 (https://doi.org/arxiv-2409.04830)
Citations: 0
Abstract
In Open Source Software, resources of any project are open for reuse by
introducing dependencies or copying the resource itself. In contrast to
dependency-based reuse, the infrastructure to systematically support copy-based
reuse appears to be entirely missing. Our aim is to enable future research and
tool development to increase efficiency and reduce the risks of copy-based
reuse. We seek a better understanding of such reuse by measuring its prevalence
and identifying factors affecting the propensity to reuse. To identify reused
artifacts and trace their origins, our method exploits World of Code
infrastructure. We begin with a set of theory-derived factors related to the
propensity to reuse, sample instances of different reuse types, and survey
developers to better understand their intentions. Our results indicate that
copy-based reuse is common and that many developers are aware of it when writing
code. The propensity for a file to be reused varies greatly among languages and
between source code and binary files, and it decreases consistently over time. Files
introduced by popular projects are more likely to be reused, but at least half
of reused resources originate from "small" and "medium" projects.
Developers had various reasons for reuse but were generally positive about
using a package manager.
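
As a rough illustration of how reused artifacts might be identified and traced to an origin, the sketch below groups file contents by their Git blob hash and treats the project with the earliest commit time as the originator. This is a minimal sketch under stated assumptions, not the authors' pipeline and not the World of Code API: the input format (tuples of project, commit time, content) and the function names `git_blob_sha1` and `find_copied_blobs` are hypothetical, and "earliest commit wins" is a simplifying heuristic.

```python
import hashlib
from collections import defaultdict


def git_blob_sha1(content: bytes) -> str:
    """Return the SHA-1 that Git assigns to a blob with this content."""
    # Git hashes the header "blob <size>\0" followed by the raw bytes.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def find_copied_blobs(observations):
    """Find blobs present in more than one project.

    observations: iterable of (project, commit_time, content) tuples
    (a hypothetical input format chosen for this sketch).
    Returns {blob_hash: (origin_project, [copying_projects])}.
    """
    # For each blob, record the earliest time it was seen in each project.
    first_seen = defaultdict(dict)
    for project, commit_time, content in observations:
        blob = git_blob_sha1(content)
        previous = first_seen[blob].get(project)
        if previous is None or commit_time < previous:
            first_seen[blob][project] = commit_time

    reused = {}
    for blob, projects in first_seen.items():
        if len(projects) < 2:
            continue  # seen in a single project only, so no copy detected
        # Assumption: the project with the earliest commit is the origin.
        ordered = sorted(projects.items(), key=lambda kv: (kv[1], kv[0]))
        origin = ordered[0][0]
        reused[blob] = (origin, [name for name, _ in ordered[1:]])
    return reused


if __name__ == "__main__":
    sample = [
        ("a/lib", 100, b"x = 1\n"),
        ("b/app", 200, b"x = 1\n"),  # identical content, committed later
        ("b/app", 300, b"y = 2\n"),  # unique to one project
    ]
    for blob, (origin, copies) in find_copied_blobs(sample).items():
        print(blob, origin, copies)
```

Exact-hash matching only detects verbatim copies; files modified after copying would need a similarity measure, which this sketch does not attempt.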