Empirical Software Engineering最新文献_第8页

Mining architectural information: A systematic mapping study 挖掘建筑信息：系统制图研究

IF 4.1 2区计算机科学

Empirical Software Engineering Pub Date : 2024-06-04 DOI: 10.1007/s10664-024-10480-6

Musengamana Jean de Dieu, Peng Liang, Mojtaba Shahin, Chen Yang, Zengyang Li

{"title":"Mining architectural information: A systematic mapping study","authors":"Musengamana Jean de Dieu, Peng Liang, Mojtaba Shahin, Chen Yang, Zengyang Li","doi":"10.1007/s10664-024-10480-6","DOIUrl":"https://doi.org/10.1007/s10664-024-10480-6","url":null,"abstract":"Mining Software Repositories (MSR) has become an essential activity in software development. Mining architectural information (e.g., architectural models) to support architecting activities, such as architecture understanding, has received significant attention in recent years. However, there is a lack of clarity on what literature on mining architectural information is available. Consequently, this may create difficulty for practitioners to understand and adopt the state-of-the-art research results, such as what approaches should be adopted to mine what architectural information in order to support architecting activities. It also hinders researchers from being aware of the challenges and remedies for the identified research gaps. We aim to identify, analyze, and synthesize the literature on mining architectural information in software repositories in terms of architectural information and sources mined, architecting activities supported, approaches and tools used, and challenges faced. A Systematic Mapping Study (SMS) has been conducted on the literature published between January 2006 and December 2022. Of the 104 primary studies finally selected, 7 categories of architectural information have been mined, among which architectural description is the most mined architectural information; 11 categories of sources have been leveraged for mining architectural information, among which version control system (e.g., GitHub) is the most popular source; 11 architecting activities can be supported by the mined architectural information, among which architecture understanding is the most supported activity; 95 approaches and 56 tools were proposed and employed in mining architectural information; and 4 types of challenges in mining architectural information were identified. This SMS provides researchers with promising future directions and help practitioners be aware of what approaches and tools can be used to mine what architectural information from what sources to support various architecting activities.","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"8 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141255987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A large-scale exploratory study on the proxy pattern in Ethereum 关于以太坊代理模式的大规模探索性研究

IF 4.1 2区计算机科学

Empirical Software Engineering Pub Date : 2024-06-04 DOI: 10.1007/s10664-024-10485-1

Amir M. Ebrahimi, Bram Adams, Gustavo A. Oliva, Ahmed E. Hassan

{"title":"A large-scale exploratory study on the proxy pattern in Ethereum","authors":"Amir M. Ebrahimi, Bram Adams, Gustavo A. Oliva, Ahmed E. Hassan","doi":"10.1007/s10664-024-10485-1","DOIUrl":"https://doi.org/10.1007/s10664-024-10485-1","url":null,"abstract":"The proxy pattern is a well-known design pattern with numerous use cases in several sectors of the software industry (e.g., network applications, microservices, and IoT). As such, the use of the proxy pattern is also a common approach in the development of complex decentralized applications (DApps) on the Ethereum blockchain. A contract that implements the proxy pattern (proxy contract) acts as a layer between the clients and the target contract, enabling greater flexibility (e.g., data validation checks) and upgradeability (e.g., online smart contract replacement with zero downtime) in DApp development. Despite the importance of proxy contracts, little is known about (i) how their prevalence changed over time, (ii) the ways in which developers integrate proxies in the design of DApps, and (iii) what proxy types are being most commonly leveraged by developers. In this paper, we present a large-scale exploratory study on the use of the proxy pattern in Ethereum. We analyze a dataset of all Ethereum smart contracts as of Sep. 2022 containing 50M smart contracts and 1.6B transactions, and apply both quantitative and qualitative methods in order to (i) determine the prevalence of proxy contracts, (ii) understand the ways they are deployed and integrated into applications, and (iii) uncover the prevalence of different types of proxy contracts. Our findings reveal that 14.2% of all deployed smart contracts are proxy contracts. We show that proxy contracts are being more actively used than non-proxy contracts. Also, the usage of proxy contracts in various contexts, transactions involving proxy contracts, and adoption of proxy contracts by users have shown an upward trend over time, peaking at the end of our study period. They are either deployed through off-chain scripts or on-chain factory contracts, with the former and latter being employed in 39.1% and 60.9% of identified usage contexts in turn. We found that while the majority (67.8%) of proxies act as an interceptor, 32.2% enables upgradeability. Proxy contracts are typically (79%) implemented based on known reference implementations with 29.4% being of type ERC-1167, a class of proxies that aims to cheaply reuse and clone contracts’ functionality. Our evaluation shows that our proposed behavioral proxy detection method has a precision and recall of 100% in detecting active proxies. Finally, we derive a set of practical recommendations for developers and introduce open research questions to guide future research on the topic.","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"51 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141256002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Cross-project defect prediction via semantic and syntactic encoding 通过语义和句法编码进行跨项目缺陷预测

IF 4.1 2区计算机科学

Empirical Software Engineering Pub Date : 2024-06-04 DOI: 10.1007/s10664-024-10495-z

Siyu Jiang, Yuwen Chen, Zhenhang He, Yunpeng Shang, Le Ma

{"title":"Cross-project defect prediction via semantic and syntactic encoding","authors":"Siyu Jiang, Yuwen Chen, Zhenhang He, Yunpeng Shang, Le Ma","doi":"10.1007/s10664-024-10495-z","DOIUrl":"https://doi.org/10.1007/s10664-024-10495-z","url":null,"abstract":"Cross-Project Defect Prediction (CPDP) is a promising research field that focuses on detecting defects in projects with limited labeled data by utilizing prediction models trained on projects with abundant data. However, previous CPDP approaches based on Abstract Syntax Tree (AST) have often encountered challenges in effectively acquiring semantic and syntactic information, resulting in their limited ability to combine both productively. This issue arises primarily from the practice of flattening the AST into a linear sequence in many AST-based methods, leading to the loss of hierarchical syntactic structure and structural information within the code. Besides, other AST-based methods use a recursive way to traverse the tree-structured AST, which is susceptible to gradient vanishing. To alleviate these concerns, we introduce a novel CPDP method named defect prediction via Semantic and Syntactic Encoding (SSE) that enhances Zhang’s approach by encoding semantic and syntactic information while retaining and considering AST structure. Specifically, we perform pre-training on a large corpus using a language model to learn semantic information. Next, we present a new rule for splitting AST into subtrees to avoid vanishing gradients. Then, the absolute paths originating from the root node and leading to the leaf nodes are encoded as hierarchical syntactic information. Finally, we design an encoder to integrate syntactic information into semantic information and leverage Bi-directional Long-Short Term Memory to learn the entire tree representation for prediction. Experimental results on 12 benchmark projects illustrate that the SSE method we proposed surpasses current state-of-the-art methods.","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"26 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141256003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Demystifying code snippets in code reviews: a study of the OpenStack and Qt communities and a practitioner survey 解密代码审查中的代码片段：OpenStack 和 Qt 社区研究及从业人员调查

IF 4.1 2区计算机科学

Empirical Software Engineering Pub Date : 2024-06-03 DOI: 10.1007/s10664-024-10484-2

Beiqi Zhang, Liming Fu, Peng Liang, Jiaxin Yu, Chong Wang

{"title":"Demystifying code snippets in code reviews: a study of the OpenStack and Qt communities and a practitioner survey","authors":"Beiqi Zhang, Liming Fu, Peng Liang, Jiaxin Yu, Chong Wang","doi":"10.1007/s10664-024-10484-2","DOIUrl":"https://doi.org/10.1007/s10664-024-10484-2","url":null,"abstract":"Code review is widely known as one of the best practices for software quality assurance in software development. In a typical code review process, reviewers check the code committed by developers to ensure the quality of the code, during which reviewers and developers would communicate with each other in review comments to exchange necessary information. As a result, understanding the information in review comments is a prerequisite for reviewers and developers to conduct an effective code review. Code snippet, as a special form of code, can be used to convey necessary information in code reviews. For example, reviewers can use code snippets to make suggestions or elaborate their ideas to meet developers’ information needs in code reviews. However, little research has focused on the practices of providing code snippets in code reviews. To bridge this gap, we conduct a mixed-methods study to mine information and knowledge related to code snippets in code reviews, which can help practitioners and researchers get a better understanding about using code snippets in code review. Specifically, our study includes two phases: mining code review data and conducting practitioners’ survey. In Phase 1, we conducted an exploratory study to mine code review data from two popular developer communities (i.e., OpenStack and Qt). We manually labelled 69,604 review comments and finally identified 3,213 review comments that contain code snippets. Based on the code review data collected, we analyzed the extent of using code snippets, the reviewers’ purposes of providing code snippets, the developers’ acceptance of code snippet suggestions, and the reasons that developers do not accept code snippet suggestions in code reviews. In Phase 2, we used an online questionnaire to survey practitioners from industry. By analyzing the 63 valid responses we received, we explored the scenarios reviewers provide code snippets, the developers’ attitudes towards code snippets, and the characteristics of code snippets developers expect reviewers to provide in code reviews. Our results show that: (1) code snippets are not frequently used in code reviews, and most of the code snippets are provided by reviewers rather than developers; (2) the purposes of reviewers providing code snippets in code reviews are Suggestion and Citation, in which Suggestion is the main purpose; (3) most developers would accept reviewers’ code snippet suggestions; (4) the most common reasons that developers do not accept reviewers’ code snippet suggestions in code reviews are difference in the opinions between developers and reviewers and reviewer’s suggestion is flawed; (5) reviewers often provide code snippets in code reviews when code is more illustrate than words; (6) most developers hold positive attitudes towards code snippet comments; and (7) most developers expect that code snippets in review comments are understandable and fitting into existing co","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"6 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141255962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Transformers and meta-tokenization in sentiment analysis for software engineering 软件工程情感分析中的转换器和元标记化

IF 4.1 2区计算机科学

Empirical Software Engineering Pub Date : 2024-06-03 DOI: 10.1007/s10664-024-10468-2

Nathan Cassee, Andrei Agaronian, Eleni Constantinou, Nicole Novielli, Alexander Serebrenik

{"title":"Transformers and meta-tokenization in sentiment analysis for software engineering","authors":"Nathan Cassee, Andrei Agaronian, Eleni Constantinou, Nicole Novielli, Alexander Serebrenik","doi":"10.1007/s10664-024-10468-2","DOIUrl":"https://doi.org/10.1007/s10664-024-10468-2","url":null,"abstract":"Sentiment analysis has been used to study aspects of software engineering, such as issue resolution, toxicity, and self-admitted technical debt. To address the peculiarities of software engineering texts, sentiment analysis tools often consider the specific technical lingo practitioners use. To further improve the application of sentiment analysis, there have been two recommendations: Using pre-trained transformer models to classify sentiment and replacing non-natural language elements with meta-tokens. In this work, we benchmark five different sentiment analysis tools (two pre-trained transformer models and three machine learning tools) on 2 gold-standard sentiment analysis datasets. We find that pre-trained transformers outperform the best machine learning tool on only one of the two datasets, and that even on that dataset the performance difference is a few percentage points. Therefore, we recommend that software engineering researchers should not just consider predictive performance when selecting a sentiment analysis tool because the best-performing sentiment analysis tools perform very similarly to each other (within 4 percentage points). Meanwhile, we find that meta-tokenization does not improve the predictive performance of sentiment analysis tools. Both of our findings can be used by software engineering researchers who seek to apply sentiment analysis tools to software engineering data.","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"8 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141256010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Towards graph-anonymization of software analytics data: empirical study on JIT defect prediction 实现软件分析数据的图匿名化：JIT 缺陷预测实证研究

IF 4.1 2区计算机科学

Empirical Software Engineering Pub Date : 2024-06-01 DOI: 10.1007/s10664-024-10464-6

Akshat Malik, Bram Adams, Ahmed Hassan

{"title":"Towards graph-anonymization of software analytics data: empirical study on JIT defect prediction","authors":"Akshat Malik, Bram Adams, Ahmed Hassan","doi":"10.1007/s10664-024-10464-6","DOIUrl":"https://doi.org/10.1007/s10664-024-10464-6","url":null,"abstract":"As the usage of software analytics for understanding different organizational practices becomes prevalent, it is important that data for these practices is shared across different organizations to build a common understanding of software systems and processes. Yet, organizations are hesitant to share this data and trained models with one another due to concerns around privacy, e.g., because of the risk of reverse engineering the training data of the models. To facilitate data sharing, tabular anonymization techniques like MORPH, LACE and LACE2 have been proposed to provide privacy to defect prediction data. However, said techniques treat data points as individual elements, and lose the context between different features when performing anonymization. We study the effect of four anonymization techniques, i.e., Random Add/Delete, Random Switch, k-DA and Generalization, on the privacy score and performance in six large, long-lived projects. To measure privacy, we use the IPR metric, which is a measure of the inability of an attacker to extract information about sensitive attributes from the anonymized data. We find that all four graph anonymization techniques are able to provide privacy scores higher than 65% in all the datasets, while Random Add/ Delete and Random Switch are even able to achieve privacy scores of 80% and greater in all datasets. For techniques achieving privacy scores of 65%, the AUC and Recall decreased by a median of 1.45% and 5.35%, respectively. For techniques with privacy scores 80% or greater, the AUC and Recall of privatized models decreased by a median of 6.44% and 20.29%, respectively. The state-of-the-art tabular techniques like MORPH, LACE and LACE2 provide high privacy scores (89%-99%); however, they have a higher impact on performance with a median decrease of 21.15% in AUC and 80.34% in Recall. Furthermore, since privacy scores 65% or greater are adequate for sharing, the graph anonymization techniques are able to provide more configurable results where one can make trade-offs between privacy and performance. When compared to unsupervised techniques like a JIT variant of ManualDown, the GA techniques perform comparable or significantly better for AUC, G-Mean and FPR metrics. Our work shows that graph anonymization can be an effective way of providing privacy while preserving model performance.","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"88 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141190200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Machine learning experiment management tools: a mixed-methods empirical study 机器学习实验管理工具：混合方法实证研究

IF 4.1 2区计算机科学

Empirical Software Engineering Pub Date : 2024-05-29 DOI: 10.1007/s10664-024-10444-w

Samuel Idowu, Osman Osman, Daniel Strüber, Thorsten Berger

{"title":"Machine learning experiment management tools: a mixed-methods empirical study","authors":"Samuel Idowu, Osman Osman, Daniel Strüber, Thorsten Berger","doi":"10.1007/s10664-024-10444-w","DOIUrl":"https://doi.org/10.1007/s10664-024-10444-w","url":null,"abstract":"Machine Learning (ML) experiment management tools support ML practitioners and software engineers when building intelligent software systems. By managing large numbers of ML experiments comprising many different ML assets, they not only facilitate engineering ML models and ML-enabled systems, but also managing their evolution—for instance, tracing system behavior to concrete experiments when the model performance drifts. However, while ML experiment management tools have become increasingly popular, little is known about their effectiveness in practice, as well as their actual benefits and challenges. We present a mixed-methods empirical study of experiment management tools and the support they provide to users. First, our survey of 81 ML practitioners sought to determine the benefits and challenges of ML experiment management and of the existing tool landscape. Second, a controlled experiment with 15 student developers investigated the effectiveness of ML experiment management tools. We learned that 70% of our survey respondents perform ML experiments using specialized tools, while out of those who do not use such tools, 52% are unaware of experiment management tools or of their benefits. The controlled experiment showed that experiment management tools offer valuable support to users to systematically track and retrieve ML assets. Using ML experiment management tools reduced error rates and increased completion rates. By presenting a user’s perspective on experiment management tools, and the first controlled experiment in this area, we hope that our results foster the adoption of these tools in practice, as well as they direct tool builders and researchers to improve the tool landscape overall.","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"56 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141171486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Do Agile scaling approaches make a difference? an empirical comparison of team effectiveness across popular scaling approaches 敏捷扩展方法有区别吗？对各种流行扩展方法的团队效率进行实证比较

IF 4.1 2区计算机科学

Empirical Software Engineering Pub Date : 2024-05-29 DOI: 10.1007/s10664-024-10481-5

Christiaan Verwijs, Daniel Russo

{"title":"Do Agile scaling approaches make a difference? an empirical comparison of team effectiveness across popular scaling approaches","authors":"Christiaan Verwijs, Daniel Russo","doi":"10.1007/s10664-024-10481-5","DOIUrl":"https://doi.org/10.1007/s10664-024-10481-5","url":null,"abstract":"With the prevalent use of Agile methodologies, organizations are grappling with the challenge of scaling development across numerous teams. This has led to the emergence of diverse scaling strategies, from complex ones such as “SAFe\", to more simplified methods e.g., “LeSS\", with some organizations devising their unique approaches. While there have been multiple studies exploring the organizational challenges associated with different scaling approaches, so far, no one has compared these strategies based on empirical data derived from a uniform measure. This makes it hard to draw robust conclusions about how different scaling approaches affect Agile team effectiveness. Thus, the objective of this study is to assess the effectiveness of Agile teams across various scaling approaches, including “SAFe\", “LeSS\", “Scrum of Scrums\", and custom methods, as well as those not using scaling. This study focuses initially on responsiveness, stakeholder concern, continuous improvement, team autonomy, management approach, and overall team effectiveness, followed by an evaluation based on stakeholder satisfaction regarding value, responsiveness, and release frequency. To achieve this, we performed a comprehensive survey involving 15,078 members of 4,013 Agile teams to measure their effectiveness, combined with satisfaction surveys from 1,841 stakeholders of 529 of those teams. We conducted a series of inferential statistical analyses, including Analysis of Variance and multiple linear regression, to identify any significant differences, while controlling for team experience and organizational size. The findings of the study revealed some significant differences, but their magnitude and effect size were considered too negligible to have practical significance. In conclusion, the choice of Agile scaling strategy does not markedly influence team effectiveness, and organizations are advised to choose a method that best aligns with their previous experiences with Agile, organizational culture, and management style.","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"68 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141171483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

The broken windows theory applies to technical debt 破窗理论适用于技术债务

IF 4.1 2区计算机科学

Empirical Software Engineering Pub Date : 2024-05-24 DOI: 10.1007/s10664-024-10456-6

William Levén, Hampus Broman, Terese Besker, Richard Torkar

{"title":"The broken windows theory applies to technical debt","authors":"William Levén, Hampus Broman, Terese Besker, Richard Torkar","doi":"10.1007/s10664-024-10456-6","DOIUrl":"https://doi.org/10.1007/s10664-024-10456-6","url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Context:</h3>The term technical debt (TD) describes the aggregation of sub-optimal solutions that serve to impede the evolution and maintenance of a system. Some claim that the broken windows theory (BWT), a concept borrowed from criminology, also applies to software development projects. The theory states that the presence of indications of previous crime (such as a broken window) will increase the likelihood of further criminal activity; TD could be considered the broken windows of software systems.<h3 data-test=\"abstract-sub-heading\">Objective:</h3>To empirically investigate the causal relationship between the TD density of a system and the propensity of developers to introduce new TD during the extension of that system.<h3 data-test=\"abstract-sub-heading\">Method:</h3>The study used a mixed-methods research strategy consisting of a controlled experiment with an accompanying survey and follow-up interviews. The experiment had a total of 29 developers of varying experience levels completing system extension tasks in already existing systems with high or low TD density.<h3 data-test=\"abstract-sub-heading\">Results:</h3>The analysis revealed significant effects of TD level on the subjects’ tendency to re-implement (rather than reuse) functionality, choose non-descriptive variable names, and introduce other code smells identified by the software tool SonarQube, all with at least (95%) credible intervals.<h3 data-test=\"abstract-sub-heading\">Coclusions:</h3>Three separate significant results along with a validating qualitative result combine to form substantial evidence of the BWT’s existence in software engineering contexts. This study finds that existing TD can have a major impact on developers propensity to introduce new TD of various types during development.","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"282 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141152952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Two is better than one: digital siblings to improve autonomous driving testing 二胜于一：数字兄弟姐妹改善自动驾驶测试

IF 4.1 2区计算机科学

Empirical Software Engineering Pub Date : 2024-05-17 DOI: 10.1007/s10664-024-10458-4

Matteo Biagiola, Andrea Stocco, Vincenzo Riccio, Paolo Tonella

{"title":"Two is better than one: digital siblings to improve autonomous driving testing","authors":"Matteo Biagiola, Andrea Stocco, Vincenzo Riccio, Paolo Tonella","doi":"10.1007/s10664-024-10458-4","DOIUrl":"https://doi.org/10.1007/s10664-024-10458-4","url":null,"abstract":"Simulation-based testing represents an important step to ensure the reliability of autonomous driving software. In practice, when companies rely on third-party general-purpose simulators, either for in-house or outsourced testing, the generalizability of testing results to real autonomous vehicles is at stake. In this paper, we enhance simulation-based testing by introducing the notion of digital siblings—a multi-simulator approach that tests a given autonomous vehicle on multiple general-purpose simulators built with different technologies, that operate collectively as an ensemble in the testing process. We exemplify our approach on a case study focused on testing the lane-keeping component of an autonomous vehicle. We use two open-source simulators as digital siblings, and we empirically compare such a multi-simulator approach against a digital twin of a physical scaled autonomous vehicle on a large set of test cases. Our approach requires generating and running test cases for each individual simulator, in the form of sequences of road points. Then, test cases are migrated between simulators, using feature maps to characterize the exercised driving conditions. Finally, the joint predicted failure probability is computed, and a failure is reported only in cases of agreement among the siblings. Our empirical evaluation shows that the ensemble failure predictor by the digital siblings is superior to each individual simulator at predicting the failures of the digital twin. We discuss the findings of our case study and detail how our approach can help researchers interested in automated testing of autonomous driving software.","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"2015 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141062188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0