Empirical Software Engineering最新文献

筛选
英文 中文
A literature review and existing challenges on software logging practices 关于软件日志记录实践的文献综述和现有挑战
IF 4.1 2区 计算机科学
Empirical Software Engineering Pub Date : 2024-06-18 DOI: 10.1007/s10664-024-10452-w
Mohamed Amine Batoun, Mohammed Sayagh, Roozbeh Aghili, Ali Ouni, Heng Li
{"title":"A literature review and existing challenges on software logging practices","authors":"Mohamed Amine Batoun, Mohammed Sayagh, Roozbeh Aghili, Ali Ouni, Heng Li","doi":"10.1007/s10664-024-10452-w","DOIUrl":"https://doi.org/10.1007/s10664-024-10452-w","url":null,"abstract":"<p>Software logging is the practice of recording different events and activities that occur within a software system, which are useful for different activities such as failure prediction and anomaly detection. While previous research focused on improving different aspects of logging practices, the goal of this paper is to conduct a systematic literature review and the existing challenges of practitioners in software logging practices. In this paper, we focus on the logging practices that cover the steps from the instrumentation of a software system, the storage of logs, up to the preprocessing steps that prepare log data for further follow-up analysis. Our systematic literature review (SLR) covers 204 research papers and a quantitative and qualitative analysis of 20,766 and 149 questions on StackOverflow (SO). We observe that 53% of the studies focus on improving the techniques that preprocess logs for analysis (e.g., the practices of log parsing, log clustering and log mining), 37% focus on how to create new log statements, and 10% focus on how to improve log file storage. Our analysis of SO topics reveals that five out of seven identified high-level topics are not covered by the literature and are related to dependency configuration of logging libraries, infrastructure related configuration, scattered logging, context-dependant usage of logs and handling log files.</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"158 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Engineering recommender systems for modelling languages: concept, tool and evaluation 建模语言工程推荐系统:概念、工具和评估
IF 4.1 2区 计算机科学
Empirical Software Engineering Pub Date : 2024-06-18 DOI: 10.1007/s10664-024-10483-3
Lissette Almonte, Esther Guerra, Iván Cantador, Juan de Lara
{"title":"Engineering recommender systems for modelling languages: concept, tool and evaluation","authors":"Lissette Almonte, Esther Guerra, Iván Cantador, Juan de Lara","doi":"10.1007/s10664-024-10483-3","DOIUrl":"https://doi.org/10.1007/s10664-024-10483-3","url":null,"abstract":"<p>Recommender systems (RSs) are ubiquitous in all sorts of online applications, in areas like shopping, media broadcasting, travel and tourism, among many others. They are also common to help in software engineering tasks, including software modelling, where we are recently witnessing proposals to enrich modelling languages and environments with RSs. Modelling recommenders assist users in building models by suggesting items based on previous solutions to similar problems in the same domain. However, building a RS for a modelling language requires considerable effort and specialised knowledge. To alleviate this problem, we propose an automated, model-driven approach to create RSs for modelling languages. The approach provides a domain-specific language called <span>Droid</span> to configure every aspect of the RS: the type of the recommended modelling elements, the gathering and preprocessing of training data, the recommendation method, and the metrics used to evaluate the created RS. The RS so configured can be deployed as a service, and we offer out-of-the-box integration with Eclipse modelling editors. Moreover, the language is extensible with new data sources and recommendation methods. To assess the usefulness of our proposal, we report on two evaluations. The first one is an offline experiment measuring the precision, completeness and diversity of recommendations generated by several methods. The second is a user study – with 40 participants – to assess the perceived quality of the recommendations. The study also contributes with a novel evaluation methodology and metrics for RSs in model-driven engineering.</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"345 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
OpenSCV: an open hierarchical taxonomy for smart contract vulnerabilities OpenSCV:针对智能合约漏洞的开放式分层分类法
IF 4.1 2区 计算机科学
Empirical Software Engineering Pub Date : 2024-06-18 DOI: 10.1007/s10664-024-10446-8
Fernando Richter Vidal, Naghmeh Ivaki, Nuno Laranjeiro
{"title":"OpenSCV: an open hierarchical taxonomy for smart contract vulnerabilities","authors":"Fernando Richter Vidal, Naghmeh Ivaki, Nuno Laranjeiro","doi":"10.1007/s10664-024-10446-8","DOIUrl":"https://doi.org/10.1007/s10664-024-10446-8","url":null,"abstract":"<p>Smart contracts are nowadays at the core of most blockchain systems. Like all computer programs, smart contracts are subject to the presence of residual faults, including severe security vulnerabilities. However, the key distinction lies in how these vulnerabilities are addressed. In smart contracts, when a vulnerability is identified, the affected contract must be terminated within the blockchain, as due to the immutable nature of blockchains, it is impossible to patch a contract once deployed. In this context, research efforts have been focused on proactively preventing the deployment of smart contracts containing vulnerabilities, mainly through the development of vulnerability detection tools. Along with these efforts, several heterogeneous vulnerability classification schemes appeared (e.g., most notably DASP and SWC). At the time of writing, these are mostly outdated initiatives, even though new smart contract vulnerabilities are consistently uncovered. In this paper, we propose OpenSCV, a new and Open hierarchical taxonomy for Smart Contract vulnerabilities, which is open to community contributions and matches the current state of the practice while being prepared to handle future modifications and evolution. The taxonomy was built based on the analysis of the existing research on vulnerability classification, community-maintained classification schemes, and research on smart contract vulnerability detection. We show how OpenSCV covers the announced detection ability of the current vulnerability detection tools and highlight its usefulness in smart contract vulnerability research. To validate OpenSCV, we performed an expert-based analysis wherein we invited multiple experts engaged in smart contract security research to participate in a questionnaire. The feedback from these experts indicated that the categories in OpenSCV are representative, clear, easily understandable, comprehensive, and highly useful. Regarding the vulnerabilities, the experts confirmed that they are easily understandable.</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"174 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
An empirical study of challenges in machine learning asset management 机器学习资产管理挑战的实证研究
IF 4.1 2区 计算机科学
Empirical Software Engineering Pub Date : 2024-06-15 DOI: 10.1007/s10664-024-10474-4
Zhimin Zhao, Yihao Chen, Abdul Ali Bangash, Bram Adams, Ahmed E. Hassan
{"title":"An empirical study of challenges in machine learning asset management","authors":"Zhimin Zhao, Yihao Chen, Abdul Ali Bangash, Bram Adams, Ahmed E. Hassan","doi":"10.1007/s10664-024-10474-4","DOIUrl":"https://doi.org/10.1007/s10664-024-10474-4","url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">\u0000<i>Context:</i>\u0000</h3><p>In machine learning (ML) applications, assets include not only the ML models themselves, but also the datasets, algorithms, and deployment tools that are essential in the development, training, and implementation of these models. Efficient management of ML assets is critical to ensure optimal resource utilization, consistent model performance, and a streamlined ML development lifecycle. This practice contributes to faster iterations, adaptability, reduced time from model development to deployment, and the delivery of reliable and timely outputs.</p><h3 data-test=\"abstract-sub-heading\">\u0000<i>Objective:</i>\u0000</h3><p>Despite research on ML asset management, there is still a significant knowledge gap on operational challenges, such as model versioning, data traceability, and collaboration issues, faced by asset management tool users. These challenges are crucial because they could directly impact the efficiency, reproducibility, and overall success of machine learning projects. Our study aims to bridge this empirical gap by analyzing user experience, feedback, and needs from Q &amp;A posts, shedding light on the real-world challenges they face and the solutions they have found.</p><h3 data-test=\"abstract-sub-heading\">\u0000<i>Method:</i>\u0000</h3><p>We examine 15, 065 Q &amp;A posts from multiple developer discussion platforms, including Stack Overflow, tool-specific forums, and GitHub/GitLab. Using a mixed-method approach, we classify the posts into knowledge inquiries and problem inquiries. We then apply BERTopic to extract challenge topics and compare their prevalence. Finally, we use the open card sorting approach to summarize solutions from solved inquiries, then cluster them with BERTopic, and analyze the relationship between challenges and solutions.</p><h3 data-test=\"abstract-sub-heading\">\u0000<i>Results:</i>\u0000</h3><p>We identify 133 distinct topics in ML asset management-related inquiries, grouped into 16 macro-topics, with software environment and dependency, model deployment and service, and model creation and training emerging as the most discussed. Additionally, we identify 79 distinct solution topics, classified under 18 macro-topics, with software environment and dependency, feature and component development, and file and directory management as the most proposed.</p><h3 data-test=\"abstract-sub-heading\">\u0000<i>Conclusions:</i>\u0000</h3><p>This study highlights critical areas within ML asset management that need further exploration, particularly around prevalent macro-topics identified as pain points for ML practitioners, emphasizing the need for collaborative efforts between academia, industry, and the broader research community.</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"13 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Can instability variations warn developers when open-source projects boost? 不稳定性变化能否在开源项目启动时向开发人员发出警告?
IF 4.1 2区 计算机科学
Empirical Software Engineering Pub Date : 2024-06-14 DOI: 10.1007/s10664-024-10482-4
Rafael Capilla, Victor Salamanca, Alejandro Valdezate, Gregorio Robles
{"title":"Can instability variations warn developers when open-source projects boost?","authors":"Rafael Capilla, Victor Salamanca, Alejandro Valdezate, Gregorio Robles","doi":"10.1007/s10664-024-10482-4","DOIUrl":"https://doi.org/10.1007/s10664-024-10482-4","url":null,"abstract":"<p>Although architecture instability has been studied and measured using a variety of metrics, a deeper analysis of which project parts are less stable and how such instability varies over time is still needed. While having more information on architecture instability is, in general, useful for any software development project, it is especially important in Open Source Software (OSS) projects where the supervision of the development process is more difficult to achieve. In particular, we are interested when OSS projects grow from a small controlled environment (i.e., the <i>cathedral</i> phase) to a community-driven project (i.e., the <i>bazaar</i> phase). In such a transition, the project often explodes in terms of software size and number of contributing developers. Hence, the complexity of the newly added features, and the frequency of the commits and files modified may cause significant variations of the instability of the structure of the classes and packages. Consequently, in this article we analyze the instability in OSS projects, especially during that sensitive phase where they become community-driven. Our results show that instability metrics can be easily obtained in such type of transitions. We also observed from our case studies that instability metrics can help finding out the balance between adding new functionality and performing refactoring. As a conclusions we state that instability metrics offer relevant information in the transition phase from the cathedral to the bazaar.</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"31 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141521545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Multi-solution Study on GDPR AI-enabled Completeness Checking of DPAs 关于 GDPR 人工智能支持的 DPA 完整性检查的多方案研究
IF 4.1 2区 计算机科学
Empirical Software Engineering Pub Date : 2024-06-14 DOI: 10.1007/s10664-024-10491-3
Muhammad Ilyas Azeem, Sallam Abualhaija
{"title":"A Multi-solution Study on GDPR AI-enabled Completeness Checking of DPAs","authors":"Muhammad Ilyas Azeem, Sallam Abualhaija","doi":"10.1007/s10664-024-10491-3","DOIUrl":"https://doi.org/10.1007/s10664-024-10491-3","url":null,"abstract":"<p>Specifying legal requirements for software systems to ensure their compliance with the applicable regulations is a major concern of requirements engineering. Personal data which is collected by an organization is often shared with other organizations to perform certain processing activities. In such cases, the General Data Protection Regulation (GDPR) requires issuing a data processing agreement (DPA) which regulates the processing and further ensures that personal data remains protected. Violating GDPR can lead to huge fines reaching to billions of Euros. Software systems involving personal data processing must adhere to the legal obligations stipulated both at a general level in GDPR as well as the obligations outlined in DPAs highlighting specific business. In other words, a DPA is yet another source from which requirements engineers can elicit legal requirements. However, the DPA must be complete according to GDPR to ensure that the elicited requirements cover the complete set of obligations. Therefore, checking the completeness of DPAs is a prerequisite step towards developing a compliant system. Analyzing DPAs with respect to GDPR entirely manually is time consuming and requires adequate legal expertise. In this paper, we propose an automation strategy that addresses the completeness checking of DPAs against GDPR provisions as a text classification problem. Specifically, we pursue ten alternative solutions which are enabled by different technologies, namely traditional machine learning, deep learning, language modeling, and few-shot learning. The goal of our work is to empirically examine how these different technologies fare in the legal domain. We computed F<span>(_2)</span> score on a set of 30 real DPAs. Our evaluation shows that best-performing solutions yield F<span>(_2)</span> score of 86.7% and 89.7% are based on pre-trained BERT and RoBERTa language models. Our analysis further shows that other alternative solutions based on deep learning (e.g., BiLSTM) and few-shot learning (e.g., SetFit) can achieve comparable accuracy, yet are more efficient to develop.</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"1 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141521636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Common challenges of deep reinforcement learning applications development: an empirical study 深度强化学习应用开发的常见挑战:实证研究
IF 4.1 2区 计算机科学
Empirical Software Engineering Pub Date : 2024-06-14 DOI: 10.1007/s10664-024-10500-5
Mohammad Mehdi Morovati, Florian Tambon, Mina Taraghi, Amin Nikanjam, Foutse Khomh
{"title":"Common challenges of deep reinforcement learning applications development: an empirical study","authors":"Mohammad Mehdi Morovati, Florian Tambon, Mina Taraghi, Amin Nikanjam, Foutse Khomh","doi":"10.1007/s10664-024-10500-5","DOIUrl":"https://doi.org/10.1007/s10664-024-10500-5","url":null,"abstract":"<p>Machine Learning (ML) is increasingly being adopted in different industries. Deep Reinforcement Learning (DRL) is a subdomain of ML used to produce intelligent agents. Despite recent developments in DRL technology, the main challenges that developers face in the development of DRL applications are still unknown. To fill this gap, in this paper, we conduct a large-scale empirical study of <b>927</b> DRL-related posts extracted from Stack Overflow, the most popular Q &amp;A platform in the software community. Through the process of labeling and categorizing extracted posts, we created a taxonomy of common challenges encountered in the development of DRL applications, along with their corresponding popularity levels. This taxonomy has been validated through a survey involving 65 DRL developers. Results show that at least <span>(45%)</span> of developers experienced 18 of the 21 challenges identified in the taxonomy. The most frequent source of difficulty during the development of DRL applications are <i>Comprehension</i>, <i>API usage</i>, and <i>Design problems</i>, while <i>Parallel processing</i>, and <i>DRL libraries/frameworks</i> are classified as the most difficult challenges to address, with respect to the time required to receive an accepted answer. We hope that the research community will leverage this taxonomy to develop efficient strategies to address the identified challenges and improve the quality of DRL applications</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"97 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141521544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Studying the explanations for the automated prediction of bug and non-bug issues using LIME and SHAP 研究使用 LIME 和 SHAP 自动预测错误和非错误问题的原因
IF 4.1 2区 计算机科学
Empirical Software Engineering Pub Date : 2024-06-13 DOI: 10.1007/s10664-024-10469-1
Lukas Schulte, Benjamin Ledel, Steffen Herbold
{"title":"Studying the explanations for the automated prediction of bug and non-bug issues using LIME and SHAP","authors":"Lukas Schulte, Benjamin Ledel, Steffen Herbold","doi":"10.1007/s10664-024-10469-1","DOIUrl":"https://doi.org/10.1007/s10664-024-10469-1","url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Context</h3><p>The identification of bugs within issues reported to an issue tracking system is crucial for triage. Machine learning models have shown promising results for this task. However, we have only limited knowledge of how such models identify bugs. Explainable AI methods like LIME and SHAP can be used to increase this knowledge.</p><h3 data-test=\"abstract-sub-heading\">Objective</h3><p>We want to understand if explainable AI provides explanations that are reasonable to us as humans and align with our assumptions about the model’s decision-making. We also want to know if the quality of predictions is correlated with the quality of explanations.</p><h3 data-test=\"abstract-sub-heading\">Methods</h3><p>We conduct a study where we rate LIME and SHAP explanations based on their quality of explaining the outcome of an issue type prediction model. For this, we rate the quality of the explanations, i.e., if they align with our expectations and help us understand the underlying machine learning model.</p><h3 data-test=\"abstract-sub-heading\">Results</h3><p>We found that both LIME and SHAP give reasonable explanations and that correct predictions are well explained. Further, we found that SHAP outperforms LIME due to a lower ambiguity and a higher contextuality that can be attributed to the ability of the deep SHAP variant to capture sentence fragments.</p><h3 data-test=\"abstract-sub-heading\">Conclusion</h3><p>We conclude that the model finds explainable signals for both bugs and non-bugs. Also, we recommend that research dealing with the quality of explanations for classification tasks reports and investigates rater agreement, since the rating of explanations is highly subjective.</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"52 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141521631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
How far are we with automated machine learning? characterization and challenges of AutoML toolkits 自动机器学习进展如何? AutoML 工具包的特点和挑战
IF 4.1 2区 计算机科学
Empirical Software Engineering Pub Date : 2024-06-13 DOI: 10.1007/s10664-024-10450-y
Md Abdullah Al Alamin, Gias Uddin
{"title":"How far are we with automated machine learning? characterization and challenges of AutoML toolkits","authors":"Md Abdullah Al Alamin, Gias Uddin","doi":"10.1007/s10664-024-10450-y","DOIUrl":"https://doi.org/10.1007/s10664-024-10450-y","url":null,"abstract":"<p>Automated Machine Learning aka AutoML toolkits are low/no-code software that aim to democratize ML system application development by ensuring rapid prototyping of ML models and by enabling collaboration across different stakeholders in ML system design (e.g., domain experts, data scientists, etc.). It is thus important to know the state of current AutoML toolkits and the challenges ML practitioners face while using those toolkits. In this paper, we first offer a characterization of currently available AutoML toolits by analyzing 37 top AutoML tools and platforms. We find that the top AutoML platforms are mostly cloud-based. Most of the tools are optimized for the adoption of shallow ML models. Second, we present an empirical study of 14.3K AutoML related posts from Stack Overflow (SO) that we analyzed using topic modelling algorithm LDA (Latent Dirichlet Allocation) to understand the challenges of ML practitioners while using the AutoML toolkits. We find 13 topics in the AutoML related discussions in SO. The 13 topics are grouped into four categories: MLOps (43% of all questions), Model (28% questions), Data (27% questions), and Documentation (2% questions). Most questions are asked during Model training (29%) and Data preparation (25%) phases. AutoML practitioners find the MLOps topic category most challenging. Topics related to the MLOps category are the most prevalent and popular for cloud-based AutoML toolkits. Based on our study findings, we provide 15 recommendations to improve the adoption and development of AutoML toolkits.</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"61 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141521640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
An empirical study of fault localization in Python programs Python 程序故障定位实证研究
IF 4.1 2区 计算机科学
Empirical Software Engineering Pub Date : 2024-06-13 DOI: 10.1007/s10664-024-10475-3
Mohammad Rezaalipour, Carlo A. Furia
{"title":"An empirical study of fault localization in Python programs","authors":"Mohammad Rezaalipour, Carlo A. Furia","doi":"10.1007/s10664-024-10475-3","DOIUrl":"https://doi.org/10.1007/s10664-024-10475-3","url":null,"abstract":"<p>Despite its massive popularity as a programming language, especially in novel domains like data science programs, there is comparatively little research about fault localization that targets Python. Even though it is plausible that several findings about programming languages like C/C++ and Java—the most common choices for fault localization research—carry over to other languages, whether the dynamic nature of Python and how the language is used in practice affect the capabilities of classic fault localization approaches remain open questions to investigate. This paper is the first multi-family large-scale empirical study of fault localization on real-world Python programs and faults. Using Zou et al.’s recent large-scale empirical study of fault localization in Java (Zou et al. 2021) as the basis of our study, we investigated the effectiveness (i.e., localization accuracy), efficiency (i.e., runtime performance), and other features (e.g., different entity granularities) of seven well-known fault-localization techniques in four families (spectrum-based, mutation-based, predicate switching, and stack-trace based) on 135 faults from 13 open-source Python projects from the <span>BugsInPy</span> curated collection (Widyasari et al. 2020). The results replicate for Python several results known about Java, and shed light on whether Python’s peculiarities affect the capabilities of fault localization. The replication package that accompanies this paper includes detailed data about our experiments, as well as the tool <span>FauxPy</span> that we implemented to conduct the study.</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"68 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2024-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141531677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信