Information and Software Technology: Latest Articles, Page 6

Data preprocessing for machine learning based code smell detection: A systematic literature review

IF 3.8, Zone 2, Computer Science

Information and Software Technology, Pub Date: 2025-04-25, DOI: 10.1016/j.infsof.2025.107752

Fábio do Rosario Santos, Ricardo Choren

Context: Detecting code smells using Machine Learning presents inherent challenges due to the unbalanced nature of the problem and its susceptibility to interpretation biases. It is a data-driven process for code quality assurance that aims to detect whether a given piece of code violates fundamental design principles in a way that negatively impacts design quality. Researchers in the field have been advised to carefully analyze the internal mechanisms of forecasting models before interpreting the results they generate.

Objective: The review aims to summarize and synthesize studies that used Data Preprocessing techniques for Machine Learning-based code smell detection. It also investigates the relationship between Data Preprocessing and more advanced Machine Learning techniques, i.e., Ensemble Methods, Deep Learning, and Transfer Learning.

Method: To obtain insights into Data Preprocessing for Machine Learning-based code smell detection solutions, we employed a systematic approach, identifying and analyzing 69 studies published up to November 2023.

Results: In Data Preprocessing, Data Balancing techniques, Feature Selection techniques, and Filtering emerged as prominent strategies. SMOTE was the most frequently used Data Balancing technique, while Autoencoder, Chi-square, Gain Ratio, Information Gain, PCA, and CFS were notable choices for Feature Selection. Tokenization and Syntax Trees were commonly paired with Deep Learning or Transfer Learning methods. Normalization and Standardization were implemented for Data Scaling. Regarding Machine Learning techniques used with Data Preprocessing, 46% of the combinations occurred with at least one Ensemble Method. Deep Learning was employed in 37% of cases. Data Balancing techniques combined with Deep Learning (32%) or Ensemble Methods (19%) were used most.

Conclusion: The findings of this SLR are an integrated and comprehensive source of information regarding data preparation practices, challenges, and solutions for Machine Learning-based code smell detection, emphasizing the continuous endeavor towards more resilient, contextually sensitive, and developer-informed strategies within this dynamic field.

Volume 184, Article 107752.

Citations: 0
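The review above names SMOTE as the most frequently used Data Balancing technique and Standardization as one Data Scaling option. The sketch below illustrates both ideas in plain Python. It is a simplified, SMOTE-style interpolation and not the reference implementation; the function names, the two-metric feature vectors, and the parameter defaults are all assumptions made for illustration.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` inside the minority class (itself excluded)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def standardize(column):
    """Z-score standardization: zero mean, unit (population) standard deviation.
    Assumes the column is not constant, otherwise the division fails."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    return [(x - mean) / std for x in column]


# Hypothetical metric vectors (lines of code, cyclomatic complexity)
# for the rare "smelly" class.
smelly = [(120.0, 14.0), (150.0, 18.0), (135.0, 16.0)]
new_points = smote_like_oversample(smelly, n_new=4)
scaled_loc = standardize([p[0] for p in smelly + new_points])
```

Because each synthetic point lies on the segment between two real minority samples, it stays inside the region the minority class already occupies. In practice the oversampling should be applied to the training split only, so that synthetic points do not leak into evaluation data.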

A systematic literature review on task recommendation systems for crowdsourced software engineering

IF 3.8, Zone 2, Computer Science

Information and Software Technology, Pub Date: 2025-04-22, DOI: 10.1016/j.infsof.2025.107753

Shashiwadana Nirmani, Mojtaba Shahin, Hourieh Khalajzadeh, Xiao Liu

Context: Crowdsourced Software Engineering (CSE) offers outsourcing work to software practitioners by leveraging a global online workforce. However, these software practitioners struggle to identify suitable tasks due to the variety of options available. Hence, there have been a growing number of studies on introducing recommendation systems to recommend CSE tasks to software practitioners.

Objective: The goal of this study is to analyze the existing CSE task recommendation systems, investigating their extracted data, recommendation methods, key advantages and limitations, recommended task types, the use of human factors in recommendations, popular platforms, and features used to make recommendations.

Methods: This SLR was conducted according to Kitchenham and Charters’ guidelines. We used manual and automatic search strategies, with no time limitation on the search for relevant papers.

Results: We selected 65 primary studies for data extraction, analysis, and synthesis based on our predefined inclusion and exclusion criteria. Based on our data analysis results, we classified the extracted information into four categories according to the data acquisition sources: Software Practitioner’s Profile, Task or Project, Previous Contributions, and Direct Data Collection. We also organized the proposed recommendation systems into a taxonomy and identified key advantages, such as increased performance, accuracy, and optimized solutions. In addition, we identified the limitations of these systems, such as inadequate or biased recommendations and lack of generalizability. Our results revealed that human factors play a major role in CSE task recommendation. Further, we identified five popular task types recommended, popular platforms, and their features used in task recommendation. We also provided recommendations for future research directions.

Conclusion: This SLR provides insights into current trends, gaps, and future research directions in CSE task recommendation systems, such as the need for comprehensive evaluation, standardized evaluation metrics, and benchmarking in future studies, and for transferring knowledge from other platforms to address the cold-start problem.

Volume 184, Article 107753.

Citations: 0

Fundamental requirements of Digital Twins for production system in Oil and Gas Industry: A systematic literature review

IF 3.8, Zone 2, Computer Science

Information and Software Technology, Pub Date: 2025-04-15, DOI: 10.1016/j.infsof.2025.107742

Ricardo C. Belo, Marcelo S. Pimenta, Tarciso T. Salvador, Rafael H. Petry, Mara Abel

Context: The oil and gas industry is adopting Digital Twins as a significant step in a continuous digital transformation. A Digital Twin can provide intelligent support to main activities related directly or indirectly to oil and gas production, like operations monitoring, process optimization, failure prediction, simulation of what-if scenarios, and safety improvement.

Situation: Specifications of requirements of a Digital Twin (DT) in the oil and gas domain found in the literature are usually presented informally, using natural and often ambiguous language. Most of the requirements need to be extracted from descriptions of DT characteristics and functionality presented in articles.

Objective: This systematic literature review aims to summarize the existing evidence concerning the requirements of Digital Twins tailored explicitly for oil and gas production systems. By thoroughly analyzing published literature, the study seeks to uncover the requirements, properties, and constraints essential for the successful implementation of Digital Twins in this domain.

Method: Through a systematic literature review, the study focused on rigorously identifying common functionality, ubiquitous characteristics, and some emerging trends related to Digital Twin requirements in oil and gas production systems.

Results: From the initial 939 articles, the review selected 94 relevant studies, focusing on described requirements and on application-specific features of Digital Twins. Among the selected papers, 28 were analyzed and reviewed, focusing on specific requirements for Digital Twins for production systems within the industry, shedding light on 17 functional and 7 non-functional requirements common to many DT specifications and implementations.

Conclusion: Our findings underscore the importance of comprehensively understanding and outlining the essential requirements for Digital Twins within the intricate landscape of production systems in the industry. By elucidating key features and properties of DTs, this study contributes significantly to enhancing the efficacy and implementation of new Digital Twins, or the evaluation of existing Digital Twins.

As a result, we have identified some important requirements, specifically in the O&G domain. We analyzed some issues related to the software needs of DTs in the O&G domain, highlighting which DT requirements are usually specified or informally described. This study allows us to identify primary studies in both the DT for O&G and Requirements Engineering (RE) fields. Even though the requirements described here have been collected from DT works in the O&G domain, many of these requirements are also applicable to other domains, like many areas of engineering and manufacturing. Finally, it aims to offer a clear understanding of …

Volume 184, Article 107742.

Citations: 0

SemiRALD: A semi-supervised hybrid language model for robust Anomalous Log Detection

IF 3.8, Zone 2, Computer Science

Information and Software Technology, Pub Date: 2025-04-11, DOI: 10.1016/j.infsof.2025.107743

Yicheng Sun, Jacky Wai Keung, Zhen Yang, Shuo Liu, Hi Kuen Yu

Context: Deep learning-based Anomalous Log Detection (DALD) tools are critical for software reliability, but current approaches face challenges, including information loss during log parsing, reliance on large labeled datasets, and fragility in low-resource scenarios.

Objective: To overcome the above limitations, we propose SemiRALD, a semi-supervised learning-based robust ALD approach that leverages a Large Language Model (LLM) for log parsing, enhancing both flexibility and accuracy. It uses a hybrid language model to repeatedly fit the samples and generate pseudo-labels, thereby training DALD models with limited resources and facilitating efficient anomaly detection tasks.

Method: In detail, SemiRALD uses ChatGPT and in-context learning for automated log parsing, thereby improving log integrity during parsing. Subsequently, it harnesses a semi-supervised learning framework and our proposed hybrid language model to remedy the performance degeneration caused by low-resource restrictions in practice. Semi-supervised learning requires only a small amount of labeled data throughout the entire process, while the hybrid language model is built on the architecture of RoBERTa and an attention-based BiLSTM.

Results: Experiments on the HDFS and BGL datasets demonstrate that SemiRALD achieves an average F1-score improvement of 7.3% and 8.2%, respectively, over seven benchmark models. On small-scale datasets (0.1% of the original size), SemiRALD outperforms competitors by 31.4% and 46.0% in F1-score, respectively. Its consistent performance across diverse datasets highlights its generalizability and robustness.

Conclusion: SemiRALD is capable of handling anomaly detection tasks in both large-scale and low-resource datasets, delivering significant advancements in anomalous log detection and offering robust, adaptable solutions to address prevalent challenges in the field of software reliability engineering.

Volume 183, Article 107743.

Citations: 0
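The SemiRALD abstract above describes repeatedly fitting a model and generating pseudo-labels from a small labeled set. A minimal sketch of that general self-training loop follows. It deliberately swaps the paper's RoBERTa and BiLSTM hybrid model for a one-dimensional nearest-centroid classifier so that it runs with the standard library only; the function names, the confidence `margin`, and the toy feature values are assumptions, not details from the paper.

```python
def centroid(values):
    return sum(values) / len(values)


def self_train(labeled, unlabeled, rounds=3, margin=1.0):
    """Generic pseudo-labeling loop.

    labeled:   list of (feature, label), label 0 = normal, 1 = anomalous
    unlabeled: list of features
    Returns (enlarged labeled set, features that stayed unlabeled).
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        # "Fit" the stand-in model: one centroid per class.
        c0 = centroid([x for x, y in labeled if y == 0])
        c1 = centroid([x for x, y in labeled if y == 1])
        remaining = []
        for x in pool:
            d0, d1 = abs(x - c0), abs(x - c1)
            # Accept a pseudo-label only when one class is clearly closer.
            if abs(d0 - d1) >= margin:
                labeled.append((x, 0 if d0 < d1 else 1))
            else:
                remaining.append(x)
        if len(remaining) == len(pool):
            break  # no confident prediction left, stop early
        pool = remaining
    return labeled, pool


# Hypothetical 1-D log features: two labeled samples, four unlabeled ones.
seed_labels = [(1.0, 0), (9.0, 1)]
enlarged, undecided = self_train(seed_labels, [1.5, 2.0, 8.0, 5.2])
```

The confidence threshold is the key design choice: a low margin grows the training set quickly but risks reinforcing wrong pseudo-labels, while a high margin leaves ambiguous samples (here the value 5.2) unlabeled.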

Validation of information architecture: Cross-methodological comparison of tree testing variants and prototype user testing

IF 3.8, Zone 2, Computer Science

Information and Software Technology, Pub Date: 2025-04-10, DOI: 10.1016/j.infsof.2025.107740

Eduard Kuric, Peter Demcak, Matus Krajcovic

Context: Tree testing is an established user testing method applied by software professionals to validate that an information architecture is logically navigable by users. We identify a methodological gap caused by previously unexamined non-uniformity between tree testing methods and software.

Objective: To reveal the role of user interface representations in tree testing, this research compares the results of 3 commonly used tree testing variants. To assess how indicative they are of the user’s interaction with an information architecture implemented in an actual user interface, and to issue methodological recommendations, a comparison with varied high-fidelity prototypes was performed.

Methods: Two between-subject studies were conducted to obtain a new dataset of users navigating an information architecture in tree testing and in interactive user interface prototypes. Data from 180 participants and 1800 task completions across 6 experimental conditions (3 tree testing and 3 prototype user interface variants) was evaluated quantitatively and qualitatively.

Results: Significant differences were found between results yielded by different tree testing method variants, and in how well they approximate user navigation in the same information architecture in high-fidelity prototypes. Implications for selection of the tree testing variant are proposed in the context of the evaluated information architecture, with plausible broader applicability for tree testing methodology. Evidence supports the tree testing variant with the highest visibility of previous navigation choices and direct controls over their reversal as the most accurate.

Conclusion: The presented findings can contribute to the design of software information architecture based on more accurate early validation, owing to tree testing that simulates less artificial user behavior, more reflective of the user’s navigation in the eventual user interface. We hope this will further the discussion and research leading to more holistic tree testing methodologies in the future.

Volume 183, Article 107740.

Citations: 0

RaxCS: Towards cross-language code summarization with contrastive pre-training and retrieval augmentation

IF 3.8, Zone 2, Computer Science

Information and Software Technology, Pub Date: 2025-04-10, DOI: 10.1016/j.infsof.2025.107741

Kaiyuan Yang, Junfeng Wang, Zihua Song

Context: Code summarization is the task of generating a concise natural language description of a code snippet. Recent efforts have been made to boost the performance of code summarization from various perspectives, e.g., retrieving external information or introducing large transformer-based models, and have thus achieved promising performance for one specific programming language. When dealing with rapidly expanding cross-language source code datasets, existing approaches suffer from two issues: (1) the difficulty of building a universal code representation for multiple languages; (2) poorer performance for low-resource languages.

Objective: To cope with these issues, we propose a novel code summarization approach named RaxCS, which aims to perform code summarization across multiple languages and improve accuracy for low-resource languages by leveraging cross-language knowledge.

Methods: We exploit pre-trained models with a contrastive learning objective to build a unified code representation across multiple languages. To fully mine the external knowledge across programming languages, we design a hybrid retrieval module to search for functionally equivalent code and its corresponding comment to serve as preliminary information. Finally, we employ a decoder-only transformer model to fuse contextual information, which guides the process of generating summaries.

Results: Extensive experiments demonstrate that (1) RaxCS outperforms the state of the art on cross-language code summarization (i.e., RaxCS scores 4.39% higher in terms of the BLEU metric and 8.65% higher in terms of BERTScore). (2) For low-resource languages, RaxCS can boost code summarization performance by a significant margin (e.g., 6.93% in terms of BLEU for Ruby) with cross-language retrieval.

Conclusion: This paper introduces a cross-language code summarization model, which uses contrastive pre-training and cross-language retrieval. Both are beneficial for incorporating cross-language knowledge to advance code summarization performance. The experimental results demonstrate that RaxCS is effective in generating accurate code summaries, particularly for low-resource languages.

Volume 183, Article 107741.

Citations: 0
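The RaxCS abstract above mentions a hybrid retrieval module that finds functionally equivalent code and its comment across languages, without saying how the module scores candidates. One plausible reading, shown below purely as an assumption, mixes a lexical score (Jaccard token overlap) with a dense score (cosine similarity of embedding vectors). The weighting parameter `alpha`, the dictionary layout, and the toy vectors are hypothetical and not taken from the paper.

```python
import math


def jaccard(a_tokens, b_tokens):
    """Lexical overlap between two token lists."""
    a, b = set(a_tokens), set(b_tokens)
    return len(a & b) / len(a | b) if a | b else 0.0


def cosine(u, v):
    """Cosine similarity between two equally sized vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def hybrid_retrieve(query, corpus, alpha=0.5):
    """Return the corpus entry with the highest mixed score.
    alpha weights the lexical score, (1 - alpha) the embedding score."""
    def score(entry):
        return (alpha * jaccard(query["tokens"], entry["tokens"])
                + (1 - alpha) * cosine(query["vector"], entry["vector"]))
    return max(corpus, key=score)


# Toy Python corpus with comments; the vectors stand in for learned embeddings.
corpus = [
    {"tokens": ["def", "add", "a", "b", "return"], "vector": [0.9, 0.1],
     "comment": "Add two numbers."},
    {"tokens": ["def", "read", "path", "open", "return"], "vector": [0.1, 0.9],
     "comment": "Read a file."},
]
# A Ruby-like query: few shared tokens, but a nearby embedding.
query = {"tokens": ["def", "sum", "x", "y", "end"], "vector": [0.8, 0.2]}
best = hybrid_retrieve(query, corpus)
```

In this toy case the lexical scores tie, so the embedding similarity decides the match. That is the situation a shared cross-language representation is meant to handle, since token overlap between different programming languages is often low.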

Special issue on causal modeling and inference in SE

IF 3.8, Zone 2, Computer Science

Information and Software Technology, Pub Date: 2025-04-10, DOI: 10.1016/j.infsof.2025.107754

Julien Siebert, Adam Trendowicz, Gregor Gössler, Hironori Washizaki, Michael Kläs, Martin Shepperd

Citations: 0

VDMAF: Cross-language source code vulnerability detection using multi-head attention fusion

IF 3.8, Zone 2, Computer Science

Information and Software Technology, Pub Date: 2025-04-09, DOI: 10.1016/j.infsof.2025.107739

Yang Li, Qin Luo, Peng Wu, Hongdi Zheng

Context: Detecting potential vulnerabilities is critical for ensuring the stability and reliability of software systems. Traditional static detection methods fall short in accuracy and efficiency. Furthermore, existing deep learning-based vulnerability detection models typically rely on a single sequence or graph embedding method, neglecting the semantic and structural information present in the code. With the diversification of software development environments, systems often involve multiple programming languages. This limits the effectiveness of existing vulnerability detection methods when handling cross-language code.

Objective: To solve these problems, we propose a more effective and general vulnerability detection framework, VDMAF (Cross-Language Source Code Vulnerability Detection Using Multi-Head Attention Fusion).

Methods: The method extracts unified and standardized feature representations. It uses a multi-head attention module to fuse sequence features and graph structural features. First, an improved global consistent labeling mechanism is introduced, which improves data representation through threshold-based label augmentation. Second, the method uses sequence embedding to extract local semantic features of the code. The code is converted into a unified, standardized graph structure. Then, a graph neural network is used to extract features. Finally, the sequence and graph features are fused using the multi-head attention module, followed by classification with a bidirectional LSTM-based recurrent neural network.

Results: VDMAF has been evaluated on three vulnerability datasets across different programming languages and granularities, demonstrating better performance across all metrics compared to baseline models, with F1 scores of 98.9%, 65.3%, and 56.8%.

Conclusion: The proposed VDMAF outperforms state-of-the-art models, exhibiting better generality and scalability, thus showing greater potential in vulnerability detection tasks.

Volume 183, Article 107739.

Citations: 0
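The VDMAF abstract above says that sequence features and graph features are fused with a multi-head attention module, but gives no architectural detail. The sketch below shows the underlying mechanism only: scaled dot-product attention applied per head over the pair {sequence slice, graph slice}. It is a strong simplification that uses identity projections instead of learned weight matrices, and the function names and toy vectors are assumptions made for illustration.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    weights = softmax(
        [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    )
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def split_heads(vector, n_heads):
    """Cut a feature vector into n_heads equally sized slices."""
    size = len(vector) // n_heads
    return [vector[i * size:(i + 1) * size] for i in range(n_heads)]


def fuse(seq_feat, graph_feat, n_heads=2):
    """Each head attends from the sequence slice over the sequence and graph
    slices; the head outputs are concatenated into one fused vector."""
    fused = []
    for s, g in zip(split_heads(seq_feat, n_heads), split_heads(graph_feat, n_heads)):
        fused.extend(attention(s, [s, g], [s, g]))
    return fused


# Hypothetical 4-dimensional sequence and graph features, two heads of size 2.
fused = fuse([1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.5])
```

A trained model would apply learned query, key, value, and output projections per head, which is what lets different heads specialize. Here the heads differ only in which slice of the vectors they see.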

Unraveling the pain points of domain modeling

IF 3.8, Zone 2, Computer Science

Information and Software Technology, Pub Date: 2025-04-09, DOI: 10.1016/j.infsof.2025.107736

Isadora Valle, Tiago Prince Sales, Eduardo Guerra, Maya Daneva, Renata Guizzardi, Luiz Olavo Bonino da Silva Santos, Henderik A. Proper, Giancarlo Guizzardi

Conceptual models offer numerous benefits but require significant investments, so modelers must strive to balance costs and benefits. Understanding the modeling process and the frustrations experienced by modelers can provide valuable insights for this assessment. While research acknowledges certain instances of modelers’ dissatisfaction, its scope often limits detailed examination. This study seeks to identify and analyze the main pain points associated with domain modeling through a five-phase empirical study using a multi-method approach. We identified 71 pain points, synthesized them to 41, and prioritized 16 as the most significant and prevalent in domain modeling. We then refined, documented, and exemplified the prioritized pain points, analyzed their potential causes, and discussed their practical implications. Our findings provide valuable insights for improving modelers’ experiences and optimizing the modeling process.

Volume 183, Article 107736.

Citations: 0

Don’t settle for the first! How many GitHub Copilot solutions should you check?

IF 3.8, Zone 2, Computer Science

Information and Software Technology, Pub Date: 2025-04-08, DOI: 10.1016/j.infsof.2025.107737

Julian Oertel, Jil Klünder, Regina Hebig

Context: With the integration of generative artificial intelligence (GenAI) tools such as GitHub Copilot into development processes, developers can be supported when writing code.

Objectives: As GitHub Copilot has a feature to provide up to ten solutions at once, we explore how developers should approach those solutions, with the goal of providing recommendations to achieve suitable trade-offs between finding correct solutions and checking solutions.

Methods: In this study, we analyze a total of 2025 coding problems provided by LeetCode and 17048 solutions to these problems generated by GitHub Copilot in Python. We focus on three key issues: firstly, whether it is beneficial to consider multiple solutions; secondly, the impact of the position of a solution; and thirdly, the number of solutions that should be checked by a developer.

Results: Overall, our results point to the following observations: (1) solutions are not less likely to be correct if they appear at later positions; (2) when looking for a solution to a common problem, checking four to five solutions is generally enough; (3) novel or difficult problems are unlikely to be solved by GitHub Copilot; (4) skipping the first solution is advised when considering only one solution, as the first solution is less likely to be correct; and (5) checking all solutions is necessary to not miss correct solutions, but the effort is usually not justified.

Conclusion: Based on our study, we conclude that there is potential for improvement in better supporting developers. For instance, there are few cases where ten generated solutions provide more value than fewer solutions. Depending on the use scenario, it could be more useful if GitHub Copilot allowed developers to request a single, comprehensive solution.

Volume 183, Article 107737.

Citations: 0
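The Copilot study above asks how many generated solutions are worth checking and whether position matters. The two quantities involved can be computed as in the following sketch, which assumes each problem comes with an ordered list of correctness flags (one per suggestion). The data below is made up for illustration and does not reproduce the study's dataset or its results.

```python
def solved_within_k(results, k):
    """Fraction of problems with at least one correct solution among the first k."""
    hits = sum(1 for flags in results if any(flags[:k]))
    return hits / len(results)


def correct_rate_at_position(results, position):
    """Share of correct solutions at a 0-based position,
    counted over the problems that have a solution at that position."""
    flags = [r[position] for r in results if len(r) > position]
    return sum(flags) / len(flags)


# Hypothetical correctness flags, one list per problem, in suggestion order.
results = [
    [False, True, True],
    [False, False, False, True],
    [True, True],
    [False, False],
]

coverage = {k: solved_within_k(results, k) for k in (1, 2, 4)}
first_rate = correct_rate_at_position(results, 0)
second_rate = correct_rate_at_position(results, 1)
```

Plotting the coverage value against k shows where the curve flattens, which is the point beyond which checking further solutions rarely pays off. Comparing the per-position rates shows whether earlier suggestions are actually more reliable than later ones.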