Robson Keemps , Kleinner Farias , Rafael Kunst , Carlos Carbonera , Willian Bolzan
{"title":"SmellDSL: A domain-specific language to assist developers in specifying code smell patterns","authors":"Robson Keemps , Kleinner Farias , Rafael Kunst , Carlos Carbonera , Willian Bolzan","doi":"10.1016/j.infsof.2025.107760","DOIUrl":"10.1016/j.infsof.2025.107760","url":null,"abstract":"<div><h3>Context:</h3><div>The current literature has widely investigated <em>code smell patterns</em> over the years, which describe specific source code characteristics that indicate potential problems or areas for improvements. Empirical studies suggest that (i) metric-based strategies for code smell detection are not effective and overload the developers with false positives; (ii) code smell specifications are informal, ambiguous, and not supported by traditional IDEs like Eclipse platform; and (iii) the identification of code smells depends on the perception of software development teams.</div></div><div><h3>Objective:</h3><div>This article, therefore, proposes SmellDSL, a tool-supported domain-specific language to assist developers when specifying code smell patterns. SmellDSL benefits developers by introducing Eclipse built-in constructs that enable the specification of team-sensitive code smell patterns. 
Developers can write rules to specify single or composite architectural problems (<em>e.g., Misplaced Concerns</em>) and suggest code refactorings regarding severe architectural degradation symptoms.</div></div><div><h3>Method:</h3><div>We conducted an empirical study with 35 developers who specified eight code smells using SmellDSL, generating 280 evaluation scenarios.</div></div><div><h3>Results:</h3><div>The main results, supported by statistical tests, suggest that SmellDSL requires low effort to specify code smell patterns and promotes a high rate of correct code smell specifications.</div></div><div><h3>Conclusion:</h3><div>We contribute a domain-specific language for the specification of code smell patterns and empirical evidence of its usefulness, and we outline research challenges worth investigating by the research community.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"184 ","pages":"Article 107760"},"PeriodicalIF":3.8,"publicationDate":"2025-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143946830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fernando Uyaguari , Silvia T. Acuña , John W. Castro , Oscar Dieste , Natalia Juristo
{"title":"Reliability of systematic literature reviews on test-driven development","authors":"Fernando Uyaguari , Silvia T. Acuña , John W. Castro , Oscar Dieste , Natalia Juristo","doi":"10.1016/j.infsof.2025.107762","DOIUrl":"10.1016/j.infsof.2025.107762","url":null,"abstract":"<div><h3>Context</h3><div>Test-driven development (TDD) is a software development technique studied empirically over the last few decades. There are several systematic literature reviews (SLRs) on TDD. The reliability of these studies should not be taken for granted because SLRs are highly dependent on the context and researcher decision-making.</div></div><div><h3>Objective</h3><div>This study determines, analyses and synthesizes the limited overlap between SLRs on TDD and its influence on the conclusions and results with respect to the code quality and developer productivity response variables.</div></div><div><h3>Method</h3><div>A tertiary study was conducted to source SLRs on TDD from the scientific literature, and the primary studies referenced in each SLR were analysed. We compared SLRs with similar objectives, SLRs with similar response variables, and all SLRs. We analysed the differences between the selected primary studies and their impact on the conclusions and results.</div></div><div><h3>Results</h3><div>The overlap between SLRs with similar response variables (54 %) is greater than between SLRs with similar objectives (36 %). Only three per cent of the primary studies are included in all eight analysed SLRs. Conclusions regarding external quality and productivity may vary across the SLRs on TDD. While we found that SLR results are similar, these results may differ when authors classify primary studies by experiments and case studies.</div></div><div><h3>Conclusion</h3><div>SLRs with similar response variables tend to be more repeatable than SLRs with similar objectives and SLRs addressing the same topic. 
The SLR authors’ criteria with respect to the consistency of evidence may influence the conclusions of SLRs on TDD. The results of SLRs where all primary studies count equally appear to be consistent. The SLR authors’ criteria for selecting primary studies may influence the results classified by case studies and experiments.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"184 ","pages":"Article 107762"},"PeriodicalIF":3.8,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143917469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring the use of LLMs for the selection phase in systematic literature studies","authors":"Lukas Thode , Umar Iftikhar , Daniel Mendez","doi":"10.1016/j.infsof.2025.107757","DOIUrl":"10.1016/j.infsof.2025.107757","url":null,"abstract":"<div><h3>Context:</h3><div>Systematic literature studies, such as secondary studies, are crucial to aggregate evidence. An essential part of these studies is the selection phase of relevant studies. This, however, is time-consuming, resource-intensive, and error-prone as it highly depends on manual labor and domain expertise. The increasing popularity of Large Language Models (LLMs) raises the question to what extent these manual study selection tasks could be supported in an automated manner.</div></div><div><h3>Objectives:</h3><div>In this manuscript, we report on our effort to explore and evaluate the use of state-of-the-art LLMs to automate the selection phase in systematic literature studies.</div></div><div><h3>Method:</h3><div>We evaluated LLMs for the selection phase using two published systematic literature studies in software engineering as ground truth. Three prompts were designed and applied across five LLMs to the studies’ titles and abstracts based on their inclusion and exclusion criteria. Additionally, we analyzed combining two LLMs to replicate a practical selection phase. We analyzed recall and precision and reflected upon the accuracy of the LLMs, and whether the ground truth studies were conducted by early career scholars or by more advanced ones.</div></div><div><h3>Results:</h3><div>Our results show a high average recall of up to 98% combined with a precision of 27% in a single LLM approach and an average recall of 99% with a precision of 27% in a two-model approach replicating a two-reviewer procedure. 
Further, the Llama 2 models showed the highest average recall (98%) across all prompt templates and datasets, while GPT4-turbo had the highest average precision (72%).</div></div><div><h3>Conclusions:</h3><div>Our results demonstrate how LLMs could support a selection phase in the future. We recommend a two-LLM approach to achieve a higher recall. However, we also critically reflect that further studies using other models and prompts on more datasets are required to strengthen confidence in the presented approach.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"184 ","pages":"Article 107757"},"PeriodicalIF":3.8,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143943003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
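The recall and precision figures reported in the abstract above can be illustrated with a minimal sketch. The paper IDs are hypothetical toy data, and the combination rule (a paper is kept when either model includes it) is an assumption made for illustration; the abstract does not state how the two models' decisions were merged.

```python
from typing import Set, Tuple


def precision_recall(selected: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) of a screening decision against ground truth."""
    true_positives = len(selected & relevant)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy data: paper IDs, not taken from the study.
relevant = {"p1", "p2", "p3", "p4"}   # ground-truth included papers
model_a = {"p1", "p2", "p5", "p6"}    # papers the first model would include
model_b = {"p2", "p3", "p4", "p7"}    # papers the second model would include

# Assumed combination rule: keep a paper if either model includes it.
combined = model_a | model_b

single_precision, single_recall = precision_recall(model_a, relevant)
combined_precision, combined_recall = precision_recall(combined, relevant)
```

Under this assumed union rule, recall can only stay the same or rise when a second model is added, while precision may fall because every extra false positive is also kept. This matches the general pattern of high recall paired with lower precision described in the abstract, but it is not a reproduction of the study's procedure.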
Jingbo Yang , Xin Ji , Wenjun Wu , Xingchuang Liao , Kui Zhang , Linxiao Dong , Nan Xiang , Ren Jian
{"title":"BERT4Anno: An annotation misuse detection method for Java","authors":"Jingbo Yang , Xin Ji , Wenjun Wu , Xingchuang Liao , Kui Zhang , Linxiao Dong , Nan Xiang , Ren Jian","doi":"10.1016/j.infsof.2025.107763","DOIUrl":"10.1016/j.infsof.2025.107763","url":null,"abstract":"<div><h3>Context</h3><div>Developers leverage Java annotations to implement functions such as creating objects and operating databases. However, mastering annotations is challenging, and misused annotations might cause an application to crash. Although state-of-the-art techniques attempt to solve this problem, they do not provide conclusions on Java annotation misuse types, nor do they leverage project-level information, which results in low efficiency in detecting annotation misuses.</div></div><div><h3>Objective</h3><div>To summarize Java annotation misuse types and provide a more efficient method for detecting misused annotations.</div></div><div><h3>Method</h3><div>Firstly, to categorize Java annotation misuses, we conduct an empirical study and curate 321 annotation misuse questions from Stack Overflow. Secondly, to better detect these misuses, we propose a BERT-based method, BERT4Anno, which takes project structure and resource configuration into account—factors often neglected by state-of-the-art methods. In BERT4Anno, a novel Annotation Usage Project Representation (AUPR) technique is designed to leverage the information of the interconnections among source code, configuration and project structure. Moreover, an AUPR-based Named Entity Recognition (ANER) task by fine-tuning BERT is devised to learn annotation usage knowledge. With the knowledge, the fine-tuned model can detect misused annotations. 
Finally, to evaluate our proposed method, two datasets, mainly curated from GitHub and comprising 404 Java projects/files with annotation misuse instances, are used for the experiments.</div></div><div><h3>Results</h3><div>The Java annotation misuses are categorized into 14 types based on how the curated questions violate the correct annotation usage knowledge. The comparison experiment demonstrates the superior performance of our method over state-of-the-art baselines in terms of precision, recall, and F1 score, while our visualization technique provides insightful interpretations of the mechanism underlying the model’s outperformance.</div></div><div><h3>Conclusion</h3><div>By leveraging the project-level information, our proposed method can predict the appropriate types and positions of annotations and subsequently identify the misused annotations, making the detection more efficient.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"184 ","pages":"Article 107763"},"PeriodicalIF":3.8,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143931707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Omar Haggag , Alessandro Pedace , Shidong Pan , John Grundy
{"title":"An analysis of privacy regulations and user concerns of finance mobile applications","authors":"Omar Haggag , Alessandro Pedace , Shidong Pan , John Grundy","doi":"10.1016/j.infsof.2025.107756","DOIUrl":"10.1016/j.infsof.2025.107756","url":null,"abstract":"<div><h3>Context:</h3><div>Financial applications handle sensitive data, including personal details, banking information, and transaction histories, making them prime targets for cyber-attacks. As privacy concerns grow, users and regulators are increasingly analyzing how these apps manage data in different legal contexts.</div></div><div><h3>Objective:</h3><div>This study examines user privacy concerns and assesses the impact of privacy regulations on mobile financial applications in Germany, Australia, and the United States. It aims to evaluate how laws such as the GDPR in the EU, the Privacy Act in Australia, and various U.S. state and federal laws shape app privacy policies. Additionally, the study explores the readability and accessibility of privacy policies.</div></div><div><h3>Methods:</h3><div>User reviews from app stores were analyzed to identify recurring privacy issues and regional differences in concerns. The study also reviewed privacy laws in the EU, Australia, and the U.S. to assess their influence on financial app policies. To analyze the user-friendliness of privacy documents, a readability analysis was conducted using the Flesch Reading Ease score and estimated reading times.</div></div><div><h3>Results:</h3><div>The findings revealed that users are highly concerned about the handling of their data, with significant demand for greater transparency and more robust privacy protections. Regional differences in privacy concerns were identified, with varying levels of engagement with privacy issues in each region. 
The study also found significant discrepancies in the readability of privacy policies, with many policies proving too complex for the average user to understand.</div></div><div><h3>Conclusion:</h3><div>The study concludes that financial app developers need to simplify their privacy policies and improve transparency to build user trust. It also emphasizes the need for stronger regulatory frameworks to address evolving privacy challenges. Recommendations are made for developers and policymakers to enhance data protection and improve user experience in financial services.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"184 ","pages":"Article 107756"},"PeriodicalIF":3.8,"publicationDate":"2025-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143891892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ana Carolina Moises de Souza, Daniela Soares Cruzes, Letizia Jaccheri, Tangni Cunningham Dahl-Jørgensen
{"title":"Promoting social sustainability within software development through the lens of organizational readiness for change theory","authors":"Ana Carolina Moises de Souza, Daniela Soares Cruzes, Letizia Jaccheri, Tangni Cunningham Dahl-Jørgensen","doi":"10.1016/j.infsof.2025.107755","DOIUrl":"10.1016/j.infsof.2025.107755","url":null,"abstract":"<div><h3>Context:</h3><div>Software’s negative impact on society underscores the need to integrate social sustainability into software development. However, effective implementation and practitioners’ readiness for this change remain unclear, requiring further investigation.</div></div><div><h3>Objective:</h3><div>This research aims to understand the conditions that promote organizational readiness for change in the integration of social sustainability into software development from the perspective of software practitioners.</div></div><div><h3>Methods:</h3><div>We conducted multiple case studies containing three cases: (A) an exploratory study with 11 practitioners from four organizations; (B) the proposal and validation of a Walkthrough intervention with 9 students (pilot) and 19 practitioners (questionnaire); and (C) a focus group with 6 practitioners in one organization providing feedback on the Walkthrough.</div></div><div><h3>Results:</h3><div>Four facilitators and barriers were identified as key preconditions for social sustainability integration. Statistical analysis showed that the perceived usefulness of the Walkthrough was significantly higher than intentional behavior indicating strong perceived value despite moderate intention to adopt the practices.</div></div><div><h3>Conclusion:</h3><div>This study identified the key determinants that promote organizational readiness to integrate social sustainability into software development. 
By proposing a conceptual model, it contributes to helping organizations leverage facilitators, overcome barriers, and offer actionable recommendations for both practice and research.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"184 ","pages":"Article 107755"},"PeriodicalIF":3.8,"publicationDate":"2025-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143891910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Giuseppe Colavito, Filippo Lanubile, Nicole Novielli
{"title":"Benchmarking large language models for automated labeling: The case of issue report classification","authors":"Giuseppe Colavito, Filippo Lanubile, Nicole Novielli","doi":"10.1016/j.infsof.2025.107758","DOIUrl":"10.1016/j.infsof.2025.107758","url":null,"abstract":"<div><h3>Context:</h3><div>Issue labeling is a fundamental task for software development as it is critical for the effective management of software projects. This practice involves assigning a label to issues, such as <em>bug</em> or feature request, denoting a task relevant to the project management. To date, large language models (LLMs) have been proposed to automate this task, including both fine-tuned BERT-like models and zero-shot GPT-like models.</div></div><div><h3>Objectives:</h3><div>In this paper, we investigate which LLMs offer the best trade-off between performance, response time, hardware requirements, and quality of the responses for issue report classification.</div></div><div><h3>Methods:</h3><div>We design and execute a comprehensive benchmark study to assess 22 generative decoder-only LLMs and 2 baseline BERT-like encoder-only models, which we evaluate on two different datasets of GitHub issues.</div></div><div><h3>Results:</h3><div>Generative LLMs demonstrate potential for zero-shot classification. However, their performance varies significantly across datasets and they require substantial computational resources for deployment. In contrast, BERT-like models show more consistent performance and lower resource requirements.</div></div><div><h3>Conclusions:</h3><div>Based on the empirical evidence provided in this study, we discuss implications for researchers and practitioners. 
In particular, our results suggest that fine-tuning BERT-like encoder-only models achieves consistent, state-of-the-art performance across datasets even when only a small amount of labeled data is available for training.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"184 ","pages":"Article 107758"},"PeriodicalIF":3.8,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143891909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data preprocessing for machine learning based code smell detection: A systematic literature review","authors":"Fábio do Rosario Santos, Ricardo Choren","doi":"10.1016/j.infsof.2025.107752","DOIUrl":"10.1016/j.infsof.2025.107752","url":null,"abstract":"<div><h3>Context:</h3><div>Detecting code smells using Machine Learning presents inherent challenges due to the unbalanced nature of the problem and susceptibility to interpretation biases. It is a data-driven process for code quality assurance that aims to detect if a given piece of code presents a fundamental design principles violation that negatively impacts design quality. Researchers in the field have been advised to carefully analyze the internal mechanisms of forecasting models before interpreting the results generated by them.</div></div><div><h3>Objective:</h3><div>The review aims to summarize and synthesize studies that utilized Data Preprocessing techniques for Machine Learning-based code smell detection. And also, to investigate the relationship between Data Preprocessing and more advanced Machine Learning techniques, i.e., Ensemble Methods, Deep Learning, and Transfer Learning.</div></div><div><h3>Method:</h3><div>To obtain insights into Data Preprocessing for Machine Learning-based code smell detection solutions, we employed a systematic approach, identifying and analyzing 69 studies published up to November 2023.</div></div><div><h3>Results:</h3><div>In Data Preprocessing, Data Balancing techniques, Feature Selection techniques, and Filtering emerged as prominent strategies. SMOTE was the most frequently used Data Balancing technique, while Autoencoder, Chi-square, Gain Ratio, Information Gain, PCA, and CFS were notable choices for Feature Selection. Tokenization and Syntax Trees were commonly paired with Deep Learning or Transfer Learning methods. Normalization and Standardization were implemented for Data Scaling. 
Regarding Machine Learning techniques used with Data Preprocessing, 46% of the combinations occurred with at least one Ensemble Method. Deep Learning was employed in 37% of cases. Data Balancing techniques combined with Deep Learning (32%) or Ensemble Methods (19%) were used most.</div></div><div><h3>Conclusion:</h3><div>The findings of this SLR are an integrated and comprehensive source of information regarding data preparation practices, challenges, and solutions for Machine Learning-based code smell detection, emphasizing the continuous endeavor towards more resilient, contextually sensitive, and developer-informed strategies within this dynamic field.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"184 ","pages":"Article 107752"},"PeriodicalIF":3.8,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143879411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A systematic literature review on task recommendation systems for crowdsourced software engineering","authors":"Shashiwadana Nirmani , Mojtaba Shahin , Hourieh Khalajzadeh , Xiao Liu","doi":"10.1016/j.infsof.2025.107753","DOIUrl":"10.1016/j.infsof.2025.107753","url":null,"abstract":"<div><h3>Context:</h3><div>Crowdsourced Software Engineering (CSE) offers outsourcing work to software practitioners by leveraging a global online workforce. However, these software practitioners struggle to identify suitable tasks due to the variety of options available. Hence, there have been a growing number of studies on introducing recommendation systems to recommend CSE tasks to software practitioners.</div></div><div><h3>Objective:</h3><div>The goal of this study is to analyze the existing CSE task recommendation systems, investigating their extracted data, recommendation methods, key advantages and limitations, recommended task types, the use of human factors in recommendations, popular platforms, and features used to make recommendations.</div></div><div><h3>Methods:</h3><div>This SLR was conducted according to the Kitchenham and Charters’ guidelines. We used manual and automatic search strategies without putting any time limitation for searching the relevant papers.</div></div><div><h3>Results:</h3><div>We selected 65 primary studies for data extraction, analysis, and synthesis based on our predefined inclusion and exclusion criteria. Based on our data analysis results, we classified the extracted information into four categories according to the data acquisition sources: Software Practitioner’s Profile, Task or Project, Previous Contributions, and Direct Data Collection. We also organized the proposed recommendation systems into a taxonomy and identified key advantages, such as increased performance, accuracy, and optimized solutions. 
In addition, we identified the limitations of these systems, such as inadequate or biased recommendations and lack of generalizability. Our results revealed that human factors play a major role in CSE task recommendation. Further, we identified five popular recommended task types, popular platforms, and the features used in task recommendation. We also provided recommendations for future research directions.</div></div><div><h3>Conclusion:</h3><div>This SLR provides insights into current trends, gaps, and future research directions in CSE task recommendation systems, such as the need for comprehensive evaluation, standardized evaluation metrics, and benchmarking in future studies, and for transferring knowledge from other platforms to address the cold-start problem.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"184 ","pages":"Article 107753"},"PeriodicalIF":3.8,"publicationDate":"2025-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143886889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ricardo C. Belo, Marcelo S. Pimenta, Tarciso T. Salvador, Rafael H. Petry, Mara Abel
{"title":"Fundamental requirements of Digital Twins for production system in Oil and Gas Industry: A systematic literature review","authors":"Ricardo C. Belo, Marcelo S. Pimenta, Tarciso T. Salvador, Rafael H. Petry, Mara Abel","doi":"10.1016/j.infsof.2025.107742","DOIUrl":"10.1016/j.infsof.2025.107742","url":null,"abstract":"<div><h3>Context:</h3><div>The oil and gas industry is adopting Digital Twins as a significant step in a continuous digital transformation. A Digital Twin can provide intelligent support to main activities related directly or indirectly to oil and gas production, like operations monitoring, process optimization, failure prediction, simulation of what-if scenarios, and safety improvement.</div></div><div><h3>Situation:</h3><div>Specifications of requirements of a Digital Twin (DT) in the oil and gas domain found in the literature are usually presented informally, utilizing natural and often ambiguous language. Most of the requirements need to be extracted from descriptions of DT characteristics and functionality presented in articles.</div></div><div><h3>Objective:</h3><div>This systematic literature review aims to summarize the existing evidence concerning the requirements of Digital Twins tailored explicitly for oil and gas production systems. By thoroughly analyzing published literature, the study seeks to uncover the requirements, properties, and constraints essential for the successful implementation of Digital Twins in this domain.</div></div><div><h3>Method:</h3><div>Through a systematic literature review, the study focused on rigorously identifying common functionality, ubiquitous characteristics, and some emerging trends related to Digital Twin requirements in oil and gas production systems.</div></div><div><h3>Results:</h3><div>From the initial 939 articles, the review selected 94 relevant studies, focusing on described requirements and on application-specific features of Digital Twins. 
Among the selected papers, 28 were analyzed and reviewed, focusing on specific requirements for Digital Twin for production systems within the industry, shedding light on 17 functional and 7 non-functional requirements common to many DT specifications and implementations.</div></div><div><h3>Conclusion:</h3><div>Our findings underscore the importance of comprehensively understanding and outlining the essential requirements for Digital Twins within the intricate landscape of production systems in the industry. By elucidating key features and properties of DT, this study contributes significantly to enhancing the efficacy and implementation of new Digital Twins, or the evaluation of existing Digital Twins.</div><div>As a result, we have identified some important requirements, specifically in the O&G domain. We analyzed some issues related to the software needs of DTs in the O&G domain, highlighting which are the requirements of a DT usually specified or informally described. This study allows us to identify primary studies in both DT for O&G and Requirements Engineering (RE) fields. Even though the requirements described here have been collected from DT works in the O&G domain, many of these requirements are also applicable to other domains, like many areas of engineering and manufacturing. Finally, it aims to offer a clear understanding of ","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"184 ","pages":"Article 107742"},"PeriodicalIF":3.8,"publicationDate":"2025-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143855884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}