{"title":"Understanding the privacy-realisticness dilemma of the metaverse","authors":"Xiaolu Zhang, Tahmid Rafi, Yuejun Guan, Shuqing Li, Michael R. Lyu","doi":"10.1007/s10515-025-00516-6","DOIUrl":"10.1007/s10515-025-00516-6","url":null,"abstract":"<div><p>The Metaverse is a form of next-generation human–computer interaction and social networking based on virtual and augmented reality. Both the research and industry communities have invested heavily in this area to develop useful applications and enhance user experience. Meanwhile, the expanded human–computer interface that enables the immersive experience in the Metaverse will also inevitably expand the surface for potential privacy leaks. This dilemma between immersive user experience and higher privacy risk has not been well studied, and it is unclear how different users make decisions when facing it. In this work, we systematically studied this dilemma across different usage scenarios of the Metaverse and surveyed 177 users to understand the factors that may affect their decision making. We found that user preferences regarding immersive experience and privacy protection can differ greatly across usage scenarios, and we expect our results to provide insights and guidance for the design of privacy protection mechanisms in Metaverse platforms and applications.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"32 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144135273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Measuring the impact of predictive models on the software project: A cost, service time, and risk evaluation of a metric-based defect severity prediction model","authors":"Umamaheswara Sharma B, Ravichandra Sadam","doi":"10.1007/s10515-025-00519-3","DOIUrl":"10.1007/s10515-025-00519-3","url":null,"abstract":"<div><p>In a critical software system, testers have to spend an enormous amount of time and effort maintaining the software due to the continuous occurrence of defects. To reduce this time and effort, prior works in the literature are limited to using documented defect reports to automatically predict the severity of defective software modules. In contrast, in this work we propose a metric-based software defect severity prediction (SDSP) model built using a decision-tree-incorporated self-training semi-supervised learning approach to classify the severity of defective software modules. Empirical analysis of the proposed model on the AEEEM datasets supports its use, as it successfully assigns suitable severity class labels to the unlabelled modules. On the other hand, while numerous research studies have addressed the methodological aspects of SDSP models, the problem of evaluating a developed prediction model with measures suited to project objectives remains unaddressed. To this end, we propose the risk factor, per cent of the saved budget, loss in the saved budget, per cent of remaining edits, remaining service time, and gratuitous service time to interpret the predictions in terms of project objectives. Empirical analysis of the proposed approach shows the benefit of using the proposed measures in addition to the traditional measures.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"32 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144084984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
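The self-training scheme described in the abstract above can be sketched generically: a decision tree is fitted on the labelled modules, then iteratively pseudo-labels the unlabelled modules it classifies with high confidence. This is a minimal illustration using scikit-learn, not the authors' SDSP model; the metric features and severity labels are synthetic stand-ins for the AEEEM data.

```python
# Hypothetical sketch of decision-tree self-training for defect severity.
# Features and labels are synthetic, not the AEEEM datasets.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 5))                 # 200 modules, 5 code metrics
y = (X[:, 0] + X[:, 1] > 1).astype(int)  # 0 = low, 1 = high severity
y_partial = y.copy()
y_partial[50:] = -1                      # -1 marks unlabelled modules

model = SelfTrainingClassifier(DecisionTreeClassifier(max_depth=3),
                               threshold=0.75)
model.fit(X, y_partial)   # iteratively pseudo-labels confident modules
print(model.predict(X[:5]))
```

The `threshold` parameter controls how confident the tree must be before an unlabelled module is pseudo-labelled and added to the training set.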
{"title":"The impact of feature selection and feature reduction techniques for code smell detection: A comprehensive empirical study","authors":"Zexian Zhang, Lin Zhu, Shuang Yin, Wenhua Hu, Shan Gao, Haoxuan Chen, Fuyang Li","doi":"10.1007/s10515-025-00524-6","DOIUrl":"10.1007/s10515-025-00524-6","url":null,"abstract":"<div><p>Code smell detection using machine/deep learning methods aims to classify code instances as smelly or non-smelly based on extracted features. Accurate detection relies on optimizing feature sets by focusing on relevant features while discarding those that are redundant or irrelevant. However, prior studies on feature selection and reduction techniques for code smell detection have yielded inconsistent results, possibly due to limited exploration of the available techniques. To address this gap, we comprehensively analyze 33 feature selection and 6 feature reduction techniques across seven classification models and six code smell datasets, applying the Scott-Knott effect size difference test to compare performance and McNemar’s test to assess prediction diversity. The results show that (1) not all feature selection and reduction techniques significantly improve detection performance; (2) feature extraction techniques generally perform worse than feature selection techniques; (3) probabilistic significance is recommended as a “generic” feature selection technique due to its higher consistency in identifying smelly instances; and (4) the high-frequency features selected by the top feature selection techniques vary by dataset, highlighting their specific relevance for identifying the corresponding code smells. Based on these findings, we provide implications for further code smell detection research.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"32 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144073611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
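McNemar's test, which the study above uses to assess prediction diversity, compares two classifiers on the same instances through their discordant predictions: cases where one detector is right and the other is wrong. A minimal exact-test sketch (the counts are illustrative, not the paper's data):

```python
# Exact (binomial) McNemar's test on the discordant cells of a 2x2
# contingency table of two code-smell detectors' predictions.
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact p-value; b and c are the discordant counts
    (detector A right / B wrong, and vice versa)."""
    n = b + c
    k = min(b, c)
    # under H0, each discordant outcome is equally likely (p = 0.5)
    p = sum(comb(n, i) for i in range(0, k + 1)) / 2 ** n
    return min(1.0, 2 * p)

# e.g. detectors disagree on 15 instances: A correct on 3, B correct on 12
print(round(mcnemar_exact(3, 12), 4))  # → 0.0352
```

A p-value below the chosen significance level indicates the two detectors make genuinely different predictions, not just different aggregate scores.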
{"title":"Structural contrastive learning based automatic bug triaging","authors":"Yi Tao, Jie Dai, Lingna Ma, Zhenhui Ren, Fei Wang","doi":"10.1007/s10515-025-00517-5","DOIUrl":"10.1007/s10515-025-00517-5","url":null,"abstract":"<div><p>Bug triaging is crucial for software maintenance, as it matches developers with bug reports they are most qualified to handle. This task has gained importance with the growth of the open-source community. Traditionally, methods have emphasized semantic classification of bug reports, but recent approaches focus on the associations between bugs and developers. Leveraging latent patterns from bug-fixing records can enhance triaging predictions; however, the limited availability of these records presents a significant challenge. This scarcity highlights a broader issue in supervised learning: the inadequacy of labeled data and the underutilization of unlabeled data. To address these limitations, we propose a novel framework named SCL-BT (Structural Contrastive Learning-based Bug Triaging). This framework improves the utilization of labeled heterogeneous associations through edge perturbation and leverages unlabeled homogeneous associations via hypergraph sampling. These processes are integrated with a graph convolutional network backbone to enhance the prediction of associations and, consequently, bug triaging accuracy. Experimental results demonstrate that SCL-BT significantly outperforms existing models on public datasets. Specifically, on the Google Chromium dataset, SCL-BT surpasses the GRCNN method by 18.64% in terms of the Top-9 Hit Ratio metric. SCL-BT offers valuable insights for research on automatic bug triaging.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"32 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144073610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
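The Top-k Hit Ratio metric reported above counts a bug as a hit when its actual fixer appears among a model's top k recommended developers. A minimal sketch, with made-up bug IDs and developer names:

```python
# Top-k Hit Ratio for bug triaging: fraction of bugs whose true fixer
# appears in the model's top-k ranked developer list. Data is synthetic.
def top_k_hit_ratio(rankings, truth, k):
    hits = sum(1 for bug, devs in rankings.items() if truth[bug] in devs[:k])
    return hits / len(rankings)

rankings = {
    "bug1": ["alice", "bob", "carol"],   # model's ranked recommendations
    "bug2": ["dave", "erin", "bob"],
    "bug3": ["carol", "alice", "dave"],
}
truth = {"bug1": "bob", "bug2": "frank", "bug3": "carol"}  # actual fixers
print(top_k_hit_ratio(rankings, truth, k=2))  # → 0.6666666666666666
```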
{"title":"An empirical study of test case prioritization on the Linux Kernel","authors":"Haichi Wang, Ruiguo Yu, Dong Wang, Yiheng Du, Yingquan Zhao, Junjie Chen, Zan Wang","doi":"10.1007/s10515-025-00522-8","DOIUrl":"10.1007/s10515-025-00522-8","url":null,"abstract":"<div><p>The Linux kernel is a complex and constantly evolving system, where each code change can impact different components of the system. Regression testing ensures that new changes do not affect existing functionality or introduce new defects. However, due to the complexity of the Linux kernel, maintenance remains challenging. While practices like Continuous Integration (CI) facilitate rapid commits through automated regression testing, each CI run still incurs substantial costs due to the extensive number of test cases. Traditional software testing employs test case prioritization (TCP) techniques to order test cases, thus enabling the early detection of defects. Due to the unique characteristics of the Linux kernel, it remains unclear whether existing TCP techniques are suitable for its regression testing. In this paper, we present the first empirical study comparing various TCP techniques in the Linux kernel context. Specifically, we examined a total of 17 TCP techniques, including similarity-based, information-retrieval-based, and coverage-based techniques. The experimental results demonstrate that: (1) similarity-based TCP techniques perform best on the Linux kernel, achieving a mean APFD (Average Percentage of Faults Detected) value of 0.7583 while requiring significantly less time; (2) the majority of TCP techniques show relatively stable performance across multiple commits, with similarity-based TCP techniques being the most stable, showing maximum decreases of 3.03% and 3.92% in mean and median APFD values, respectively; (3) more than half of the studied techniques are significantly affected by flaky tests, with mean and median APFD values dropping by 29.9% to 63.5%. This work takes a first look at the adoption of TCP techniques in the Linux kernel, confirming their potential for effective and efficient prioritization.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"32 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143938562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
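APFD, the metric used throughout the study above, rewards orderings that detect faults early: for n tests and m faults, APFD = 1 − (TF₁ + … + TFₘ)/(nm) + 1/(2n), where TFᵢ is the position of the first test that reveals fault i. A small sketch with a synthetic test ordering and fault matrix:

```python
# APFD (Average Percentage of Faults Detected) for a prioritized test
# ordering. The ordering and fault-detection matrix are synthetic.
def apfd(order, faults_detected_by):
    """order: prioritized list of test ids;
    faults_detected_by: dict mapping fault id -> set of detecting test ids."""
    n = len(order)
    m = len(faults_detected_by)
    position = {t: i + 1 for i, t in enumerate(order)}  # 1-based positions
    tf = [min(position[t] for t in tests)               # first detection
          for tests in faults_detected_by.values()]
    return 1 - sum(tf) / (n * m) + 1 / (2 * n)

order = ["t3", "t1", "t4", "t2"]
faults = {"f1": {"t1"}, "f2": {"t2", "t4"}}
print(apfd(order, faults))  # f1 first detected at position 2, f2 at 3
```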
{"title":"iALBMAD: an improved agile-based layered approach for mobile app development","authors":"Anil Patidar, Ugrasen Suman","doi":"10.1007/s10515-025-00520-w","DOIUrl":"10.1007/s10515-025-00520-w","url":null,"abstract":"<div><p>The demand for improved efficiency, agility, and adaptability has led to rapid evolution in mobile app development (MAD). Agile approaches are recognized for being cooperative and iterative, but issues remain in handling the range of MAD requirements. Our objective is to blend the best practices of several prominent agile and non-agile approaches into an innovative and improved MAD approach, which we refer to as the improved Agile and Lean-based MAD Approach (iALBMAD); it improves upon our previous work, ALBMAD. Three aspects of improvement are exploited, concerning the discovery of suitable app attributes, the identification of best practices for various MAD activities, and the strengthening of requirement-gathering activities. To accomplish this, we first determined from the available literature the app attributes that affect MAD, the agile and non-agile best practices, and the roles of machine learning (ML) in MAD. We then equipped ALBMAD with these aspects according to their applicability and offered it to 18 MAD experts to obtain suggestions for its improvement. Considering the experts’ opinions, a three-layered approach, namely iALBMAD, was developed. In iALBMAD, automation and an iterative cycle are established to meet refined needs; these revisions may boost the quality of requirements and minimize time. Expert-validated best practices and app attributes suitable for each activity of iALBMAD are provided, which will assist less-skilled developers. Thirteen users evaluated the usability of apps created by six teams using three different approaches, and the results show that iALBMAD performs better than the other approaches. The suggested approach and findings will provide insightful information for individuals and MAD firms aiming to improve their MAD practices.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"32 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143930134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Knowledge-guided large language models are trustworthy API recommenders","authors":"Hongwei Wei, Xiaohong Su, Weining Zheng, Wenxing Tao, Hailong Yu, Yuqian Kuang","doi":"10.1007/s10515-025-00518-4","DOIUrl":"10.1007/s10515-025-00518-4","url":null,"abstract":"<div><p><b>A</b>pplication <b>P</b>rogramming <b>I</b>nterface (API) recommendation aims to recommend APIs for developers that meet their functional requirements, which can compensate for developers’ lack of API knowledge. In team-based software development, developers often need to implement functionality based on specific interface parameter types predefined by the software architect. Therefore, we propose <b>API</b> <b>R</b>ecommendation under specific <b>I</b>nterface <b>P</b>arameter Types (APIRIP), a special variant of the API recommendation task that requires the recommended APIs to conform to the interface parameter types. To realize APIRIP, we enlist the support of <b>L</b>arge <b>L</b>anguage <b>M</b>odels (LLMs). However, LLMs are susceptible to the phenomenon known as hallucination, wherein they may recommend untrustworthy API sequences. Instances of this include recommending fictitious APIs, APIs whose calling conditions cannot be satisfied, or API sequences that fail to conform to the interface parameter types. To mitigate these issues, we propose a <b>K</b>nowledge-<b>g</b>uided framework <b>for</b> <b>LLM</b>-based API Recommendation (KG4LLM), which incorporates knowledge-guided data augmentation and beam search. The core idea of KG4LLM is to leverage API knowledge derived from the <b>J</b>ava <b>D</b>evelopment <b>K</b>it (JDK) documentation to enhance the trustworthiness of LLM-generated recommendations. 
Experimental results demonstrate that KG4LLM improves the trustworthiness of the recommendations provided by LLMs and outperforms advanced LLMs in the APIRIP task.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"32 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143913818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
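The knowledge-guided beam search described above can be illustrated generically: candidate API sequences are expanded step by step, and expansions not found in a known-API vocabulary are pruned, mirroring how documentation-derived knowledge can constrain an LLM's output. This is a toy sketch, not the KG4LLM implementation; the scoring function and API vocabulary are invented.

```python
# Toy knowledge-guided beam search over API sequences. The vocabulary
# and scorer are hypothetical stand-ins for JDK knowledge and an LLM.
import heapq

KNOWN_APIS = {"List.add", "List.get", "Collections.sort", "Map.put"}

def score(seq):
    # stand-in for an LLM's sequence log-probability
    return -len(seq)

def beam_search(candidates_per_step, width=2):
    beams = [[]]
    for step_candidates in candidates_per_step:
        expanded = [beam + [api]
                    for beam in beams
                    for api in step_candidates
                    if api in KNOWN_APIS]        # prune hallucinated APIs
        beams = heapq.nlargest(width, expanded, key=score)
    return beams

# "List.fakeAdd" is a hallucinated API and gets pruned at step 1
steps = [["List.add", "List.fakeAdd"], ["Collections.sort", "List.get"]]
print(beam_search(steps))
```

The pruning step is the "knowledge guidance": only sequences composed entirely of documented APIs survive, regardless of how the model scores them.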
{"title":"A comparative study between android phone and TV apps","authors":"Yonghui Liu, Xiao Chen, Yue Liu, Pingfan Kong, Tegawendé F. Bissyandé, Jacques Klein, Xiaoyu Sun, Li Li, Chunyang Chen, John Grundy","doi":"10.1007/s10515-025-00514-8","DOIUrl":"10.1007/s10515-025-00514-8","url":null,"abstract":"<div><p>Smart TVs have surged in popularity, leading developers to create TV versions of mobile apps. Understanding the relationship between TV and mobile apps is key to building consistent, secure, and optimized cross-platform experiences while addressing TV-specific SDK challenges. Despite extensive research on mobile apps, TV apps have received little attention, leaving the relationship between phone and TV apps unexplored. Our study addresses this gap by compiling an extensive collection of 3445 Android phone/TV app pairs from the Google Play Store and launching the first comparative analysis of its kind. We examined these pairs across multiple dimensions, including non-code elements, code structure, security, and privacy aspects. Our findings reveal that although these app pairs are identified by the same package names, they deploy different artifacts with varying functionality across platforms. TV apps generally exhibit less complexity in terms of hardware-dependent features and code volume but share significant resource files and components with their phone versions. Interestingly, some categories of TV apps show similar or even more severe security and privacy concerns compared to their mobile counterparts. This research aims to assist developers and researchers in understanding phone-TV app relationships, highlight domain-specific concerns necessitating TV-specific tools, and provide insights for migrating apps from mobile to TV platforms.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"32 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143904672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving prompt tuning-based software vulnerability assessment by fusing source code and vulnerability description","authors":"Jiyu Wang, Xiang Chen, Wenlong Pei, Shaoyu Yang","doi":"10.1007/s10515-025-00525-5","DOIUrl":"10.1007/s10515-025-00525-5","url":null,"abstract":"<div><p>To effectively allocate resources for vulnerability remediation, it is crucial to prioritize vulnerability fixes based on vulnerability severity. With the increasing number of vulnerabilities in recent years, there is an urgent need for automated methods for software vulnerability assessment (SVA). Most previous SVA studies rely mainly on traditional machine learning methods. Recently, fine-tuning pre-trained language models has emerged as an intuitive method for improving performance. However, there is a gap between pre-training and fine-tuning, and the resulting performance depends heavily on the quality of the downstream task’s dataset. Therefore, we propose a prompt tuning-based method, PT-SVA. Unlike the fine-tuning paradigm, the prompt-tuning paradigm adds prompts to make the training process similar to pre-training, thereby better adapting to downstream tasks. Moreover, previous research predicted severity automatically by analyzing only either the vulnerability descriptions or the source code of the vulnerability; we therefore consider both types of vulnerability information when designing hybrid prompts (i.e., a combination of hard and soft prompts). To evaluate PT-SVA, we construct an SVA dataset based on the CVSS V3 standard, whereas previous SVA studies only consider the CVSS V2 standard. Experimental results show that PT-SVA outperforms ten state-of-the-art SVA baselines, e.g., by 13.7% to 42.1% in terms of MCC. Finally, our ablation experiments confirm the effectiveness of PT-SVA’s design choices: replacing fine-tuning with prompt tuning, incorporating both types of vulnerability information, and adopting hybrid prompts. These results indicate that prompt tuning-based SVA is a promising direction that merits follow-up studies.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"32 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143900652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
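MCC (Matthews Correlation Coefficient), the metric cited above, summarizes a binary confusion matrix in a single value in [−1, 1], and is robust to class imbalance. A minimal sketch with synthetic counts:

```python
# Matthews Correlation Coefficient from binary confusion-matrix counts.
# The counts below are synthetic, not from the PT-SVA experiments.
from math import sqrt

def mcc(tp, tn, fp, fn):
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

print(round(mcc(tp=40, tn=45, fp=5, fn=10), 4))  # → 0.7035
```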
{"title":"A systematic mapping study on automated negotiation for autonomous intelligent systems","authors":"Mashal Afzal Memon, Gian Luca Scoccia, Marco Autili","doi":"10.1007/s10515-025-00515-7","DOIUrl":"10.1007/s10515-025-00515-7","url":null,"abstract":"<div><p>Autonomous intelligent systems are artificial intelligence software entities that can act on their own and take decisions without any human intervention. The communication between such systems to reach an agreement for problem-solving is known as automated negotiation. This study aims to systematically identify and analyze the literature on automated negotiation from four distinct viewpoints: (1) the existing literature on negotiation with a focus on automation, (2) the specific purpose and application domain of the studies published in the domain of automated negotiation, (3) the inputs and techniques used to model the negotiation process, and (4) the limitations of the state of the art and future research directions. For this purpose, we performed a systematic mapping study (SMS) starting from 73,760 potentially relevant studies belonging to 24 conference proceedings and 22 journal issues. Through a precise selection procedure, we identified 50 primary studies, published from the year 2000 onward, which we analyzed by applying a classification framework. As a result, we provide: (a) a classification framework for analyzing the automated negotiation literature according to several parameters (e.g., focus of the paper, inputs required to carry on the negotiation process, techniques applied, and type of agents involved in the negotiation), (b) an up-to-date map of the literature specifying the purpose and application domain of each study, (c) a list of techniques used to automate the negotiation process and the inputs required to carry it out, and (d) a discussion of promising challenges and their consequences for future research. We also provide a replication package to help researchers replicate and verify our systematic mapping study. The results and findings will benefit researchers and practitioners in identifying research gaps and conducting further research to develop dedicated solutions for automated negotiation.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"32 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10515-025-00515-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}