IEEE Transactions on Software Engineering — Latest Articles

Malo in the Code Jungle: Explainable Fault Localization for Decentralized Applications
IF 6.5 · CAS Q1 · Computer Science
IEEE Transactions on Software Engineering Pub Date: 2025-06-13 DOI: 10.1109/TSE.2025.3578816
Hui Zhang;Jiajing Wu;Zhiying Wu;Zhe Chen;Dan Lin;Jiachi Chen;Yuren Zhou;Zibin Zheng
Decentralized applications (DApps) have long been sitting ducks for hackers due to their valuable cryptocurrency assets, exposing them to various security risks. When a DApp is attacked, promptly identifying faults is crucial to minimizing financial losses and ensuring effective fault repair. However, existing fault localization methods, which mostly rely on code coverage, often fall short for DApps, particularly when dealing with only one fault case. Furthermore, according to a prior survey, most developers expect fault localization tools to provide reasonable explanations. In this paper, we present Malo, a method for DApp-specific explainable fault localization. It identifies fault functions through suspicious token transfer-guided analysis, and then employs Large Language Models (LLMs) to generate explanations for these identified fault functions. Specifically, Malo examines function call traces and source code of fault cases to acquire internal knowledge, and also retrieves relevant project documents from the Web to obtain external knowledge. By integrating internal and external knowledge, Malo generates reasonable explanations for faults in DApps. Our evaluation on a dataset of 68 real-world DApp faults demonstrates that Malo locates 62% of faults within the Top-5, 9% higher than the state-of-the-art method. The experimental results also show an alignment accuracy of 71% between the explanations generated by Malo and the ground truth. In addition, we conduct a user study, which confirms that explanations generated by Malo can aid developers in comprehending the root cause of faults. Our code and dataset are available online: https://github.com/SodalimeZero/Malo_Code.git

Vol. 51, no. 7, pp. 2197–2210.
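The suspicious token transfer-guided analysis is only described at a high level in the abstract. A minimal sketch of the general idea, ranking candidate functions by how often they appear in flagged transfers (all names and the trace format are hypothetical, not from the paper):

```python
# Hypothetical sketch: rank DApp functions by their involvement in
# token transfers flagged as suspicious in a fault case's call trace.
from collections import Counter

def rank_fault_functions(call_trace, suspicious_transfers):
    """call_trace: list of (function_name, transfer_id) events.
    suspicious_transfers: set of transfer ids flagged as anomalous."""
    scores = Counter()
    for func, transfer in call_trace:
        if transfer in suspicious_transfers:
            scores[func] += 1
    # Higher score = more involvement in suspicious transfers.
    return [f for f, _ in scores.most_common()]

trace = [("swap", "t1"), ("withdraw", "t2"), ("withdraw", "t3"), ("mint", "t4")]
print(rank_fault_functions(trace, {"t2", "t3"}))  # ['withdraw']
```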
Cited by: 0
BLAZE: Cross-Language and Cross-Project Bug Localization via Dynamic Chunking and Hard Example Learning
IF 5.6 · CAS Q1 · Computer Science
IEEE Transactions on Software Engineering Pub Date: 2025-06-12 DOI: 10.1109/TSE.2025.3579574
Partha Chakraborty;Mahmoud Alfadel;Meiyappan Nagappan
Software bugs require developers to expend significant effort to identify and resolve them, often consuming about one-third of their time. Bug localization, the process of pinpointing the exact source code files that need modification, is crucial in reducing this effort. Existing bug localization tools, typically reliant on deep learning techniques, face limitations in both cross-project applicability and multi-language environments. Recent advancements with Large Language Models (LLMs) offer detailed representations for bug localization that may help to overcome such limitations. However, these models are known to encounter challenges with 1) limited context windows and 2) mapping accuracy. To address these challenges, we propose BLAZE, an approach that employs dynamic chunking and hard example learning. First, BLAZE dynamically segments source code to minimize continuity loss. Then, BLAZE fine-tunes a GPT-based model using complex bug reports in order to enhance cross-project and cross-language bug localization. To support the capability of BLAZE, we create the BeetleBox dataset, which comprises 23,782 bugs from 29 large and thriving open-source projects across five programming languages (Java, C++, Python, Go, and JavaScript). Our evaluation of BLAZE on three benchmark datasets (BeetleBox, SWE-Bench, and the dataset of Ye et al.) demonstrates substantial improvements compared to six state-of-the-art baselines. Specifically, BLAZE achieves up to a 120% increase in Top-1 accuracy, 144% in Mean Average Precision (MAP), and 100% in Mean Reciprocal Rank (MRR). Furthermore, an extensive ablation study confirms the contributions of our pipeline components to the overall performance enhancement.

Vol. 51, no. 8, pp. 2254–2267.
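The abstract does not spell out how dynamic chunking works; one plausible reading is that code is split only at natural boundaries so no chunk cuts through a coherent block. A minimal sketch under that assumption (the function name, the blank-line boundary heuristic, and the line budget are all hypothetical):

```python
# Hypothetical sketch of dynamic chunking: split a source file at natural
# boundaries (blank-line-separated blocks) so each chunk fits a model's
# context budget, rather than cutting at fixed offsets mid-function.
def dynamic_chunks(source, max_lines=50):
    blocks = [b for b in source.split("\n\n") if b.strip()]
    chunks, current = [], []
    for block in blocks:
        lines = block.count("\n") + 1
        used = sum(b.count("\n") + 1 for b in current)
        if current and used + lines > max_lines:
            chunks.append("\n\n".join(current))  # flush the full chunk
            current = []
        current.append(block)
    if current:
        chunks.append("\n\n".join(current))
    return chunks

print(dynamic_chunks("a\nb\n\nc\n\nd", max_lines=3))
```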
Cited by: 0
MBL-CPDP: A Multi-Objective Bilevel Method for Cross-Project Defect Prediction
IF 5.6 · CAS Q1 · Computer Science
IEEE Transactions on Software Engineering Pub Date: 2025-06-10 DOI: 10.1109/TSE.2025.3577808
Jiaxin Chen;Jinliang Ding;Kay Chen Tan;Jiancheng Qian;Ke Li
Cross-project defect prediction (CPDP) leverages machine learning (ML) techniques to proactively identify software defects, especially where project-specific data is scarce. However, existing CPDP approaches suffer from three critical limitations: ineffective exploration of high-dimensional parameter spaces, poor adaptability across diverse projects with heterogeneous data distributions, and inadequate handling of feature redundancy and distribution discrepancies between source and target projects. To address these challenges, we formulate CPDP as a multi-objective bilevel optimization (MBLO) method, dubbed MBL-CPDP. Our approach comprises two nested problems: the upper-level problem, a multi-objective combinatorial optimization problem, enhances robustness by optimizing ML pipelines that integrate feature selection, transfer learning, and classification techniques, while the lower-level problem fine-tunes their hyperparameters. Unlike traditional methods that employ fragmented optimization strategies or single-objective approaches that introduce bias, MBL-CPDP provides a holistic, end-to-end optimization framework. Additionally, we propose an ensemble learning method to better capture cross-project distribution differences and improve generalization across diverse datasets. An MBLO algorithm is then presented to effectively solve the formulated MBLO problem. To evaluate MBL-CPDP's performance, we compare it with five automated ML tools and 50 CPDP techniques across 20 projects. Extensive empirical results show that MBL-CPDP outperforms the comparison methods, demonstrating its superior adaptability and comprehensive performance evaluation capability.

Vol. 51, no. 8, pp. 2305–2328.
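The nested structure of a bilevel formulation can be illustrated with a toy example: the upper level chooses a pipeline, the lower level tunes hyperparameters for that fixed choice. This is only a structural sketch with exhaustive search and a made-up score function; the actual MBL-CPDP uses multi-objective evolutionary optimization:

```python
# Structural sketch of bilevel optimization (not the paper's algorithm):
# upper level picks a pipeline, lower level tunes its hyperparameters.
from itertools import product

def lower_level(pipeline, score):
    # Tune hyperparameters (here: a toy grid of (C, k) pairs) for a
    # fixed pipeline choice, returning the best setting and its score.
    best = max(product([0.1, 1.0, 10.0], [3, 5, 7]),
               key=lambda hp: score(pipeline, hp))
    return best, score(pipeline, best)

def upper_level(pipelines, score):
    # For each candidate pipeline, solve the lower-level problem,
    # then keep the pipeline whose tuned score is best.
    results = {p: lower_level(p, score) for p in pipelines}
    return max(results.items(), key=lambda kv: kv[1][1])

# Toy score standing in for cross-project prediction quality.
toy = lambda p, hp: {"pca+svm": 0.6, "mi+rf": 0.7}[p] + 0.01 * hp[1] - 0.05 * hp[0]
print(upper_level(["pca+svm", "mi+rf"], toy))
```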
Cited by: 0
GNNContext: GNN-based Code Context Prediction for Programming Tasks
IF 5.6 · CAS Q1 · Computer Science
IEEE Transactions on Software Engineering Pub Date: 2025-06-10 DOI: 10.1109/TSE.2025.3578390
Xiaoye Zheng;Zhiyuan Wan;Shun Liu;Kaiwen Yang;David Lo;Xiaohu Yang
A code context model comprises source code elements and their relations relevant to a programming task. The capture and use of code context models in software tools can benefit software development practices, such as code navigation and search. Prior research has explored approaches that leverage either the structural information of code or interaction histories of developers with integrated development environments to automate the construction of code context models. However, these approaches primarily capture shallow syntactic and lexical features of code elements, with limited ability to capture contextual and structural dependencies among neighboring code elements. In this paper, we propose GNNContext, a novel approach for predicting code context models based on Graph Neural Networks. Our approach leverages code representation learning models to capture both the syntactic and semantic features of code elements, while employing Graph Neural Networks to learn the structural and contextual information among neighboring code elements in the code context models. To evaluate the effectiveness of our approach, we apply it to a dataset comprising 3,879 code context models that we derive from three Eclipse open-source projects. The evaluation results demonstrate that our proposed approach GNNContext can significantly outperform the state-of-the-art baseline for code context prediction, achieving average improvements of 62.79%, 56.60%, 73.50% and 81.89% in mean reciprocal rank, Top-1, Top-3, and Top-5 recall rates, respectively, across predictions of varying steps. Moreover, our approach demonstrates robust performance in a cross-project evaluation setting.

Vol. 51, no. 8, pp. 2268–2284.
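The metrics reported here (mean reciprocal rank and Top-k recall) follow standard definitions, which can be computed as below. This is generic evaluation code, not code from the paper:

```python
# Standard ranking metrics: mean reciprocal rank and Top-k recall.
# Each query has a ranked prediction list and a set of ground-truth elements.
def mrr(ranked_lists, truths):
    total = 0.0
    for ranked, truth in zip(ranked_lists, truths):
        # Reciprocal rank of the first correct prediction, 0 if none found.
        rank = next((i + 1 for i, x in enumerate(ranked) if x in truth), None)
        total += 1.0 / rank if rank else 0.0
    return total / len(ranked_lists)

def topk_recall(ranked_lists, truths, k):
    # Fraction of ground-truth elements recovered in the top k, averaged.
    hits = sum(len(set(r[:k]) & t) / len(t) for r, t in zip(ranked_lists, truths))
    return hits / len(ranked_lists)

ranked = [["a", "b", "c"], ["x", "y", "z"]]
truth = [{"b"}, {"z"}]
print(mrr(ranked, truth))             # (1/2 + 1/3) / 2
print(topk_recall(ranked, truth, 1))  # neither top-1 prediction is correct
```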
Cited by: 0
Why Do Machine Learning Notebooks Crash? An Empirical Study on Public Python Jupyter Notebooks
IF 6.5 · CAS Q1 · Computer Science
IEEE Transactions on Software Engineering Pub Date: 2025-06-03 DOI: 10.1109/TSE.2025.3574500
Yiran Wang;Willem Meijer;José Antonio Hernández López;Ulf Nilsson;Dániel Varró
Jupyter notebooks have become central in data science, integrating code, text and output in a flexible environment. With the rise of machine learning (ML), notebooks are increasingly used for prototyping and data analysis. However, due to their dependence on complex ML libraries and the flexible notebook semantics that allow cells to be run in any order, notebooks are susceptible to software bugs that may lead to program crashes. This paper presents a comprehensive empirical study focusing on crashes in publicly available Python ML notebooks. We collect 64,031 notebooks containing 92,542 crashes from GitHub and Kaggle, and manually analyze a sample of 746 crashes across various aspects, including crash types and root causes. Our analysis identifies unique ML-specific crash types, such as tensor shape mismatches and dataset value errors that violate API constraints. Additionally, we highlight unique root causes tied to notebook semantics, including out-of-order execution and residual errors from previous cells, which have been largely overlooked in prior research. Furthermore, we identify the most error-prone ML libraries, and analyze crash distribution across ML pipeline stages. We find that over 40% of crashes stem from API misuse and notebook-specific issues. Crashes frequently occur when using ML libraries like TensorFlow/Keras and Torch. Additionally, over 70% of the crashes occur during data preparation, model training, and evaluation or prediction stages of the ML pipeline, while data visualization errors tend to be unique to ML notebooks.

Vol. 51, no. 7, pp. 2181–2196. Open access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11022755
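A tensor shape mismatch, one of the ML-specific crash types the study identifies, looks like this in practice. The snippet below is a minimal NumPy illustration of the failure mode, not an example from the paper's dataset (TensorFlow/Keras and Torch raise analogous errors):

```python
# Minimal illustration of a tensor shape mismatch crash: the inner
# dimensions of a matrix product disagree, so NumPy raises ValueError.
import numpy as np

weights = np.ones((3, 4))   # layer weights expect 3-dimensional input
batch = np.ones((2, 5))     # but the data batch has 5 features

try:
    batch @ weights         # (2, 5) x (3, 4): inner dimensions 5 != 3
except ValueError as e:
    print("crash:", e)
```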
Cited by: 0
Question Selection for Multimodal Code Search Synthesis Using Probabilistic Version Spaces
IF 6.5 · CAS Q1 · Computer Science
IEEE Transactions on Software Engineering Pub Date: 2025-04-29 DOI: 10.1109/TSE.2025.3565387
Jiarong Wu;Yanyan Jiang;Lili Wei;Congying Xu;Shing-Chi Cheung;Chang Xu
Searching the occurrences of specific code patterns (code search) is a common task in software engineering, and programming by example (PBE) techniques have been applied to ease customizing code patterns. However, previous PBE tools only synthesize programs meeting the input-output examples, which may not always align with the user intent. To bridge this gap, this paper proposes Excalibur, a multi-modal (example and natural language description) and interactive synthesizer for code search. Excalibur ensures that the generated programs are correct for the provided examples (soundness) and include the user-intended program (bounded completeness). Furthermore, Excalibur helps the user identify the user-intended program through question-answer interaction. To minimize the required interaction efforts, question selection is crucial. To improve question selection for code search, we propose probabilistic version spaces (ProbVS), in which the user-intended program's probability is high and others are low. ProbVS combines traditional version spaces for compactly representing extensive programs and large language models (on the user-provided natural language description) for adjusting programs' probabilities to align with users' intents. Extensive experiments on a benchmark of 44 tasks demonstrated the effectiveness of Excalibur and ProbVS and demystified how ProbVS affects probability distributions and how the configurable parameters affect ProbVS.

Vol. 51, no. 6, pp. 1724–1744.
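The role of program probabilities in question selection can be illustrated with a classic information-gain heuristic: ask the question whose answer splits the candidates' probability mass most evenly. This is a generic sketch of that idea, not Excalibur's actual selection strategy, and all names are hypothetical:

```python
# Hypothetical sketch of probability-aware question selection: among
# candidate questions (inputs shown to the user), ask the one that splits
# the candidate programs' probability mass closest to 50/50, which is
# the most informative yes/no split.
def select_question(programs, probs, questions):
    """programs: candidate programs as callables returning True/False.
    probs: their probabilities (summing to 1). questions: candidate inputs."""
    def balance(q):
        yes = sum(p for prog, p in zip(programs, probs) if prog(q))
        return abs(yes - 0.5)   # 0 means a perfect 50/50 split
    return min(questions, key=balance)

# Three candidate "programs" (threshold predicates) with prior probabilities.
progs = [lambda x: x > 0, lambda x: x > 2, lambda x: x > 5]
probs = [0.5, 0.3, 0.2]
print(select_question(progs, probs, [1, 3, 7]))  # 1 splits mass 0.5 / 0.5
```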
Cited by: 0
DeepVec: State-Vector Aware Test Case Selection for Enhancing Recurrent Neural Network
IF 6.5 · CAS Q1 · Computer Science
IEEE Transactions on Software Engineering Pub Date: 2025-04-28 DOI: 10.1109/TSE.2025.3565037
Zhonghao Jiang;Meng Yan;Li Huang;Weifeng Sun;Chao Liu;Song Sun;David Lo
Deep Neural Networks (DNN) have realized significant achievements across various application domains. There is no doubt that testing and enhancing a pre-trained DNN that has been deployed in an application scenario is crucial, because it can reduce the failures of the DNN. DNN-driven software testing and enhancement require large amounts of labeled data. The high cost and inefficiency caused by the large volume of data of manual labeling, and the time consumption of testing all cases in real scenarios are unacceptable. Therefore, test case selection technologies are proposed to reduce the time cost by selecting and only labeling representative test cases without compromising testing performance. Test case selection based on neuron coverage (NC) or uncertainty metrics has achieved significant success in Convolutional Neural Networks (CNN) testing. However, it is challenging to transfer these methods to Recurrent Neural Networks (RNN), which excel at text tasks, due to the mismatch in model output formats and the reliance on image-specific characteristics. Moreover, balancing the execution cost and performance of the algorithm is also indispensable. In this paper, we propose a state-vector aware test case selection method for RNN models, namely DeepVec, which reduces the cost of data labeling, saves computing resources, and balances execution cost against performance. DeepVec selects data using an uncertainty metric based on the norm of the output vector at each time step (i.e., state-vector), and a similarity metric based on the direction angle of the state-vector, because test cases with smaller state-vector norms often possess greater information entropy, and similar changes in state-vector direction angle indicate similar RNN internal states. These metrics can be calculated with just a single inference, which gives DeepVec strong bug detection and model improvement capabilities. We evaluate DeepVec on five popular datasets, containing images and texts, as well as three commonly used RNN classification models, and compare it with NC-based, uncertainty-based, and other black-box methods. Experimental results demonstrate that DeepVec achieves an average relative improvement of 12.5%-118.22% over baseline methods in selecting fault-revealing test cases, with time costs reduced to only 1% down to 0.01% of the baselines'. At the same time, we find that the absolute accuracy improvement after retraining outperforms baseline methods by 0.29%-24.01% when selecting 15% of the data to retrain.

Vol. 51, no. 6, pp. 1702–1723.
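The two metrics described (norm-based uncertainty, direction-angle similarity) can be sketched directly on a sequence of state vectors. The exact formulas below are an assumption for illustration, not the paper's definitions:

```python
# Hypothetical sketch of DeepVec-style metrics over a sequence of RNN
# state vectors (one row per time step): an uncertainty score that grows
# as the average norm shrinks, and the angles between successive states.
import numpy as np

def uncertainty(states):
    # Smaller average state-vector norm -> higher uncertainty score.
    norms = np.linalg.norm(states, axis=1)
    return 1.0 / (1.0 + norms.mean())

def direction_angles(states):
    # Angle between each pair of consecutive state vectors.
    a, b = states[:-1], states[1:]
    cos = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
    return np.arccos(np.clip(cos, -1.0, 1.0))

states = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
print(uncertainty(states))
print(direction_angles(states))  # two 45-degree (pi/4) turns
```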
Cited by: 0
LLMorpheus: Mutation Testing Using Large Language Models
IF 6.5 · CAS Q1 · Computer Science
IEEE Transactions on Software Engineering Pub Date: 2025-04-25 DOI: 10.1109/TSE.2025.3562025
Frank Tip;Jonathan Bell;Max Schäfer
In mutation testing, the quality of a test suite is evaluated by introducing faults into a program and determining whether the program's tests detect them. Most existing approaches for mutation testing involve the application of a fixed set of mutation operators, e.g., replacing a "+" with a "-", or removing a function's body. However, certain types of real-world bugs cannot easily be simulated by such approaches, limiting their effectiveness. This paper presents a technique for mutation testing where placeholders are introduced at designated locations in a program's source code and a Large Language Model (LLM) is prompted to suggest what they could be replaced with. The technique is implemented in LLMorpheus, a mutation testing tool for JavaScript, and evaluated on 13 subject packages, considering several variations on the prompting strategy, and using several LLMs. We find LLMorpheus to be capable of producing mutants that resemble existing bugs that cannot be produced by StrykerJS, a state-of-the-art mutation testing tool. Moreover, we report on the running time, cost, and number of mutants produced by LLMorpheus, demonstrating its practicality.

Vol. 51, no. 6, pp. 1645–1665.
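The placeholder step can be illustrated with plain string manipulation: mask an expression, then build a prompt asking the LLM for replacements. LLMorpheus itself targets JavaScript and its prompt wording differs; the function below and its prompt text are hypothetical:

```python
# Hypothetical sketch of the placeholder step: replace a source expression
# with <PLACEHOLDER> and build an LLM prompt asking for substitutes.
def make_mutation_prompt(source, start, end):
    original = source[start:end]
    masked = source[:start] + "<PLACEHOLDER>" + source[end:]
    return (f"Suggest expressions that could replace <PLACEHOLDER> "
            f"in the following code (the original was `{original}`):\n\n"
            f"{masked}")

code = "function area(r) { return 3.14 * r * r; }"
start = code.index("3.14")
print(make_mutation_prompt(code, start, start + len("3.14")))
```

Each LLM suggestion would then be substituted back in place of the placeholder to form a candidate mutant, which the test suite is run against.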
Cited by: 0
On the Applicability of Code Language Models to Scientific Computing Programs
IF 6.5 · CAS Q1 · Computer Science
IEEE Transactions on Software Engineering Pub Date: 2025-04-25 DOI: 10.1109/TSE.2025.3564599
Qianhui Zhao;Fang Liu;Xiao Long;Chengru Wu;Li Zhang
Scientific Computing Programming Languages (SCPLs), like MATLAB and R, are popular and widely used for computational mathematics. In recent years, pre-trained code language models (CLMs) have automated many code-related tasks, covering various general programming languages. SCPLs share many similarities with general programming languages, including similar syntactic structures and the semantics of identifiers. Despite the similarities, there exist many differences between them; for example, many numerical operations and dedicated libraries exist in SCPLs. However, there has been little comprehensive work analyzing CLMs' capabilities in the understanding and generation of pragmatic scientific computing programs. To this end, we investigate the applicability of code language models to SCPL analysis, especially focusing on real-world code in open-source repositories. We first create a benchmark that contains programs and documentation from three widely used scientific computing programming languages, then perform an adequate evaluation of existing advanced code language models on both code understanding and generation tasks using the new benchmark, and study the relations of different training strategies, model types, and model sizes to the performance on different tasks and languages. Evaluation results confirm that, compared to general programming languages, SCPLs are more challenging to understand, and especially to generate, but the use of code language models is nevertheless feasible, and the knowledge obtained from the general languages can be transferred to SCPL analysis. A deeper analysis reveals additional challenges in generating code that incorporates API calls relevant to computational mathematics. We believe that our findings can provide guidance on improving tooling and analyses for scientific programming languages, and also inspire and motivate researchers to improve the robustness of existing code language models.

Vol. 51, no. 6, pp. 1685–1701.
Cited by: 0
Testing CPS With Design Assumptions-Based Metamorphic Relations and Genetic Programming
IF 6.5 · CAS Q1 · Computer Science
IEEE Transactions on Software Engineering Pub Date: 2025-04-24 DOI: 10.1109/TSE.2025.3563121
Claudio Mandrioli;Seung Yeob Shin;Domenico Bianculli;Lionel Briand
Cyber-Physical Systems (CPSs) software is used to enforce desired behaviours on physical systems. To test the interaction between the CPS software and the system's physics, engineers provide traces of desired physical states and observe traces of the actual physical states. CPS requirements describe how closely the actual physical traces should track the desired traces. These requirements are typically defined for specific, simple input traces such as step or ramp sequences, and thus are not applicable to arbitrary inputs. This limits the availability of oracles for CPSs. Our recent work proposes an approach to testing CPSs using control-theoretical design assumptions instead of requirements. This approach circumvents the oracle problem by leveraging the control-theoretical guarantees that are provided when the design assumptions are satisfied. To address the test case generation and oracle problems, researchers have proposed metamorphic testing, which is based on the study of relations across tests, i.e., metamorphic relations (MRs). In this work, we define MRs based on the design assumptions and explore combinations of these MRs using genetic programming to generate CPS test cases. This enables the generation of CPS input traces with potentially arbitrary shapes, together with associated expected output traces. We use the deviation from the expected output traces to guide the generation of input traces that falsify the MRs. Our experiment results show that the MR-falsification provides engineers with new information, helping them identify passed and failed test cases. Furthermore, we show that the generation of traces that falsify the MRs is a non-trivial problem, which cannot be addressed with a random generation approach but is successfully addressed by our approach based on genetic search.

Vol. 51, no. 6, pp. 1666–1684. Open access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10976605
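A design-assumption-based MR can be illustrated on a toy closed loop: if the controller is assumed linear, scaling the desired trace by k must scale the actual trace by k, for any input shape. The first-order model and the specific relation below are illustrative stand-ins, not the MRs defined in the paper:

```python
# Hypothetical sketch of a design-assumption-based metamorphic relation:
# for a linear closed loop, scaling the desired trace by k should scale
# the actual trace by k. A first-order discrete model stands in for the CPS.
def simulate(setpoints, gain=0.5):
    state, out = 0.0, []
    for sp in setpoints:
        state += gain * (sp - state)   # simple proportional tracking step
        out.append(state)
    return out

def mr_scaling_holds(setpoints, k, tol=1e-9):
    # MR check: simulate the scaled input, compare against k times the
    # baseline output; a violation would flag a failed test case.
    scaled = simulate([k * sp for sp in setpoints])
    base = simulate(setpoints)
    return all(abs(s - k * b) <= tol for s, b in zip(scaled, base))

print(mr_scaling_holds([1.0, 1.0, 2.0, 0.5], 3.0))  # True for a linear loop
```

A search-based test generator would instead look for input traces (of arbitrary shape) on which such relations are violated, using the deviation as its fitness signal.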
Cited by: 0