Siyu Jiang, Zhenhang He, Yuwen Chen, Mingrong Zhang, Le Ma
{"title":"Mobile Application Online Cross-Project Just-in-Time Software Defect Prediction Framework","authors":"Siyu Jiang, Zhenhang He, Yuwen Chen, Mingrong Zhang, Le Ma","doi":"10.1145/3664607","DOIUrl":null,"url":null,"abstract":"<p>As mobile applications evolve rapidly, their fast iterative update nature leads to an increase in software defects. Just-In-Time Software Defect Prediction (JIT-SDP) offers immediate feedback on code changes. For new applications without historical data, researchers have proposed Cross-Project JIT-SDP (CP JIT-SDP). Existing CP JIT-SDP approaches are designed for offline scenarios where target data is available in advance. However, target data in real-world applications usually arrives online in a streaming manner, making online CP JIT-SDP face cross-project distribution differences and target project data concept drift challenges in online scenarios. These challenges often co-exist during application development, and their interactions cause model performance to degrade. To address these issues, we propose an online CP JIT-SDP framework called COTL. Specifically, COTL consists of two stages: offline and online. In offline stage, the cross-domain structure preserving projection algorithm is used to reduce the cross-project distribution differences. In online stage, target data arrives sequentially over time. By reducing the differences in marginal and conditional distributions between offline and online data for target project, concept drift is mitigated and classifier weights are updated online. Experimental results on 15 mobile application benchmark datasets show that COTL outperforms 13 benchmark methods on four performance metrics.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"151 1","pages":""},"PeriodicalIF":6.6000,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Software Engineering and Methodology","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3664607","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0
Abstract
As mobile applications evolve rapidly, their fast iterative update nature leads to an increase in software defects. Just-In-Time Software Defect Prediction (JIT-SDP) offers immediate feedback on code changes. For new applications without historical data, researchers have proposed Cross-Project JIT-SDP (CP JIT-SDP). Existing CP JIT-SDP approaches are designed for offline scenarios where target data is available in advance. However, target data in real-world applications usually arrives online in a streaming manner, making online CP JIT-SDP face cross-project distribution differences and target project data concept drift challenges in online scenarios. These challenges often co-exist during application development, and their interactions cause model performance to degrade. To address these issues, we propose an online CP JIT-SDP framework called COTL. Specifically, COTL consists of two stages: offline and online. In offline stage, the cross-domain structure preserving projection algorithm is used to reduce the cross-project distribution differences. In online stage, target data arrives sequentially over time. By reducing the differences in marginal and conditional distributions between offline and online data for target project, concept drift is mitigated and classifier weights are updated online. Experimental results on 15 mobile application benchmark datasets show that COTL outperforms 13 benchmark methods on four performance metrics.
期刊介绍:
Designing and building a large, complex software system is a tremendous challenge. ACM Transactions on Software Engineering and Methodology (TOSEM) publishes papers on all aspects of that challenge: specification, design, development and maintenance. It covers tools and methodologies, languages, data structures, and algorithms. TOSEM also reports on successful efforts, noting practical lessons that can be scaled and transferred to other projects, and often looks at applications of innovative technologies. The tone is scholarly but readable; the content is worthy of study; the presentation is effective.