Satyabrata Pradhan, Venky Nanniyur, Pavan K. Vissapragada
Title: On the Defect Prediction for Large Scale Software Systems – From Defect Density to Machine Learning
Published in: 2020 IEEE 20th International Conference on Software Quality, Reliability and Security (QRS)
Publication date: 2020-12-01
DOI: 10.1109/QRS51102.2020.00056
Citations: 5
Abstract
As the software industry transitions to the software-as-a-service (SaaS) model, companies face tremendous competitive pressure to improve software quality at a much faster rate than before. Software defect prediction (SDP) plays an important role in this effort by enabling predictive quality management throughout the software development lifecycle (SDLC). SDP has traditionally relied on defect density and other parametric models. However, recent advances in machine learning and artificial intelligence (ML/AI) have renewed interest in ML-based defect prediction among academic researchers and industry practitioners. Published studies on this subject have focused on two areas, model attributes and ML algorithms, to develop SDP models for small to medium-sized software (mostly open source). As we show in this paper, however, ML-based SDP for large-scale software with hundreds of millions of lines of code (LOC) must also address challenges in two additional areas, "Data Definition" and "SDP Lifecycle." We propose solutions to these challenges and use the example of a large-scale software system (IOS-XE) developed by Cisco Systems to show the validity of our solutions.