Separating the Wheat from the Chaff: Using Indexing and Sub-Sequence Mining Techniques to Identify Related Crashes During Bug Triage
Authors: Kedrian James, Yufei Du, Sanjeev Das, F. Monrose
Published in: 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)
Publication date: 2022-12-01
DOI: 10.1109/QRS57517.2022.00014 (https://doi.org/10.1109/QRS57517.2022.00014)
Citations: 1
Abstract
Bug triaging is a laborious process in which triagers examine new bug reports, localize the bugs, and assign them to the appropriate developer(s) to fix. In recent years, the adoption of automated software testing techniques (e.g., fuzzing) has further complicated the process because bug hunters can submit an overwhelming number of reports in a short period. To lessen these pain points, we present an approach that extracts a fingerprint from the crash information in a bug report and returns a group of bugs with similar behavior. Our approach uses symptoms of the crash to create a robust fingerprint, leverages MinHashing and Locality Sensitive Hashing to match crashes, and applies a sequential pattern mining algorithm to find frequent closed sequences among bugs. Our evaluation shows that our approach outperforms contemporary approaches (e.g., by finding previously unknown duplicates among 81 CVEs) and saves triagers time and effort.
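
The abstract names MinHashing and Locality Sensitive Hashing as the matching step but does not describe the fingerprint itself. As a rough illustration only, and not the authors' implementation, the sketch below assumes a crash is summarized by its list of stack-frame names, shingles that list into consecutive pairs, builds a MinHash signature, and uses LSH banding to propose candidate duplicates. The crash IDs, frame names, and the parameters `NUM_HASHES`, `BANDS`, and `k` are all hypothetical.

```python
import hashlib
from collections import defaultdict
from itertools import combinations

NUM_HASHES = 64  # signature length (assumed value)
BANDS = 16       # LSH bands; each band covers NUM_HASHES // BANDS rows


def shingles(frames, k=2):
    """Return the set of k-grams of consecutive stack frames."""
    if len(frames) < k:
        return {tuple(frames)}
    return {tuple(frames[i:i + k]) for i in range(len(frames) - k + 1)}


def _hash(seed, item):
    """Deterministic 64-bit hash of a shingle under a given seed."""
    data = f"{seed}:{'|'.join(item)}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash(items):
    """MinHash signature: the minimum hash value per seeded hash function."""
    return tuple(min(_hash(seed, it) for it in items) for seed in range(NUM_HASHES))


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions approximates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def candidate_pairs(signatures):
    """Crashes that share at least one identical band become candidate pairs."""
    rows = NUM_HASHES // BANDS
    pairs = set()
    for b in range(BANDS):
        buckets = defaultdict(list)
        for crash_id, sig in signatures.items():
            buckets[sig[b * rows:(b + 1) * rows]].append(crash_id)
        for ids in buckets.values():
            pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical crashes: "a" and "b" share a stack, "c" is unrelated.
crashes = {
    "a": ["main", "parse_file", "read_chunk", "memcpy"],
    "b": ["main", "parse_file", "read_chunk", "memcpy"],
    "c": ["main", "render", "draw_glyph", "free"],
}
signatures = {cid: minhash(shingles(frames)) for cid, frames in crashes.items()}
pairs = candidate_pairs(signatures)
```

Banding trades precision for recall: more bands with fewer rows each surface more loosely related crashes as candidates, which a triager (or a second, exact comparison) would then confirm or reject.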