Xiaojuan Wang, Wenyu Zhang, Shanyan Lai, Chunyang Ye, Hui Zhou
The Use of Pretrained Model for Matching App Reviews and Bug Reports
2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS), published 2022-12-01
DOI: 10.1109/QRS57517.2022.00034 (https://doi.org/10.1109/QRS57517.2022.00034)
Citation count: 0
Abstract
Matching app reviews with bug reports can help app developers quickly identify new bugs from user feedback. Existing solutions represent the semantics of app reviews and bug reports via carefully designed features and models, whose performance, however, depends heavily on the manually designed model and the training data set. Large-scale pretrained models capture the semantics of text well and have proven successful in many NLP tasks. Inspired by this, we explore the effect of various pretrained models on the accuracy of matching app reviews with bug reports. We conduct a systematic study of how four major pretrained models (including T5, Sentence T5, Sentence MiniLM, Sentence BERT and so on) affect matching accuracy. We find that on four open source applications, the accuracy of Sentence T5 and Sentence MiniLM is significantly greater than that of the state-of-the-art approach DeepMatcher. Based on these findings, we design a novel approach that matches app reviews with bug reports by using the pretrained models Sentence T5 and Sentence MiniLM to calculate sentence similarity. We test it on four open source applications, and the results show that our method outperforms the existing solution. On average, Sentence T5 and Sentence MiniLM increase precision by 17% and 13%, respectively, and hit ratio by 15% and 14%, respectively.
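To make the idea of matching by sentence similarity concrete, the following is a minimal sketch, not the authors' implementation. It assumes each review and bug report is mapped to a vector and that candidates are ranked by cosine similarity; the abstract does not specify the pipeline details, thresholds, or ranking depth. The names `toy_embed`, `cosine`, and `match_review`, as well as the sample texts, are hypothetical. `toy_embed` is a plain bag-of-words stand-in so the example runs without downloading a model; in practice it would be replaced by a pretrained sentence encoder such as Sentence T5 or Sentence MiniLM.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in encoder: lowercase bag-of-words counts.

    This is NOT a pretrained model. It only marks the place where a
    sentence encoder (e.g. Sentence T5 or Sentence MiniLM) would go.
    """
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    return dict(Counter(t for t in tokens if t))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_review(
    review: str,
    bug_reports: List[str],
    embed: Callable[[str], Vector] = toy_embed,
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return the top_k (bug_report_index, similarity) pairs for one review."""
    review_vec = embed(review)
    scored = [(i, cosine(review_vec, embed(b))) for i, b in enumerate(bug_reports)]
    # Highest similarity first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Illustrative data only; not taken from the paper's data set.
bug_reports = [
    "App crashes when opening the settings screen",
    "Sync fails on mobile data",
    "Dark theme text is unreadable",
]
review = "The app crashes every time I open settings"

ranked = match_review(review, bug_reports, top_k=2)
for index, score in ranked:
    print(f"{score:.3f}  {bug_reports[index]}")
```

Passing the encoder in as the `embed` parameter keeps the ranking logic independent of the model, which is one way different pretrained models could be compared on the same review and bug report pairs.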