Are All Duplicates Value-Neutral? An Empirical Analysis of Duplicate Issue Reports
Mingyang Li, Lin Shi, Qing Wang
2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS), 2019-07-01
DOI: https://doi.org/10.1109/QRS.2019.00043
Citations: 3
Abstract
Open source communities receive numerous duplicate issue reports, which developers often regard as useless and negligible. Conversely, some studies argue that duplicates deliver complementary information that could benefit issue resolving. Treating all duplicates as value-neutral leads to either overestimating or underestimating the valuable information they carry. It is therefore necessary to know whether all duplicates are redundant or beneficial. In this paper, we investigate whether duplicates have the same impact on issue resolving and identification cost. We divide duplicates into three categories according to the status of the master report at the time each duplicate is submitted. The results show that duplicates in different categories play different roles in issue resolving, and that their identification costs also differ significantly. Our study reveals that duplicates differ, yet they receive almost equal attention. New approaches and tools that address this problem are a promising direction.