Improving the accuracy of duplicate bug report detection using textual similarity measures

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR). Pub Date: 2014-05-31. DOI: 10.1145/2597073.2597088

A. Lazar, Sarah Ritchey, Bonita Sharif

Citations: 71

Abstract

This paper describes an improved method for automatic duplicate bug report detection based on new textual similarity features and binary classification. Using a set of new textual features inspired by recent text similarity research, we train several binary classification models. We conducted a case study on three open source systems (Eclipse, Open Office, and Mozilla) to determine the effectiveness of the improved method. We also compare the method with current state-of-the-art approaches, highlighting similarities and differences. Results indicate that the proposed method is more accurate than previously reported research on all three systems.
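
To make the general idea concrete, the following is a minimal sketch of pairwise duplicate detection: compute textual similarity features for a pair of bug reports, then feed them to a binary classifier. It is not the authors' feature set or model. The two features (Jaccard and cosine similarity over word tokens), the tiny logistic regression, the toy report pairs, and all function names are assumptions chosen for illustration only.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Split text into lowercase alphanumeric tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard(a, b):
    """Jaccard similarity between the token sets of two reports."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cosine(a, b):
    """Cosine similarity between raw term-frequency vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def pair_features(report_a, report_b):
    """Hypothetical feature vector for one pair of bug reports."""
    ta, tb = tokens(report_a), tokens(report_b)
    return [jaccard(ta, tb), cosine(ta, tb)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit a logistic regression with plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


# Toy labeled pairs (invented for this sketch): 1 = duplicate, 0 = not duplicate.
PAIRS = [
    ("Editor crashes when opening large file",
     "Crash in editor on opening a large file", 1),
    ("Toolbar icons missing after update",
     "Icons missing from toolbar after update", 1),
    ("Editor crashes when opening large file",
     "Toolbar icons missing after update", 0),
    ("Cannot save bookmark in sidebar",
     "Printing a page produces blank output", 0),
    ("Crash when saving file",
     "File menu is missing after update", 0),
]

X = [pair_features(a, b) for a, b, _ in PAIRS]
y = [label for _, _, label in PAIRS]
weights, bias = train_logistic(X, y)


def is_duplicate(report_a, report_b):
    """Classify a pair as duplicate when the predicted probability is at least 0.5."""
    x = pair_features(report_a, report_b)
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x))) >= 0.5
```

A real system would train on many labeled pairs drawn from a bug tracker and evaluate on held-out data; the five pairs here only show how the pieces fit together.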

