An Approach to Detecting Duplicate Bug Reports using N-gram Features and Cluster Shrinkage Technique

IJSRP, Volume 4, Issue 5, May 2014 Edition [ISSN 2250-3153]

Phuc Nhan Minh

Abstract: A duplicate bug report describes a problem for which a report already exists in the bug repository. In many open source projects, duplicate reports make up a significant percentage of the repository, so identifying them automatically is important: it spares triagers the time they would otherwise spend searching for duplicates of each incoming report. In this paper we present a novel approach to improving duplicate bug report identification. The proposed approach has two novel features: first, it uses n-gram features for the task of duplicate bug report detection; second, it applies a cluster shrinkage technique to improve detection performance. We tested our approach on three popular open source projects: Apache, ArgoUML, and SVN. We have also conducted empirical studies. The experimental results show that the proposed scheme can effectively improve detection performance compared with previous methods.
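
The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say whether the n-grams are character-level or word-level, how similarity is measured, or how shrinkage is parameterized. The sketch assumes character trigrams, cosine similarity, and a shrinkage step that moves each report vector toward the centroid of its duplicate cluster by a hypothetical weight `lam`. All function names are illustrative.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count overlapping character n-grams (assumption: character-level, n=3)."""
    s = " ".join(text.lower().split())  # lowercase and collapse whitespace
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Component-wise mean of the vectors in one duplicate cluster."""
    total = Counter()
    for v in vectors:
        total.update(v)  # Counter.update adds counts
    return {k: c / len(vectors) for k, c in total.items()}


def shrink(vec, center, lam=0.5):
    """Move vec toward its cluster centroid: (1 - lam) * vec + lam * center.

    lam is a hypothetical shrinkage weight, not a value taken from the paper.
    """
    keys = set(vec) | set(center)
    return {k: (1 - lam) * vec.get(k, 0.0) + lam * center.get(k, 0.0)
            for k in keys}


# Two reports already known to be duplicates of each other (made-up text).
master = char_ngrams("Crash when saving project file")
dup = char_ngrams("Application crashes on saving the project")
center = centroid([master, dup])

# Score an incoming report against the original and the shrunk master vector.
incoming = char_ngrams("crash while saving a project")
score_plain = cosine(incoming, master)
score_shrunk = cosine(incoming, shrink(master, center))
```

Shrinking pulls the members of a known duplicate cluster closer together, so an incoming report is compared against a representation that reflects the wording of every report in the cluster, not just one of them. Whether this raises or lowers a particular score depends on the reports involved.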

Reference this Research Paper:

Phuc Nhan Minh (2014); An Approach to Detecting Duplicate Bug Reports using N-gram Features and Cluster Shrinkage Technique; Int J Sci Res Publ 4(5) (ISSN: 2250-3153). http://www.ijsrp.org/research-paper-0514.php?rp=P292616