Identifying Similar Software Datasets through Fuzzy Inference System

2012 10th International Conference on Frontiers of Information Technology. Pub Date: 2012-12-17. DOI: 10.1109/FIT.2012.40

S. Anwar, Z. Rana, M. Awais


Abstract

Similar software systems have similar software measurements, so defect data from one system can be used to anticipate defects in a similar system. Although few defect datasets are publicly available in the software engineering domain, the PROMISE repository offers a reasonable collection of software data. This paper presents a two-step approach to identifying similar software and applies it to find similar datasets in the PROMISE repository. In step 1, the approach generates association rules for each dataset to characterize the dataset's behavior in terms of frequent patterns. In step 2, the overlap between the association rules is calculated using Fuzzy Inference Systems (FIS). Both expert-based and auto-generated FIS were built for the study. Similarity was computed for 28 dataset pairs; KC2 and PC1 turned out to be the most similar datasets, with 86% similarity using the Mamdani model and 92% using the Sugeno model. Results from the expert-based and auto-generated FIS were comparable.
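The two steps can be illustrated with a minimal sketch. It does not reproduce the paper's method: the abstract does not specify the FIS inputs, membership functions, or rule base, so everything below is an assumption made for illustration. Here the overlap between two rule sets is taken to be their Jaccard index, and a single-input, zero-order Sugeno system with triangular memberships maps that overlap to a similarity score. The names `rule_overlap`, `tri`, and `sugeno_similarity`, and the example rules, are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# An association rule as (antecedent items, consequent items).
Rule = Tuple[FrozenSet[str], FrozenSet[str]]


def rule(antecedent: Iterable[str], consequent: Iterable[str]) -> Rule:
    """Build a hashable rule so that rules can be stored in sets."""
    return (frozenset(antecedent), frozenset(consequent))


def rule_overlap(a: Set[Rule], b: Set[Rule]) -> float:
    """Jaccard index of two rule sets, in [0, 1] (assumed overlap measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def tri(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership function with its maximum of 1.0 at `peak`."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def sugeno_similarity(overlap: float) -> float:
    """Zero-order Sugeno inference: weighted average of constant outputs.

    Assumed rule base (not taken from the paper):
      IF overlap is low    THEN similarity = 0.0
      IF overlap is medium THEN similarity = 0.5
      IF overlap is high   THEN similarity = 1.0
    """
    x = min(max(overlap, 0.0), 1.0)  # clamp so at least one rule fires
    strengths = [
        tri(x, -0.5, 0.0, 0.5),  # low
        tri(x, 0.0, 0.5, 1.0),   # medium
        tri(x, 0.5, 1.0, 1.5),   # high
    ]
    outputs = [0.0, 0.5, 1.0]
    return sum(w * o for w, o in zip(strengths, outputs)) / sum(strengths)


# Made-up rule sets for two datasets; not mined from PROMISE data.
rules_a = {
    rule(["loc=high"], ["defect=yes"]),
    rule(["complexity=high"], ["defect=yes"]),
    rule(["loc=low"], ["defect=no"]),
}
rules_b = {
    rule(["loc=high"], ["defect=yes"]),
    rule(["complexity=high"], ["defect=yes"]),
    rule(["fanout=high"], ["defect=yes"]),
}

overlap = rule_overlap(rules_a, rules_b)
similarity = sugeno_similarity(overlap)
print(f"overlap={overlap:.2f} similarity={similarity:.2f}")
```

With a single input and evenly spaced memberships, this system reduces to interpolating between the rule outputs. A realistic FIS would likely combine several inputs, and a Mamdani variant would replace the constant outputs with fuzzy sets followed by defuzzification.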
