A. Abdel-Halim, Mustafa Abdel-Azim Mustafa, Khaled El-Bahnasy
Title: A Proposed Rough Set Based Case Base Partitioning Approach to Enhance Indexing Using Unique Combinations
Venue: 2019 Ninth International Conference on Intelligent Computing and Information Systems (ICICIS)
Publication date: 2019-12-01
DOI: 10.1109/ICICIS46948.2019.9014691
Abstract
In this research, a rough set-based case base partitioning approach is proposed to enhance the discovery of all unique feature combinations (UFCs) for each decision in a case base; the UFCs serve as a highly accurate index for the case base. Discovering all UFCs is an NP-hard problem that requires, in principle, verifying an exponential number of feature combinations for uniqueness over all data values. UFC-discovery techniques depend on the entire case base as a single search space (SS), which limits flexibility and makes parallelization and distribution hard. Moreover, the high complexity of some decisions may impede the whole process and can obstruct the computation of all UFCs for decisions with low complexity. Achieving efficiency and scalability in this context is a tremendous challenge in itself. The proposed approach divides the case base into independent, clean, and complete SSs, one for each decision. Each decision's SS is free from useless rules and is used independently to discover all UFCs for that decision. The approach is designed and implemented using MapReduce so that it is applicable to large case bases. The validity of the proposed approach is proved mathematically. Experimental evaluation showed that the SSs were created successfully and that the accuracy and results of the UFC-discovery technique were not affected by partitioning. Applying the UFC-discovery technique to the decisions' SSs sequentially on a single machine performed 80.8% better than before partitioning. It is possible to reach a 96.5% reduction in execution time when the UFC-discovery technique is applied to all decisions' SSs simultaneously, in parallel.
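To make the idea of per-decision UFC discovery concrete, here is a minimal brute-force sketch in Python. It is not the authors' rough-set construction or their MapReduce implementation, neither of which the abstract specifies. It assumes that a UFC for a decision is a minimal set of feature-value pairs that occurs in at least one case of that decision and in no case of any other decision. The toy case base and the names `CASES`, `FEATURES`, `partition_by_decision`, and `unique_combinations` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Made-up toy case base: (feature values, decision). Illustration only.
CASES = [
    ({"fever": "yes", "cough": "yes", "rash": "no"}, "flu"),
    ({"fever": "yes", "cough": "no", "rash": "yes"}, "measles"),
    ({"fever": "no", "cough": "yes", "rash": "no"}, "cold"),
    ({"fever": "yes", "cough": "yes", "rash": "yes"}, "measles"),
]
FEATURES = ["fever", "cough", "rash"]


def partition_by_decision(cases):
    """Group the cases by their decision label."""
    groups = defaultdict(list)
    for features, decision in cases:
        groups[decision].append(features)
    return dict(groups)


def unique_combinations(decision, groups, features):
    """Brute-force search for minimal feature-value combinations that
    appear in `decision` and in no other decision (assumed UFC definition).
    Exponential in the number of features, as the abstract warns."""
    own = groups[decision]
    others = [c for d, cs in groups.items() if d != decision for c in cs]
    found = []
    # Growing subset sizes means smaller combinations are found first.
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            for case in own:
                combo = tuple((f, case[f]) for f in subset)
                # Skip duplicates and supersets of an already found combination.
                if any(set(prev) <= set(combo) for prev in found):
                    continue
                # Unique if every other-decision case differs on some pair.
                if all(any(o[f] != v for f, v in combo) for o in others):
                    found.append(combo)
    return found


groups = partition_by_decision(CASES)
index = {d: unique_combinations(d, groups, FEATURES) for d in groups}
```

Each call to `unique_combinations` only reads its inputs, so the per-decision calls could be dispatched to separate workers. Note that in this sketch every call still scans the cases of the other decisions; the paper's search spaces are described as independent, and how they achieve that is not covered by the abstract.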