Optimized Representation for Classifying Qualitative Data

2010 Second International Conference on Advances in Databases, Knowledge, and Data Applications Pub Date : 2010-04-11 DOI:10.1109/DBKDA.2010.26

M. Cadot, A. Lelu

Citations: 2

Abstract

Extracting knowledge from qualitative data is an ever-growing concern in our networked world. In contrast to the widespread trend of extending general classification methods to zero/one-valued qualitative variables, we explore another path: we first build a specific representation for these data, one that respects the non-occurrence as well as the presence of an item and makes the interactions between variables explicit. Combinatorial considerations in our Midova expansion method limit the proliferation of itemsets when building level k+1 from level k, and bound the maximal level K. We validate our approach on three publicly available datasets from the University of California, Irvine (UCI) repository: to our knowledge, our generalization accuracy is equal to or better than the best reported one on the Breast Cancer and TicTacToe datasets, and respectable on the Monks-2 near-parity problem.
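As an illustration of what a representation that keeps both presence and absence and makes interactions explicit might look like, here is a minimal Python sketch. It is not the Midova method: the abstract does not state the combinatorial criteria Midova uses to limit itemset growth, so this sketch simply enumerates every signed itemset up to a fixed level. The function name `expand`, the parameter `max_level`, and the toy variable names are hypothetical.

```python
from itertools import combinations


def expand(row, names, max_level=2):
    """Return all signed itemsets, up to max_level, that hold for one 0/1 row.

    Assumption: exhaustive enumeration with no pruning. This is only an
    illustration of the idea, not the Midova expansion from the paper.
    """
    # A signed literal (name, 1) means the item is present,
    # and (name, 0) means the item is absent.
    literals = list(zip(names, row))
    features = set()
    for k in range(1, max_level + 1):
        # Each k-combination of literals is one explicit interaction of level k.
        for combo in combinations(literals, k):
            features.add(combo)
    return features


names = ["a", "b", "c"]   # hypothetical binary variables
row = (1, 0, 1)           # a present, b absent, c present
feats = expand(row, names)
```

For this toy row, the absence of `b` appears as the literal `("b", 0)` and takes part in level-2 itemsets such as `(("a", 1), ("b", 0))`, instead of being dropped as it would be in a presence-only encoding. Exhaustive enumeration grows combinatorially with the level, which is the proliferation the abstract says Midova is designed to limit.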

