A distributed graph based approach for rough classifications considering dominance relations between overlapping classes

2015 10th International Conference on Intelligent Systems: Theories and Applications (SITA). Pub Date: 2015-10-20. DOI: 10.1109/SITA.2015.7358388

Khalil Laghmari, M. Ramdani, C. Marsala


Citations: 1

Abstract

Much real-world data involves overlapping classes: an item may belong to several classes with different membership degrees. In this paper, we explore a different concept that characterizes social networks, documents, and most biological and chemical datasets: data can have multiple classes, but dominant classes are noticed more readily than dominated ones. For example, a document may discuss both economy and politics while focusing more on politics. A molecule may have several odors, but experts notice some of them more readily than others. We are interested in this type of data, where a dominance relation exists between classes. Experts can easily make mistakes because dominated classes are hard to notice, which makes the data incoherent. Data incoherence is a serious problem, but not the only one: many attributes are irrelevant or redundant, which increases the computational time of generating classifiers. Our first challenge is to find a model adapted to overlapping classes that takes dominance relations into account. The second challenge is to find the most relevant attributes. The third challenge is to ensure that the approach gives results in an acceptable time. We address these challenges by taking advantage of rough set theory, which is suited to incoherent data and supports multiple classes and attribute selection. The proposed approach works in a parallel and decentralized way to reduce the computational time. We tested it on real chemical data, and the collected results are very promising.
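The abstract does not describe the algorithm itself, so the following is only a minimal sketch of the classical rough set idea it builds on, not the authors' dominance-aware, distributed graph method. It shows how incoherent labels (objects with identical attribute values but different classes) end up between a lower and an upper approximation of a class. The function name `approximations`, the attribute names, and the toy molecules are all hypothetical.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Return the (lower, upper) rough approximations of `target`.

    objects:    dict mapping an object id to a dict of attribute values
    attributes: attribute names used to compare objects
    target:     set of object ids labelled with the class of interest
    """
    # Group objects that are indiscernible on the chosen attributes.
    blocks = defaultdict(set)
    for obj_id, values in objects.items():
        key = tuple(values[a] for a in attributes)
        blocks[key].add(obj_id)

    lower, upper = set(), set()
    for block in blocks.values():
        if block <= target:   # every object in the block has the class
            lower |= block
        if block & target:    # at least one object in the block has the class
            upper |= block
    return lower, upper


# Hypothetical toy data: molecules described by two made-up attributes.
molecules = {
    "m1": {"ring": "yes", "weight": "low"},
    "m2": {"ring": "yes", "weight": "low"},
    "m3": {"ring": "no", "weight": "low"},
    "m4": {"ring": "no", "weight": "high"},
}
# m1 and m2 have identical descriptions but only m1 is labelled "fruity",
# which is the kind of incoherence the abstract mentions.
fruity = {"m1", "m3"}

lower, upper = approximations(molecules, ["ring", "weight"], fruity)
boundary = upper - lower
print(sorted(lower), sorted(upper), sorted(boundary))
```

In this toy case only `m3` is certainly fruity, while `m1` and `m2` fall in the boundary region because the available attributes cannot tell them apart. Dropping an attribute and checking whether the approximations change is one simple way to think about attribute relevance in this framework.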


