Hybrid Multi Attribute Relation Method for Document Clustering for Information Mining

S. Tejasree, B. Chandramohan
Journal: Int. J. Commun. Networks Inf. Secur.
Publication date: 2022-12-31
Publication type: Journal Article
DOI: 10.17762/ijcnis.v14i1s.5596 (https://doi.org/10.17762/ijcnis.v14i1s.5596)
Citations: 0

Abstract

Text clustering is widely used to partition a specific collection of documents into subsets according to homogeneity/heterogeneity criteria. It has also become a complex area of research that spans pattern recognition, information retrieval, and text mining. In enterprise applications, information mining is challenging because data are distributed in complex ways across an enormous number of sources. Most of these sources come from different domains, which makes it difficult to identify the relationships among the information they hold. In this setting, a single clustering method limits the related information that can be recovered while increasing computational overhead and processing time. Identifying suitable clustering models for unsupervised learning is therefore a challenge, particularly when the data distributions involve multiple attributes. Recent work has given significant importance to attribute-relation-based solutions for document clustering. To improve on these, this paper presents Hybrid Multi Attribute Relation Methods (HMARs) for attribute selection and relation analysis in the co-clustering of datasets. The proposed HMARs allow the attributes distributed across documents to be analyzed as probabilistic attribute relations using modified Bayesian mechanisms. They also provide a way to identify the most related attribute model so that multi-attribute documents can be clustered accurately. An experimental evaluation on data from the UCI data repository assesses the clustering purity and normalization of the information, and shows a 25% improvement over previous techniques.
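The abstract reports clustering purity as an evaluation measure but does not say how it was computed. As a rough illustration only, the sketch below uses the standard textbook definition of purity: each cluster is credited with the size of its majority class, and the total is divided by the number of documents. The function name `clustering_purity` and the toy cluster assignments and labels are assumptions made for this example; they are not taken from the paper and this is not the authors' implementation.

```python
from collections import Counter, defaultdict


def clustering_purity(cluster_ids, class_labels):
    """Return purity: the fraction of items that belong to the
    majority class of the cluster they were assigned to.

    Hypothetical helper for illustration; not from the paper.
    """
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have equal length")
    if len(cluster_ids) == 0:
        raise ValueError("at least one item is required")

    # Count how often each true class appears inside each cluster.
    class_counts_by_cluster = defaultdict(Counter)
    for cluster_id, label in zip(cluster_ids, class_labels):
        class_counts_by_cluster[cluster_id][label] += 1

    # Each cluster contributes the size of its most frequent class.
    majority_total = sum(
        max(counts.values()) for counts in class_counts_by_cluster.values()
    )
    return majority_total / len(cluster_ids)


# Toy data (assumed): 8 documents, 3 clusters, 3 topic labels.
clusters = [0, 0, 0, 1, 1, 1, 2, 2]
labels = ["sports", "sports", "politics",
          "politics", "politics", "tech",
          "tech", "tech"]

purity = clustering_purity(clusters, labels)
print(purity)  # 0.75: majority sizes 2 + 2 + 2 over 8 documents
```

Purity rises trivially as the number of clusters grows (one document per cluster gives a purity of 1.0), which is one reason it is usually reported alongside a normalized, information-based measure.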