A study on partition quality of Fuzzy Co-clustering with exclusive item memberships

Katsuhiro Honda, Takaya Nakano, S. Ubukata, A. Notsu
Published in: 2015 International Conference on Informatics, Electronics & Vision (ICIEV)
Publication date: 2015-06-15
DOI: 10.1109/ICIEV.2015.7334058 (https://doi.org/10.1109/ICIEV.2015.7334058)

Abstract

Bag-of-Words data analysis is a fundamental task in web data mining for Big Data utilization, and co-clustering is often applied to analyze co-occurrence information in problems such as document-keyword association. In probabilistic partition models such as Multinomial Mixtures and Fuzzy c-Means-type models, different partition constraints are imposed on rows (objects) and columns (items), so item memberships may not be useful for revealing item partitions. One possible approach to improving the interpretability of item partitions is an additional penalty for exclusive item memberships, which was shown to emphasize cluster-wise representative items in document analysis. In this paper, the utility of the penalization approach is further studied by comparing partition quality on several benchmark data sets. Several experimental results show that the additional penalty may sometimes slightly improve partition quality, in addition to improving the interpretability of co-cluster partitions.
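The asymmetry between object and item constraints mentioned in the abstract can be illustrated with a small sketch. It is not the authors' algorithm. It assumes the common Fuzzy c-Means-type convention that object memberships sum to 1 over clusters while item memberships sum to 1 over items within each cluster. The affinity values, the names `u` and `w`, and the `item_overlap` measure are all hypothetical, chosen only to show why item memberships need not form an exclusive partition.

```python
def normalize_rows(matrix):
    """Scale each row of a non-negative matrix so that its entries sum to 1."""
    return [[value / sum(row) for value in row] for row in matrix]


def item_overlap(item_memberships):
    """Illustrative (assumed) overlap measure: for every item, add up the
    products of its memberships over all pairs of distinct clusters.
    It is 0 only when each item belongs to at most one cluster."""
    n_clusters = len(item_memberships)
    n_items = len(item_memberships[0])
    total = 0.0
    for j in range(n_items):
        for c in range(n_clusters):
            for d in range(c + 1, n_clusters):
                total += item_memberships[c][j] * item_memberships[d][j]
    return total


# Hypothetical affinities: 3 documents (objects) x 2 clusters.
object_affinity = [[4.0, 1.0], [1.0, 1.0], [1.0, 3.0]]
# Hypothetical affinities: 2 clusters x 4 keywords (items).
item_affinity = [[2.0, 2.0, 1.0, 0.0], [2.0, 0.0, 1.0, 2.0]]

# Object memberships: each object's memberships over clusters sum to 1.
u = normalize_rows(object_affinity)
# Item memberships: within each cluster, memberships over items sum to 1.
w = normalize_rows(item_affinity)

# Nothing forces an item's memberships over clusters to sum to 1,
# so item 0 can be equally typical of both clusters.
item_totals = [sum(w[c][j] for c in range(len(w))) for j in range(len(w[0]))]

print("item totals over clusters:", item_totals)
print("overlap:", item_overlap(w))
```

In this toy setting, items 0 and 2 are shared by both clusters, so the overlap is positive. A penalty that grows with such overlap would push each item toward a single cluster, which is the kind of exclusivity the abstract refers to.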