A New Graph-Based Algorithm for Clustering Documents

Airel Pérez Suárez, José Francisco Martínez Trinidad, J. A. Carrasco-Ochoa, J. Medina-Pagola
Published in: 2008 IEEE International Conference on Data Mining Workshops, 2008-12-15. DOI: 10.1109/ICDMW.2008.69 (https://doi.org/10.1109/ICDMW.2008.69)
Citations: 6

Abstract

This paper presents CStar, a new algorithm for document clustering. CStar improves on recently developed algorithms such as Generalized Star (GStar) and ACONS, which were originally proposed to reduce some drawbacks of earlier Star-like algorithms. CStar uses the condensed star-shaped sub-graph concept defined by ACONS, but defines a new heuristic that constructs a new cover of the thresholded similarity graph and reduces the drawbacks of GStar and ACONS. Experiments on standard document collections show that the proposal outperforms previously defined algorithms and other related algorithms used for document clustering.
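To make the terms "thresholded similarity graph" and "star-shaped cover" concrete, the following is a minimal sketch in Python. It is not the CStar heuristic, which the abstract does not specify. It shows only the classic greedy Star-style cover that Star-like algorithms build on. The cosine similarity over raw term counts, the whitespace tokenizer, the threshold name `beta`, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def thresholded_graph(docs, beta):
    """Build adjacency sets: documents i and j are linked when
    their similarity is at least beta (assumed threshold name)."""
    vecs = [Counter(d.lower().split()) for d in docs]
    adj = {i: set() for i in range(len(docs))}
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vecs[i], vecs[j]) >= beta:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def greedy_star_cover(adj):
    """Classic Star-style cover (not CStar): visit vertices by
    decreasing degree; each still-uncovered vertex becomes a star
    center, and its cluster is the center plus all its neighbours."""
    covered = set()
    clusters = []
    for v in sorted(adj, key=lambda u: (-len(adj[u]), u)):
        if v not in covered:
            cluster = {v} | adj[v]
            clusters.append((v, cluster))
            covered |= cluster
    return clusters


# Toy collection, purely illustrative.
docs = [
    "graph clustering algorithm",
    "graph clustering method",
    "star cover graph",
    "cooking pasta recipe",
    "pasta recipe sauce",
]
adj = thresholded_graph(docs, beta=0.5)
clusters = greedy_star_cover(adj)
for center, members in clusters:
    print(center, sorted(members))
```

On this toy collection the cover consists of two two-document stars and one singleton. Because a satellite may be adjacent to several centers, clusters produced this way can overlap. According to the abstract, how the centers are chosen is the point on which GStar, ACONS, and CStar differ from one another.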