{"title":"基于自适应加权相似度的文本文档聚类改进元启发式模型","authors":"Gugulothu Venkanna, K. F. Bharati","doi":"10.1142/s0218488523500356","DOIUrl":null,"url":null,"abstract":"This paper intends to develop a novel framework for text document clustering with the aid of a new improved meta-heuristic algorithm. Initially, the features are selected from the text document by subjecting each word under Term Frequency-Inverse Document Frequency (TF-IDF) computation. Subsequently, centroid selection plays a vital role in cluster formation, which is done using a new Improved Lion Algorithm (LA) termed as Cross over probability-based LA model (CP-LA). As a novelty, this paper introduced a new inter and intracluster similarity model. Moreover, this centroid selection is made in such a way that the proposed adaptive weighted similarity should be minimal. Based on the characteristics of the document, the weights are automatically adapted with the similarity measure. The proposed adaptive weighted similarity function involves the inter-cluster, and intra-cluster similarity of both ordered and unordered documents. Finally, the superiority of the proposed over other models is proved under different performance measures.","PeriodicalId":50283,"journal":{"name":"International Journal of Uncertainty Fuzziness and Knowledge-Based Systems","volume":"37 1","pages":"0"},"PeriodicalIF":1.0000,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Improved Meta-Heuristic Model for Text Document Clustering by Adaptive Weighted Similarity\",\"authors\":\"Gugulothu Venkanna, K. F. Bharati\",\"doi\":\"10.1142/s0218488523500356\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper intends to develop a novel framework for text document clustering with the aid of a new improved meta-heuristic algorithm. Initially, the features are selected from the text document by subjecting each word under Term Frequency-Inverse Document Frequency (TF-IDF) computation. Subsequently, centroid selection plays a vital role in cluster formation, which is done using a new Improved Lion Algorithm (LA) termed as Cross over probability-based LA model (CP-LA). As a novelty, this paper introduced a new inter and intracluster similarity model. Moreover, this centroid selection is made in such a way that the proposed adaptive weighted similarity should be minimal. Based on the characteristics of the document, the weights are automatically adapted with the similarity measure. The proposed adaptive weighted similarity function involves the inter-cluster, and intra-cluster similarity of both ordered and unordered documents. Finally, the superiority of the proposed over other models is proved under different performance measures.\",\"PeriodicalId\":50283,\"journal\":{\"name\":\"International Journal of Uncertainty Fuzziness and Knowledge-Based Systems\",\"volume\":\"37 1\",\"pages\":\"0\"},\"PeriodicalIF\":1.0000,\"publicationDate\":\"2023-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Uncertainty Fuzziness and Knowledge-Based Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1142/s0218488523500356\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Uncertainty Fuzziness and Knowledge-Based Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1142/s0218488523500356","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Improved Meta-Heuristic Model for Text Document Clustering by Adaptive Weighted Similarity
This paper intends to develop a novel framework for text document clustering with the aid of a new improved meta-heuristic algorithm. Initially, the features are selected from the text document by subjecting each word under Term Frequency-Inverse Document Frequency (TF-IDF) computation. Subsequently, centroid selection plays a vital role in cluster formation, which is done using a new Improved Lion Algorithm (LA) termed as Cross over probability-based LA model (CP-LA). As a novelty, this paper introduced a new inter and intracluster similarity model. Moreover, this centroid selection is made in such a way that the proposed adaptive weighted similarity should be minimal. Based on the characteristics of the document, the weights are automatically adapted with the similarity measure. The proposed adaptive weighted similarity function involves the inter-cluster, and intra-cluster similarity of both ordered and unordered documents. Finally, the superiority of the proposed over other models is proved under different performance measures.
期刊介绍:
The International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems is a forum for research on various methodologies for the management of imprecise, vague, uncertain or incomplete information. The aim of the journal is to promote theoretical or methodological works dealing with all kinds of methods to represent and manipulate imperfectly described pieces of knowledge, excluding results on pure mathematics or simple applications of existing theoretical results. It is published bimonthly, with worldwide distribution to researchers, engineers, decision-makers, and educators.