{"title":"Correlated Topic Modeling for Short Texts in Spherical Embedding Spaces.","authors":"Hafsa Ennajari, Nizar Bouguila, Jamal Bentahar","doi":"10.1109/TPAMI.2025.3550032","DOIUrl":null,"url":null,"abstract":"<p><p>With the prevalence of short texts in various forms such as news headlines, tweets, and reviews, short text analysis has gained significant interest in recent times. However, modeling short texts remains a challenging task due to its sparse and noisy nature. In this paper, we propose a new Spherical Correlated Topic Model (SCTM), which takes into account the correlation between topics. Our model integrates word and knowledge graph embeddings to better capture the semantic relationships among short texts. We adopt the von Mises-Fisher distribution to model the high-dimensional word and entity embeddings on a hypersphere, enabling better preservation of the angular relationships between topic vectors. Moreover, knowledge graph embeddings are incorporated to further enrich the semantic meaning of short texts. Experimental results on several datasets demonstrate that our proposed SCTM model outperforms existing models in terms of both topic coherence and document classification. In addition, our model is capable of providing interpretable topics and revealing meaningful correlations among short texts.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TPAMI.2025.3550032","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
With the prevalence of short texts in various forms such as news headlines, tweets, and reviews, short text analysis has gained significant interest in recent times. However, modeling short texts remains a challenging task due to its sparse and noisy nature. In this paper, we propose a new Spherical Correlated Topic Model (SCTM), which takes into account the correlation between topics. Our model integrates word and knowledge graph embeddings to better capture the semantic relationships among short texts. We adopt the von Mises-Fisher distribution to model the high-dimensional word and entity embeddings on a hypersphere, enabling better preservation of the angular relationships between topic vectors. Moreover, knowledge graph embeddings are incorporated to further enrich the semantic meaning of short texts. Experimental results on several datasets demonstrate that our proposed SCTM model outperforms existing models in terms of both topic coherence and document classification. In addition, our model is capable of providing interpretable topics and revealing meaningful correlations among short texts.