Semantic-based topic model for public opinion analysis in sudden-onset disasters
Authors: Yulong Ma, Xinsheng Zhang, Runzhou Wang
Journal: Applied Soft Computing, volume 170, Article 112700 (published 2025-02-01)
DOI: 10.1016/j.asoc.2025.112700
URL: https://www.sciencedirect.com/science/article/pii/S1568494625000110
Journal metrics: impact factor 7.2; JCR Q1 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Sudden-onset disasters place stringent demands on governments' public opinion analysis. However, most existing topic models ignore the contextual semantics of disaster texts and fail to balance robustness against training cost. To address these issues, this work proposes a neural clustering topic model. The topic probability distribution of an LDA model is fused with the distributed semantic vector generated by a lite BERT. The fused vectors are reconstructed by a nonlinear manifold learning algorithm and re-clustered into topics by a mini-batch k-means++ algorithm. Compared with state-of-the-art models on three sudden-onset disaster datasets, the proposed model improves average topic coherence by 1.79 % and topic diversity by 33.87 %, while reducing inference time by 84.09 % on average. A visual study of the model's latent process suggests that its better performance may stem from its ability to compact intra-cluster vector distances and widen inter-cluster vector distances. The proposed model may therefore help governments manage negative public opinion during sudden-onset disasters.
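The abstract gives no implementation details, so the following is only a minimal sketch of the fusion and re-clustering stages under stated assumptions. It assumes fusion is a plain concatenation with a hypothetical weight `gamma`, uses hand-made toy vectors in place of real LDA and lite-BERT outputs, and skips the manifold learning step entirely. All function names are illustrative and are not taken from the paper.

```python
import random


def fuse(topic_probs, embedding, gamma=1.0):
    """Concatenate a topic distribution with a semantic vector.

    gamma is a hypothetical weight on the topic part (an assumption,
    not a parameter described in the abstract).
    """
    return [gamma * p for p in topic_probs] + list(embedding)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))


def kmeanspp_init(points, k, rng):
    """k-means++ seeding: draw each new center with probability ~ D^2."""
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:
            # Every point coincides with a center; fall back to uniform.
            centers.append(list(rng.choice(points)))
        else:
            centers.append(list(rng.choices(points, weights=d2, k=1)[0]))
    return centers


def mini_batch_kmeans(points, k, batch_size=4, n_iter=50, seed=0):
    """Mini-batch k-means with a per-center learning rate of 1 / count."""
    rng = random.Random(seed)
    centers = kmeanspp_init(points, k, rng)
    counts = [0] * k
    for _ in range(n_iter):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign the whole batch first, then move the centers.
        assigned = [nearest(p, centers) for p in batch]
        for p, j in zip(batch, assigned):
            counts[j] += 1
            eta = 1.0 / counts[j]
            centers[j] = [(1 - eta) * c + eta * x
                          for c, x in zip(centers[j], p)]
    labels = [nearest(p, centers) for p in points]
    return centers, labels


# Toy data (made up): (topic distribution, semantic vector) per document.
docs = [
    ([0.90, 0.10], [1.00, 0.00]),
    ([0.80, 0.20], [0.90, 0.10]),
    ([0.85, 0.15], [0.95, 0.05]),
    ([0.10, 0.90], [0.00, 1.00]),
    ([0.20, 0.80], [0.10, 0.90]),
    ([0.15, 0.85], [0.05, 0.95]),
]
points = [fuse(t, e) for t, e in docs]
centers, labels = mini_batch_kmeans(points, k=2)
print(labels)
```

Each mini-batch step touches only `batch_size` vectors, which is the usual reason this variant is cheaper than full-batch k-means on large corpora. Whether that accounts for the inference-time reduction reported above is not something the abstract states.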
About the journal:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest-quality research on the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and other similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.