Clustering and topic modeling over tweets: A comparison over a health dataset.

Juan Antonio Lossio-Ventura, Juandiego Morzan, Hugo Alatrista-Salas, Tina Hernandez-Boussard, Jiang Bian
Proceedings. IEEE International Conference on Bioinformatics and Biomedicine, vol. 2019, pp. 1544-1547. Published 2019-11-01. DOI: https://doi.org/10.1109/bibm47256.2019.8983167
Citations: 7

Abstract

Twitter has become the most popular form of social interaction in the healthcare domain. Various teams have therefore evaluated Twitter as an additional source where patients share information about their healthcare, with the potential goal of improving their outcomes. Several existing topic modeling and document clustering applications have been adapted to assess tweets, showing that their performance is negatively affected by the nature and characteristics of tweets. Moreover, Twitter health research has become difficult to measure because the existing applications have not been compared with one another. In this paper, we evaluate different topic modeling and document clustering applications on two health-related Twitter datasets using internal indexes. Our results show that Online Twitter LDA and Gibbs LDA perform better at extracting topics and grouping tweets. We provide this comparison to help health practitioners select the most suitable application for their tasks.
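The abstract does not say which internal indexes were used. As a rough illustration of what an internal index measures, the sketch below computes the mean silhouette coefficient, one commonly used internal index, which scores a grouping from the data alone without reference labels. The choice of silhouette, the Euclidean distance, the function names, and the toy 2-D points (standing in for tweet feature vectors) are all assumptions made for this example, not details taken from the paper.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (an internal index).

    Values near 1 mean points sit close to their own cluster and far
    from the nearest other cluster; negative values suggest misgrouping.
    """
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = [0, 0, 1, 1]  # groups nearby points together
bad_labels = [0, 1, 0, 1]   # mixes distant points in each group

print("good grouping:", silhouette_score(points, good_labels))
print("bad grouping:", silhouette_score(points, bad_labels))
```

On this toy data the grouping that keeps nearby points together scores higher than the one that mixes them, which is the kind of signal an internal index gives when no ground-truth labels are available.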