A Survey of Multi-Label Topic Models

Sophie Burkhardt, S. Kramer
Journal: SIGKDD Explorations: Newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining
Volume: 51 1, Pages: 61-79
Published: 2019-11-26
DOI: 10.1145/3373464.3373474 (https://doi.org/10.1145/3373464.3373474)
Citations: 10

Abstract

Every day, an enormous amount of text data is produced. Sources of text data include news, social media, emails, text messages, medical reports, scientific publications and fiction. To keep track of this data, categories, keywords, tags or labels are assigned to each text. Automatically predicting such labels is the task of multi-label text classification. Often, however, we are interested in more than just the pure classification: we would like to understand which parts of a text belong to a label, which words are important for a label, or which labels occur together. Because of this, topic models may be used for multi-label classification as interpretable models that are flexible and easily extensible. This survey demonstrates the manifold possibilities and flexibility of the topic model framework for the complex setting of multi-label text classification by categorizing different variants of models.