Improving Multimodal Data Labeling with Deep Active Learning for Post Classification in Social Networks

Dmitry Krylov, S. Poliakov, N. Khanzhina, Alexey Zabashta, A. Filchenkov, Aleksandr Farseev
DOI: 10.1145/3476098.3485055 (https://doi.org/10.1145/3476098.3485055)
Published in: Multimedia Understanding with Less Labeling on Multimedia Understanding with Less Labeling
Publication date: 2021-10-20
Citations: 1

Abstract

Automatic classification of user posts is an important task in social network analysis. Solved effectively, it can support thematic feed composition and the identification of inappropriate content. The task is commonly addressed with Machine Learning approaches, which often depend on manual ground-truth labeling, a procedure known to scale poorly and to grow increasingly expensive. Active Learning is a promising way to bridge this gap: it does not require a large amount of ground truth, which aligns our research with real-world settings. In this work, we focus on leveraging the textual and visual modalities for user post classification and investigate how batch size and the disabling of batch normalization affect the active learning process of a deep neural network. We address the problem with a novel multimodal neural network architecture that uses a multi-head loss function with tunable components. We show that the proposed approach, coupled with Active Learning, achieves a significant boost in classification performance relative to the crowd-assessment resources spent, compared to passive learning approaches.
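
To make the role of batch size in Active Learning concrete, here is a minimal sketch of one pool-based query step. The abstract does not say which acquisition function the authors use, so this sketch assumes entropy-based uncertainty sampling, a common baseline. The names `entropy`, `select_batch`, and `pool_probs`, and the probability values, are hypothetical and do not come from the paper.

```python
import numpy as np


def entropy(probs: np.ndarray) -> np.ndarray:
    """Shannon entropy of each row of predicted class probabilities."""
    eps = 1e-12  # guards against log(0)
    return -np.sum(probs * np.log(probs + eps), axis=1)


def select_batch(probs: np.ndarray, batch_size: int) -> np.ndarray:
    """Return indices of the `batch_size` most uncertain pool items.

    Assumption: uncertainty is measured by predictive entropy; the paper
    may use a different acquisition function.
    """
    scores = entropy(probs)
    # argsort is ascending, so take the tail and reverse it
    # to list the most uncertain item first.
    return np.argsort(scores)[-batch_size:][::-1]


# Hypothetical model outputs for four unlabeled posts and three classes.
pool_probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
])

# Posts at these pool indices would be sent to annotators next.
query = select_batch(pool_probs, batch_size=2)
print(query.tolist())
```

In a full loop, the selected posts would be labeled, added to the training set, and the model retrained before the next query. A larger `batch_size` means fewer retraining rounds but less frequent updates to the uncertainty estimates, which is the kind of trade-off the abstract says the authors investigate.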