Utilizing Age-Adaptive Deep Learning Approaches for Detecting Inappropriate Video Content

IF 4.3 Q1 PSYCHOLOGY, MULTIDISCIPLINARY
Iftikhar Alam, Abdul Basit, Riaz Ahmad Ziar
DOI: 10.1155/2024/7004031
Journal: Human Behavior and Emerging Technologies
Published: 2024-06-19
Full text: https://onlinelibrary.wiley.com/doi/10.1155/2024/7004031
Citations: 0

Abstract

The exponential growth of video-sharing platforms such as YouTube and Netflix has made videos available to everyone with minimal restrictions. This proliferation offers a wide variety of content, but it also leaves children and adolescents more exposed to potentially harmful material, notably explicit content. Despite efforts to develop content moderation tools, a research gap remains in creating comprehensive solutions that can reliably estimate users' ages and accurately classify the many forms of inappropriate video content. This study aims to bridge that gap by introducing VideoTransformer, which combines two existing models: AgeNet and MobileNetV2. To evaluate the proposed approach, the study used a manually annotated video dataset collected from YouTube, covering multiple categories, including safe, real violence, drugs, nudity, simulated violence, kissing, pornography, and terrorism. Compared with existing models, VideoTransformer performs better in two separate accuracy evaluations. In a 5-fold cross-validation setup it achieves an accuracy of 96.89%, outperforming NasNet (92.6%), EfficientNet-B7 (87.87%), GoogLeNet (85.1%), and VGG-19 (92.83%). In a single run it reaches an accuracy of 90%. The proposed model also attains an F1-score of 90.34%, indicating a balanced trade-off between precision and recall. These findings highlight the potential of the proposed approach for advancing content moderation and improving user safety on video-sharing platforms. We envision deploying the proposed methodology in real-time video streaming to limit the spread of inappropriate content, thereby raising online safety standards.
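The abstract reports accuracy under 5-fold cross-validation and an F1-score, but does not say how folds were drawn or how the F1-score was averaged across categories. The following is a minimal sketch of how such metrics are commonly computed, not the authors' evaluation code. The function names (`k_fold_indices`, `accuracy`, `macro_f1`), the contiguous unshuffled folds, and the choice of macro averaging are all assumptions made for illustration.

```python
from collections import Counter

# Category names as listed in the abstract; the full label set is an assumption.
CATEGORIES = ["safe", "real violence", "drugs", "nudity",
              "simulated violence", "kissing", "pornography", "terrorism"]


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) for k contiguous, near-equal folds.

    Hypothetical helper: a real setup would usually shuffle or stratify first.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 = 2*TP / (2*TP + FP + FN).

    Macro averaging is an assumption; a class with no true or predicted
    samples contributes 0.0 here.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

In a cross-validated evaluation, a model would be trained on each `train_idx`, scored on the matching `test_idx`, and the per-fold accuracies averaged. The F1-score combines precision and recall per category, which is why it is reported alongside accuracy when the categories may be imbalanced.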

Source journal: Human Behavior and Emerging Technologies
Subject area: Social Sciences (all)
CiteScore: 17.20
Self-citation rate: 8.70%
Articles published: 73
About the journal: Human Behavior and Emerging Technologies is an interdisciplinary journal dedicated to publishing high-impact research that enhances understanding of the complex interactions between diverse human behavior and emerging digital technologies.