Spam detection for YouTube video comments using machine learning approaches

Andrew S. Xiao, Qilian Liang

Machine Learning with Applications, Volume 16, Article 100550. Journal article, published 2024-04-16.
DOI: 10.1016/j.mlwa.2024.100550
Full text: https://www.sciencedirect.com/science/article/pii/S2666827024000264
Citations: 0

Abstract

Machine learning models can streamline the filtering of YouTube video comments into legitimate comments (ham) and spam. To integrate machine learning models into regular use on media-sharing platforms, recent approaches have aimed to develop models trained on YouTube comments; these models have emerged as valuable classification tools that identify spam content and enhance the user experience. In this paper, eight machine learning approaches are applied to spam detection for YouTube comments: Gaussian Naive Bayes, logistic regression, a K-nearest neighbors (KNN) classifier, a multi-layer perceptron (MLP), a support vector machine (SVM) classifier, a random forest classifier, a decision tree classifier, and a voting classifier. All eight models perform very well; in particular, the random forest approach achieves almost perfect performance, with an average precision of 100% and an AUC-ROC of 0.9841. The computational complexity of the eight approaches is also compared.
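
The two headline figures measure different things, which is why a precision of 100% can coexist with an AUC-ROC below 1. The sketch below is illustrative only. The labels, scores, and 0.5 threshold are made-up values rather than data from the paper. It also assumes that the reported precision is the usual TP / (TP + FP) at a fixed decision threshold; the abstract does not define "average precision", so this reading is an assumption.

```python
def precision(y_true, y_pred):
    """Fraction of comments flagged as spam that really are spam: TP / (TP + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def auc_roc(y_true, scores):
    """AUC-ROC computed as the probability that a randomly chosen spam comment
    receives a higher score than a randomly chosen ham comment (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one spam and one ham example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = spam, 0 = ham) and classifier scores, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.95, 0.90, 0.80, 0.30, 0.40, 0.20, 0.10, 0.05]

threshold = 0.5  # assumed decision threshold, not taken from the paper
y_pred = [1 if s >= threshold else 0 for s in scores]

print(precision(y_true, y_pred))  # 1.0: nothing flagged as spam is actually ham
print(auc_roc(y_true, scores))    # 0.9375: one spam comment is ranked below one ham comment
```

In this toy case every flagged comment is spam, so precision is perfect, yet one spam comment scores lower than a ham comment, which pulls the ranking-based AUC-ROC below 1.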
