SmokerViT: A Transformer-Based Method for Smoker Recognition

Ali Khan, Somaiya Khan, Bilal Hassan, Rizwan Khan, Zhonglong Zheng
Journal: Computers, Materials & Continua
Publication date: 2023-01-01
DOI: 10.32604/cmc.2023.040251 (https://doi.org/10.32604/cmc.2023.040251)
Citations: 0

Abstract

SmokerViT: A Transformer-Based Method for Smoker Recognition
Smoking has an economic and environmental impact on society because of the toxic substances it emits. Convolutional Neural Networks (CNNs) struggle to describe low-level features and can miss important information. Moreover, accurate smoker detection with minimal false alarms is vital. To address this issue, this paper turns to a self-attention mechanism inspired by the Vision Transformer (ViT), which has displayed state-of-the-art performance in classification tasks. To help enforce smoking prohibitions in non-smoking locations, this work presents a ViT-inspired model called SmokerViT for detecting smokers. The research uses a locally curated dataset of 1120 images evenly distributed between two classes (Smoking and NotSmoking) and applies augmentations to the dataset, producing many images with varied representations to overcome the dataset size limitation. Unlike the convolutional operations used in most existing works, the proposed SmokerViT model employs a self-attention mechanism in the Transformer block, making it suitable for the smoker classification problem. In addition, this work integrates a multi-layer perceptron head block into the SmokerViT model, which contains dense layers with rectified linear activation and an L2 kernel regularizer for the recognition task. This work presents an exhaustive analysis to demonstrate the efficiency of the proposed SmokerViT model. The performance of SmokerViT is evaluated and compared with existing methods: it achieves an overall classification accuracy of 97.77%, with 98.21% recall and 97.35% precision, outperforming state-of-the-art deep learning models, including CNNs and other vision transformer-based models.
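The self-attention mechanism mentioned in the abstract can be illustrated with a minimal sketch of generic single-head scaled dot-product attention over patch embeddings. This is not the SmokerViT implementation: the abstract gives no layer sizes, head counts, or weights, so the function names, the toy 3-patch input, and the identity projection matrices below are all assumptions made purely for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (generic sketch).

    tokens: n x d patch embeddings; w_q, w_k, w_v: d x d_k projections.
    Returns (output, weights) with shapes n x d_k and n x n.
    """
    q = matmul(tokens, w_q)
    k = matmul(tokens, w_k)
    v = matmul(tokens, w_v)
    scale = math.sqrt(len(k[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    # scores[i][j]: how strongly patch i attends to patch j
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights


# Hypothetical toy input: 3 patch embeddings of width 2, identity projections.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(tokens, identity, identity, identity)
```

Each row of `weights` sums to 1, so every output row is a weighted average of the value vectors of all patches. This global mixing across patches is what distinguishes the Transformer block from the local receptive field of a convolution.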