Recognizing human activities with the use of Convolutional Block Attention Module

IF 5.0 | Zone 3 (Computer Science) | JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
Mohammed Zakariah, Abeer Alnuaim
DOI: 10.1016/j.eij.2024.100536
Journal: Egyptian Informatics Journal
Published: 2024-09-01
Publication type: Journal Article
Source: https://www.sciencedirect.com/science/article/pii/S1110866524000999
Citations: 0

Abstract

Human Activity Recognition (HAR) is crucial to applications in smart environments, communication, IoT, security, and healthcare monitoring. Convolutional neural networks (CNNs) have contributed substantially to HAR, but they often struggle to discern intricate human actions accurately in real-time settings. This study addresses that gap by incorporating the Convolutional Block Attention Module (CBAM) into CNN architectures to improve feature extraction from video sequences. CBAM selectively prioritizes informative spatial and channel-wise features, which improves the detection of subtle activity patterns and makes classification more stable. Its attention mechanism focuses on and amplifies essential features, which sets it apart from typical CNNs that lack such a refined focus mechanism and results in improved performance in behavior identification tests.

The proposed CBAM-enhanced model was tested extensively on benchmark datasets. It achieved an accuracy of 94.23% on the HMDB51 dataset, along with competitive results of 83.4% on UCF-101 and 88.9% on UCF-50. How CBAM adapts to different CNN architectures, and how well it suits varied HAR settings beyond controlled datasets, remains understudied. Future work should investigate integrating CBAM with other CNN frameworks, assess its efficacy in practical scenarios, and explore multi-modal sensor fusion techniques to enhance its reliability and utility. This study demonstrates that CBAM can enhance HAR capabilities and paves the way for future research on activity identification systems with wider and more practical uses.
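
The channel-wise and spatial prioritization described in the abstract can be illustrated with a minimal NumPy sketch of a CBAM block applied to a single feature map. It follows the commonly described CBAM layout (channel attention from a shared two-layer MLP over average- and max-pooled descriptors, followed by spatial attention from a convolution over channel-wise average and max maps). The abstract does not specify the authors' configuration, so the reduction ratio `r`, the 7x7 kernel size, the random weights, and all function names here are assumptions made for illustration only, not the paper's implementation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w1, w2):
    """Per-channel weights in (0, 1) for a feature map x of shape (C, H, W).

    w1: (C // r, C) and w2: (C, C // r) form a shared two-layer MLP
    (hypothetical weights; r is an assumed reduction ratio).
    """
    avg_desc = x.mean(axis=(1, 2))  # global average pool -> (C,)
    max_desc = x.max(axis=(1, 2))   # global max pool     -> (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU hidden layer

    return sigmoid(mlp(avg_desc) + mlp(max_desc))


def spatial_attention(x, kernel):
    """Per-location weights in (0, 1) for x of shape (C, H, W).

    kernel: (2, k, k) filter applied with zero padding ("same" output size)
    to the stacked channel-wise average and max maps.
    """
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)


def cbam(x, w1, w2, kernel):
    """Apply channel attention, then spatial attention, to x of shape (C, H, W)."""
    x = x * channel_attention(x, w1, w2)[:, None, None]
    return x * spatial_attention(x, kernel)[None, :, :]


# Toy usage with random (untrained) weights; all sizes are assumptions.
rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 6, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1
y = cbam(x, w1, w2, kernel)
print(y.shape)
```

Because both attention maps pass through a sigmoid, the block only rescales features: the output keeps the input's shape, and no value grows in magnitude. In a trained network the MLP and kernel weights would be learned so that informative channels and locations receive weights near 1 and the rest are suppressed.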

Source journal
Egyptian Informatics Journal
Subject area: Decision Sciences - Management Science and Operations Research
CiteScore: 11.10
Self-citation rate: 1.90%
Articles published: 59
Review time: 110 days
About the journal: The Egyptian Informatics Journal is published by the Faculty of Computers and Artificial Intelligence, Cairo University. The journal provides a forum for state-of-the-art research and development in computing, including computer sciences, information technologies, information systems, operations research, and decision support. Submissions of innovative, previously unpublished work in the subjects covered by the journal are encouraged, whether from academic, research, or commercial sources.