{"title":"利用卷积块注意力模块识别人类活动","authors":"Mohammed Zakariah , Abeer Alnuaim","doi":"10.1016/j.eij.2024.100536","DOIUrl":null,"url":null,"abstract":"<div><p>Human Activity Recognition (HAR) is crucial for the advancement of applications in smart environments, communication, IoT, security, and healthcare monitoring. Convolutional neural networks (CNNs) have made substantial contributions to human activity recognition (HAR). However, they frequently encounter difficulties in accurately discerning intricate human actions in real-time situations. This study aims to fill a significant research gap by incorporating the Convolutional Block Attention Module (CBAM) into CNN architectures. The goal is to improve the extraction of features from video sequences. The CBAM boosts the performance of the network by selectively prioritizing significant spatial and channel-wise data, resulting in improved detection of subtle activity patterns and increased stability in categorization. CBAM’s attention mechanism directly focuses and amplifies essential characteristics, which sets it apart from typical CNNs that lack a refined focus mechanism. This unique approach results in improved performance in behavior identification tests. The proposed CBAM-enhanced model has been extensively tested on benchmark datasets, yielding an accuracy of 94.23% on the HMDB51 dataset. It also achieved competitive results of 83.4% and 88.9% on the UCF-101 and UCF-50 datasets, respectively. However, there is still a lack of study in comprehending how CBAM adjusts to different CNN architectures and its suitability in varied HAR situations beyond controlled datasets. In future studies, it is imperative for researchers to investigate the integration of CBAM with other CNN frameworks, assess its efficacy in practical scenarios, and explore multi-modal sensor fusion techniques to enhance its reliability and utility. This study showcases the ability of CBAM to enhance HAR capabilities and also paves the way for future research to improve activity identification systems for wider and more practical uses.</p></div>","PeriodicalId":56010,"journal":{"name":"Egyptian Informatics Journal","volume":null,"pages":null},"PeriodicalIF":5.0000,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1110866524000999/pdfft?md5=ecc0aedcf9be8ae7e087777abd06f4e1&pid=1-s2.0-S1110866524000999-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Recognizing human activities with the use of Convolutional Block Attention Module\",\"authors\":\"Mohammed Zakariah , Abeer Alnuaim\",\"doi\":\"10.1016/j.eij.2024.100536\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Human Activity Recognition (HAR) is crucial for the advancement of applications in smart environments, communication, IoT, security, and healthcare monitoring. Convolutional neural networks (CNNs) have made substantial contributions to human activity recognition (HAR). However, they frequently encounter difficulties in accurately discerning intricate human actions in real-time situations. This study aims to fill a significant research gap by incorporating the Convolutional Block Attention Module (CBAM) into CNN architectures. The goal is to improve the extraction of features from video sequences. The CBAM boosts the performance of the network by selectively prioritizing significant spatial and channel-wise data, resulting in improved detection of subtle activity patterns and increased stability in categorization. CBAM’s attention mechanism directly focuses and amplifies essential characteristics, which sets it apart from typical CNNs that lack a refined focus mechanism. This unique approach results in improved performance in behavior identification tests. The proposed CBAM-enhanced model has been extensively tested on benchmark datasets, yielding an accuracy of 94.23% on the HMDB51 dataset. It also achieved competitive results of 83.4% and 88.9% on the UCF-101 and UCF-50 datasets, respectively. However, there is still a lack of study in comprehending how CBAM adjusts to different CNN architectures and its suitability in varied HAR situations beyond controlled datasets. In future studies, it is imperative for researchers to investigate the integration of CBAM with other CNN frameworks, assess its efficacy in practical scenarios, and explore multi-modal sensor fusion techniques to enhance its reliability and utility. This study showcases the ability of CBAM to enhance HAR capabilities and also paves the way for future research to improve activity identification systems for wider and more practical uses.</p></div>\",\"PeriodicalId\":56010,\"journal\":{\"name\":\"Egyptian Informatics Journal\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":5.0000,\"publicationDate\":\"2024-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S1110866524000999/pdfft?md5=ecc0aedcf9be8ae7e087777abd06f4e1&pid=1-s2.0-S1110866524000999-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Egyptian Informatics Journal\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1110866524000999\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Egyptian Informatics Journal","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1110866524000999","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Recognizing human activities with the use of Convolutional Block Attention Module
Human Activity Recognition (HAR) is crucial for the advancement of applications in smart environments, communication, IoT, security, and healthcare monitoring. Convolutional neural networks (CNNs) have made substantial contributions to human activity recognition (HAR). However, they frequently encounter difficulties in accurately discerning intricate human actions in real-time situations. This study aims to fill a significant research gap by incorporating the Convolutional Block Attention Module (CBAM) into CNN architectures. The goal is to improve the extraction of features from video sequences. The CBAM boosts the performance of the network by selectively prioritizing significant spatial and channel-wise data, resulting in improved detection of subtle activity patterns and increased stability in categorization. CBAM’s attention mechanism directly focuses and amplifies essential characteristics, which sets it apart from typical CNNs that lack a refined focus mechanism. This unique approach results in improved performance in behavior identification tests. The proposed CBAM-enhanced model has been extensively tested on benchmark datasets, yielding an accuracy of 94.23% on the HMDB51 dataset. It also achieved competitive results of 83.4% and 88.9% on the UCF-101 and UCF-50 datasets, respectively. However, there is still a lack of study in comprehending how CBAM adjusts to different CNN architectures and its suitability in varied HAR situations beyond controlled datasets. In future studies, it is imperative for researchers to investigate the integration of CBAM with other CNN frameworks, assess its efficacy in practical scenarios, and explore multi-modal sensor fusion techniques to enhance its reliability and utility. This study showcases the ability of CBAM to enhance HAR capabilities and also paves the way for future research to improve activity identification systems for wider and more practical uses.
期刊介绍:
The Egyptian Informatics Journal is published by the Faculty of Computers and Artificial Intelligence, Cairo University. This Journal provides a forum for the state-of-the-art research and development in the fields of computing, including computer sciences, information technologies, information systems, operations research and decision support. Innovative and not-previously-published work in subjects covered by the Journal is encouraged to be submitted, whether from academic, research or commercial sources.