A swin transformer-driven framework for gesture recognition to assist hearing impaired people by integrating deep learning with secretary bird optimization algorithm
Authors: Mohammed S. Assiri, Mahmoud M. Selim
Journal: Ain Shams Engineering Journal, Volume 16, Issue 6, Article 103383 (published 2025-03-31)
DOI: 10.1016/j.asej.2025.103383
URL: https://www.sciencedirect.com/science/article/pii/S2090447925001248
Impact factor: 6.0; JCR: Q1 (Engineering, Multidisciplinary)
Citations: 0
Abstract
Hand gestures (HG) are a key communication technique for hearing-impaired people, yet millions of individuals worldwide face difficulties when communicating with those who do not have hearing impairments. The importance of technology in improving accessibility, and thus raising the standard of living for persons with hearing impairments, is widely recognized. Machine learning (ML) is a branch of artificial intelligence (AI) that focuses on developing methods that learn from data. The major problem in HG recognition is that machines do not understand human language directly, so human–machine interaction requires a communication medium that both machines and humans can interpret, in order to assist hearing-impaired and ageing individuals. HG recognition is therefore needed as a communication medium for giving instructions to a computer. This paper proposes the Swin Transformer-Driven Framework for Gesture Recognition by Integrating Deep Learning with the Secretary Bird Optimization (STFGR-IDLSBO) methodology. Its main aim is to develop an efficient and robust gesture recognition system to assist hearing-impaired persons. First, the STFGR-IDLSBO method applies adaptive bilateral filtering (ABF) in the image pre-processing stage to reduce noise while preserving the edges of the gestures in the captured images. Next, the Swin transformer (ST) serves as a feature extractor that captures multiscale representations and spatial hierarchies from gesture images. A hybrid model combining a convolutional neural network with bi-directional long short-term memory (CNN-BiLSTM) is then employed for gesture classification. Finally, the secretary bird optimizer algorithm (SBOA) is used to tune the hyperparameters of the CNN-BiLSTM classifier. To validate the performance of the STFGR-IDLSBO methodology, a wide-ranging simulation study is performed on the Traffic Police Gesture dataset. The STFGR-IDLSBO technique achieved an accuracy of 99.25%, which the authors report as superior to existing methods.
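The abstract's pre-processing stage relies on bilateral filtering to reduce noise while preserving gesture edges. The abstract does not specify how the adaptive variant (ABF) adjusts its parameters, so the following is only a minimal sketch of a plain, non-adaptive bilateral filter in pure Python. The function name, parameter names, default values, and the toy image are all illustrative assumptions and are not taken from the paper.

```python
import math


def bilateral_filter(image, radius=1, sigma_space=1.0, sigma_range=25.0):
    """Plain (non-adaptive) bilateral filter for a 2-D grayscale image.

    `image` is a list of equal-length rows of numbers. Parameter names and
    defaults are illustrative assumptions, not values from the paper.
    """
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = image[y][x]
            weighted_sum = 0.0
            weight_total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < height and 0 <= nx < width):
                        continue  # skip neighbours outside the image
                    neighbour = image[ny][nx]
                    # Spatial weight: nearby pixels count more.
                    w_space = math.exp(
                        -(dx * dx + dy * dy) / (2 * sigma_space ** 2)
                    )
                    # Range weight: pixels with similar intensity count more,
                    # which is what keeps edges from being blurred.
                    w_range = math.exp(
                        -((neighbour - center) ** 2) / (2 * sigma_range ** 2)
                    )
                    weight = w_space * w_range
                    weighted_sum += weight * neighbour
                    weight_total += weight
            # weight_total is never zero: the centre pixel has weight 1.
            output[y][x] = weighted_sum / weight_total
    return output


# Toy image (hypothetical data): a dark region next to a bright region,
# with one noisy pixel (30) inside the dark region.
toy = [
    [10, 10, 10, 200, 200, 200],
    [10, 30, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
]
smoothed = bilateral_filter(toy, radius=1, sigma_space=1.0, sigma_range=25.0)
```

In this toy case the noisy pixel is pulled toward its dark neighbours, while pixels on either side of the dark/bright boundary stay close to their original values, because the range weight between the two regions is effectively zero. An adaptive scheme would presumably vary parameters such as `sigma_range` per pixel or per image, but that detail is not given in the abstract.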
About the journal:
Ain Shams Engineering Journal is an international journal devoted to the publication of peer-reviewed, original, high-quality research papers and review papers on both traditional topics and those of emerging science and technology. Areas of theoretical and fundamental interest are welcome, as are those concerning industrial applications, emerging instrumental techniques, and those with practical application to an aspect of human endeavor, such as the preservation of the environment, health, and waste disposal. The overall focus is on original and rigorous scientific research results which have generic significance.
Ain Shams Engineering Journal focuses on aspects of mechanical engineering, electrical engineering, civil engineering, chemical engineering, petroleum engineering, environmental engineering, and architectural and urban planning engineering. Papers that integrate knowledge from other disciplines with engineering are especially welcome, for example nanotechnology, material sciences, and computational methods, as well as applied basic sciences: engineering mathematics, physics, and chemistry.