Attention-enhanced hybrid CNN–LSTM network with self-adaptive CBAM for COVID-19 diagnosis

Fatin Nabilah Shaari, Aimi Salihah Abdul Nasir, Wan Azani Mustafa, Wan Aireene Wan Ahmad, Abdul Syafiq Abdull Sukor

Array, Volume 26, Article 100424 (published 2025-06-05). DOI: 10.1016/j.array.2025.100424
Abstract
Accurate identification of COVID-19 still presents difficulties due to the limitations of RT-PCR testing, such as reduced sensitivity and restricted availability. Chest X-Ray (CXR) imaging, combined with deep learning models, offers a non-invasive alternative. However, baseline Convolutional Neural Networks (CNNs) commonly struggle to fully capture the temporal dependencies present in sequential medical imaging data, which limits their diagnostic performance. To address this, we propose Dual-Attention CNN-LSTM, a hybrid deep learning model designed to enhance COVID-19 detection from CXR images. The model combines the spatial feature extraction capabilities of the CNN with the sequential learning strengths of Long Short-Term Memory (LSTM), further enhanced by the proposed Self-Adaptive Convolutional Block Attention Module (SA-CBAM) and Multi-Head Attention (MHA). SA-CBAM enables the CNN to focus selectively on critical lung abnormalities, while MHA allows the LSTM to capture temporal dependencies and dynamic variations in imaging sequences. By fusing these attention-optimized features, Dual-Attention CNN-LSTM delivers highly robust CXR classification. Additionally, this study introduces five pre-trained-LSTM models that leverage transfer learning to enhance CXR pattern recognition and serve as comparative models for the proposed Dual-Attention CNN-LSTM. Our comprehensive evaluation against multiple baseline models on three-class classification (normal, pneumonia, COVID-19) demonstrates that Dual-Attention CNN-LSTM surpasses state-of-the-art performance, achieving a weighted accuracy of 99.97 %, with precision, recall, specificity, F1-score, and MCC all exceeding 99.95 %. These findings highlight the potential of our approach as a tool for accurate and early disease diagnosis, ultimately improving clinical decision-making and patient outcomes.
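The abstract does not give implementation details, but the architecture it describes (a CNN branch refined by SA-CBAM, an LSTM branch refined by multi-head attention, and a fusion of the two feature streams for three-class prediction) can be sketched roughly as follows. This is a minimal, illustrative PyTorch sketch, not the authors' published configuration: the backbone layers, channel sizes, the learnable gate standing in for the "self-adaptive" part of SA-CBAM, and the row-wise conversion of a single CXR feature map into an LSTM input sequence are all assumptions.

# Minimal PyTorch sketch of the Dual-Attention CNN-LSTM idea described above.
# All layer sizes and the sequence construction are illustrative assumptions.
import torch
import torch.nn as nn


class SACBAM(nn.Module):
    """Standard CBAM (channel + spatial attention) with a hypothetical learnable
    gate standing in for the paper's 'self-adaptive' mechanism."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: shared MLP over average- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 7x7 conv over channel-wise avg/max maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)
        # Assumed self-adaptive gate: learnable blend of attended and raw features.
        self.gate = nn.Parameter(torch.tensor(0.5))

    def forward(self, x):
        b, c, _, _ = x.shape
        ca = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) + self.mlp(x.amax(dim=(2, 3))))
        x_ca = x * ca.view(b, c, 1, 1)
        sa = torch.sigmoid(self.spatial(torch.cat(
            [x_ca.mean(dim=1, keepdim=True), x_ca.amax(dim=1, keepdim=True)], dim=1)))
        g = torch.sigmoid(self.gate)
        return g * (x_ca * sa) + (1 - g) * x  # adaptive mix of attended and original features


class DualAttentionCNNLSTM(nn.Module):
    """CNN + SA-CBAM for spatial features, LSTM + multi-head attention for
    sequential features, fused for 3-class CXR classification."""

    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.sa_cbam = SACBAM(64)
        self.pool = nn.AdaptiveAvgPool2d((8, 8))
        # Treat the 8 pooled rows as a sequence of 8 steps with 64*8 features each.
        self.lstm = nn.LSTM(input_size=64 * 8, hidden_size=128,
                            batch_first=True, bidirectional=True)
        self.mha = nn.MultiheadAttention(embed_dim=256, num_heads=4, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(64 * 8 * 8 + 256, 128), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):                          # x: (B, 1, H, W) grayscale CXR
        f = self.pool(self.sa_cbam(self.cnn(x)))   # (B, 64, 8, 8)
        spatial_feat = f.flatten(1)                # (B, 64*8*8)
        seq = f.permute(0, 2, 1, 3).flatten(2)     # (B, 8, 64*8) row-wise sequence
        h, _ = self.lstm(seq)                      # (B, 8, 256)
        attn_out, _ = self.mha(h, h, h)            # temporal self-attention over steps
        temporal_feat = attn_out.mean(dim=1)       # (B, 256)
        fused = torch.cat([spatial_feat, temporal_feat], dim=1)
        return self.classifier(fused)


# Quick shape check on a dummy batch of two 128x128 CXR images.
if __name__ == "__main__":
    model = DualAttentionCNNLSTM()
    logits = model(torch.randn(2, 1, 128, 128))
    print(logits.shape)  # torch.Size([2, 3])

In practice the CNN branch would likely be a deeper or pre-trained backbone and the sequence fed to the LSTM would follow the authors' actual design; the sketch only illustrates how the two attention mechanisms attach to the spatial and temporal branches before fusion.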