{"title":"Semi-supervised learning and integration of multi-sequence MR-images for carotid vessel wall and plaque segmentation","authors":"Marie-Christine Pali , Christina Schwaiger , Malik Galijasevic , Valentin K. Ladenhauf , Stephanie Mangesius , Elke R. Gizewski","doi":"10.1016/j.cmpbup.2025.100230","DOIUrl":"10.1016/j.cmpbup.2025.100230","url":null,"abstract":"<div><div>The analysis of carotid arteries, particularly plaques, in multi-sequence Magnetic Resonance Imaging (MRI) data is crucial for assessing the risk of atherosclerosis and ischemic stroke. Accurate segmentation is essential for evaluating metrics and radiomic features that quantify the state of atherosclerosis. However, the complex morphology of plaques and the scarcity of labeled data pose significant challenges. In this work, we address these problems and propose a semi-supervised deep learning-based approach designed to effectively integrate multi-sequence MRI data for the segmentation of the carotid artery vessel wall and plaque. The proposed algorithm consists of two networks: a coarse localization model that identifies the region of interest, guided by prior knowledge of the position and number of the carotid arteries, followed by a fine segmentation model for precise delineation of vessel walls and plaques. To effectively integrate complementary information across different MRI sequences, we investigate different fusion strategies and introduce a multi-level multi-sequence version of the U-Net architecture. To address the challenges of limited labeled data and the complexity of carotid artery MRI, we propose a semi-supervised approach that enforces consistency under various input transformations. Our approach is evaluated on 52 patients with arteriosclerosis, each with five MRI sequences. Comprehensive experiments demonstrate the effectiveness of our approach and emphasize the role of fusion point selection in U-Net-based architectures. 
To validate the accuracy of our results, we also include an expert-based assessment of model performance. Our findings highlight the potential of fusion strategies and semi-supervised learning for improving carotid artery segmentation in data-limited MRI applications.</div></div>","PeriodicalId":72670,"journal":{"name":"Computer methods and programs in biomedicine update","volume":"9 ","pages":"Article 100230"},"PeriodicalIF":0.0,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145940016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
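The consistency-under-transformation idea in the abstract above can be sketched minimally: an unlabeled image is flipped, and the model is penalized when its prediction on the flipped input disagrees with the flipped prediction on the original. The `model` callable and the flip transform below are illustrative placeholders, not the paper's architecture or training code.

```python
import numpy as np

def consistency_loss(model, x):
    """Mean-squared disagreement between predicting on a horizontally
    flipped image and flipping the prediction of the original image.
    Minimizing this on unlabeled data enforces transformation
    consistency (horizontal flip is one of several possible transforms)."""
    p = model(x)                 # prediction on the original image
    p_flip = model(x[:, ::-1])   # prediction on the flipped image
    return float(np.mean((p_flip - p[:, ::-1]) ** 2))

# A perfectly flip-equivariant "model" (here the identity) incurs no penalty:
x = np.random.rand(8, 8)
print(consistency_loss(lambda a: a, x))  # 0.0
```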
{"title":"Reassessment of pelvic radiographic measurements for delivery prediction using machine learning","authors":"Ayano Suemori , Jota Maki , Hikaru Ooba , Hikari Nakato , Keiichi Oishi , Tomohiro Mitoma , Sakurako Mishima , Akiko Ohira , Satoe Kirino , Eriko Eto , Hisashi Masuyama","doi":"10.1016/j.cmpbup.2026.100231","DOIUrl":"10.1016/j.cmpbup.2026.100231","url":null,"abstract":"<div><h3>Background and Objective</h3><div>Pelvimetry has historically shown limitations in diagnosing cephalopelvic disproportion, yet recent evidence suggests potential predictive value. This study uses artificial intelligence to reassess pelvimetry's utility in predicting cesarean section.</div></div><div><h3>Methods</h3><div>This single-center, retrospective case-control study included pregnant women between 37 weeks 0 days and 41 weeks 6 days of gestation who underwent pelvic radiography for suspected cephalopelvic disproportion from January 2015 to August 2023. Pelvic radiographic measurements were obtained using the Guthmann-Sussmann method. Maternal characteristics, ultrasound examination data, and pelvimetric measurements were extracted from electronic medical records as potential predictors of delivery outcomes. The input data were analyzed using four machine learning models: Light Gradient Boosting Machine, Random Forest, Extreme Gradient Boosting, and Category Boosting. The primary outcome was the hierarchical importance of pelvic measurements in the predictive models.</div></div><div><h3>Results</h3><div>Analysis included 355 participants. The strongest predictors were the differences between (1) the obstetric conjugate and biparietal diameter and (2) the interspinous diameter and biparietal diameter. 
The area under the receiver operating characteristic curve was 0.74 for Light Gradient Boosting Machine, 0.85 for Random Forest, 0.83 for Extreme Gradient Boosting, and 0.82 for Category Boosting.</div></div><div><h3>Conclusions</h3><div>We developed high-performance machine learning models demonstrating that pelvimetric measurements, particularly the differences between the obstetric conjugate and biparietal diameter and between the interspinous diameter and biparietal diameter, combined with maternal and ultrasound factors, are strong predictors of cesarean section. The model’s ability to capture nonlinear associations may enhance predictive accuracy, and reassessing pelvimetric values could support delivery planning in clinical settings.</div></div>","PeriodicalId":72670,"journal":{"name":"Computer methods and programs in biomedicine update","volume":"9 ","pages":"Article 100231"},"PeriodicalIF":0.0,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146037809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
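The figures reported for the four models are areas under the ROC curve. The AUC has a simple rank interpretation that can be computed directly, as in this generic sketch (independent of the study's models and data):

```python
import numpy as np

def roc_auc(y_true, scores):
    """AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive is scored above a randomly chosen
    negative, with ties counting one half."""
    y_true, scores = np.asarray(y_true), np.asarray(scores)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    wins = (pos[:, None] > neg[None, :]).sum()   # positive outranks negative
    ties = (pos[:, None] == neg[None, :]).sum()  # equal scores count half
    return (wins + 0.5 * ties) / (pos.size * neg.size)

print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```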
{"title":"Predictive analysis of clinical features for HPV status in oropharynx squamous cell carcinoma: A machine learning approach with explainability","authors":"Emily Diaz Badilla , Ignasi Cos , Claudio Sampieri , Berta Alegre , Isabel Vilaseca , Simone Balocco , Petia Radeva","doi":"10.1016/j.cmpbup.2024.100170","DOIUrl":"10.1016/j.cmpbup.2024.100170","url":null,"abstract":"<div><h3>Background and Objective:</h3><div>Oropharynx Squamous Cell Carcinoma (OPSCC) linked to Human Papillomavirus (HPV) exhibits a more favorable prognosis than other squamous cell carcinomas of the upper aerodigestive tract. Finding reliable non-invasive methods to detect this prognostic entity is key to proposing appropriate therapeutic decisions. This study aims to provide a comprehensive method based on pre-treatment clinical data for predicting the patient’s HPV status over a large OPSCC patient cohort and employing explainability techniques to interpret the significance and effects of the features.</div></div><div><h3>Materials and Methods:</h3><div>We used clinical information from the RADCURE dataset to train six Machine Learning algorithms, evaluating them via cross-validation for grid-search hyper-parameter tuning and feature selection, with a final performance measurement on a held-out 20% test set. For explainability, SHAP and LIME were used to identify the most relevant relationships and their effect on the predictive model. Furthermore, additional publicly available datasets were scrutinized to compare outcomes and assess the method’s generalization across diverse feature sets and populations.</div></div><div><h3>Results:</h3><div>The best model yielded an AUC of 0.85, a sensitivity of 0.83, and a specificity of 0.75 over the testing set. The explainability analysis highlighted the remarkable significance of specific clinical attributes, in particular the oropharynx subsite tumor location and the patient’s smoking history. 
The contribution of each variable to the prediction was substantiated by constructing 95% confidence intervals of the model coefficients via a 10,000-sample bootstrap and by analyzing top contributors across the best-performing models.</div></div><div><h3>Conclusions:</h3><div>The specific clinical factors typically collected for OPSCC patients, such as smoking habits and the tumor oropharynx sub-location, combined with the ML models presented here, can provide an informed analysis of HPV status, and data science techniques can properly explain it. Future work should focus on adding other data modalities such as CT scans to enhance performance and to uncover new relations, thus aiding medical practitioners in diagnosing OPSCC more accurately.</div></div>","PeriodicalId":72670,"journal":{"name":"Computer methods and programs in biomedicine update","volume":"7 ","pages":"Article 100170"},"PeriodicalIF":0.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143180353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
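A percentile-bootstrap confidence interval of the kind used to substantiate feature contributions can be sketched as follows. The univariate least-squares slope stands in for a model coefficient, and the data are synthetic; the study's actual bootstrap is over its trained models' coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_slope_ci(x, y, n_boot=10_000, alpha=0.05):
    """Percentile bootstrap: refit the slope on resampled (x, y) pairs
    and take the alpha/2 and 1 - alpha/2 quantiles as the CI."""
    n = len(x)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)              # resample with replacement
        slopes[b] = np.polyfit(x[idx], y[idx], 1)[0]
    return np.quantile(slopes, [alpha / 2, 1 - alpha / 2])

# Synthetic data with true slope 2.0; the 95% CI should bracket it.
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + rng.normal(0.0, 0.1, 50)
lo, hi = bootstrap_slope_ci(x, y, n_boot=2000)
```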
{"title":"Independence on the lead of the identification of the ventricular depolarization in the electrocardiogram in wearable devices","authors":"Noemi Giordano, Silvia Cannone, Gabriella Balestra, Marco Knaflitz","doi":"10.1016/j.cmpbup.2025.100196","DOIUrl":"10.1016/j.cmpbup.2025.100196","url":null,"abstract":"<div><h3>Goal</h3><div>The home monitoring of cardiac time intervals reduces hospitalization and mortality of cardiovascular patients. However, a reliable time reference in the electrocardiogram is necessary, and the use of different single leads, typical of wearable devices, impacts the repeatability of the time reference and thus the accuracy of time-dependent parameters. This work proposes a simple approach to detect the peak and onset of the ventricular depolarization, and demonstrates its lead independence, which makes it suitable for wearable devices even with non-standard leads.</div></div><div><h3>Methods</h3><div>Our method is grounded in an energy-based approach, which we applied to a) a publicly available dataset with standard 12-lead recordings; b) a proof-of-concept dataset including a custom precordial non-standard lead implemented on a wearable device.</div></div><div><h3>Results</h3><div>Compared against the Pan-Tompkins algorithm, our method reduced the absolute error between each lead and the first standard lead by 26 % to 64 % for the peak, and by 70 % to 82 % for the onset detection. The achieved consistency across leads is compatible with clinical monitoring. The computational time was also reduced by 65 % to 96 %, making the algorithm suitable for use on microcontroller-based wearable devices.</div></div><div><h3>Conclusions</h3><div>The proposed method enables the identification of a stable reference of the ventricular depolarization regardless of the choice of the lead. 
These results pave the way for implementation on wearable devices for chronic disease monitoring.</div></div>","PeriodicalId":72670,"journal":{"name":"Computer methods and programs in biomedicine update","volume":"8 ","pages":"Article 100196"},"PeriodicalIF":0.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144230049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
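The paper's algorithm is not reproduced here, but the core of an energy-based fiducial detector can be sketched as locating the maximum of a short-time energy signal; the window width and the synthetic beat below are illustrative assumptions.

```python
import numpy as np

def detect_depolarization_peak(ecg, fs, win_ms=80):
    """Return the sample index of maximum short-time energy: the
    squared signal is smoothed with a rectangular window roughly the
    width of a QRS complex, making the detector robust to polarity
    and morphology changes across leads."""
    win = max(1, int(fs * win_ms / 1000))
    energy = np.convolve(ecg ** 2, np.ones(win), mode="same")
    return int(np.argmax(energy))

# Synthetic beat: a narrow Gaussian spike at sample 500 dominates the energy.
fs = 500
t = np.arange(1000)
ecg = np.exp(-((t - 500) ** 2) / (2.0 * 5.0 ** 2))
peak = detect_depolarization_peak(ecg, fs)
```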
{"title":"Multiscale guided attention network for optic disc segmentation of retinal images","authors":"A Z M Ehtesham Chowdhury , Andrew Mehnert , Graham Mann , William H. Morgan , Ferdous Sohel","doi":"10.1016/j.cmpbup.2025.100180","DOIUrl":"10.1016/j.cmpbup.2025.100180","url":null,"abstract":"<div><div>Optic disc (OD) segmentation from retinal images is crucial for diagnosing, assessing, and tracking the progression of several sight-threatening diseases. This paper presents a deep machine-learning method for semantically segmenting the OD from retinal images. The method is named multiscale guided attention network (MSGANet-OD), comprising encoders for extracting multiscale features and decoders for constructing segmentation maps from the extracted features. The decoder also includes a guided attention module that incorporates features related to structural, contextual, and illumination information to segment the OD. A custom loss function is proposed to enforce the optic disc's elliptical shape constraint and to alleviate the influence of blood vessels in the region where they overlap the OD. MSGANet-OD was trained and tested on an in-house clinical color retinal image dataset captured during ophthalmodynamometry as well as on several publicly available color fundus image datasets, e.g., DRISHTI-GS, RIM-ONE-r3, and REFUGE1. Experimental results show that MSGANet-OD achieved superior OD segmentation performance from ophthalmodynamometry images compared to widely used segmentation methods. Our method also achieved competitive results compared to state-of-the-art OD segmentation methods on public datasets. 
The proposed method can be used in automated systems to quantitatively assess optic nerve head abnormalities (e.g., glaucoma, optic disc neuropathy) and vascular changes in the OD region.</div></div>","PeriodicalId":72670,"journal":{"name":"Computer methods and programs in biomedicine update","volume":"7 ","pages":"Article 100180"},"PeriodicalIF":0.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143179430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
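A generic attention-gating operation of the kind underlying guided attention modules can be sketched as follows. The sigmoid gate, the scalar weight and bias, and the synthetic guidance maps are illustrative assumptions, not the MSGANet-OD module itself.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(features, guide, w=1.0, b=0.0):
    """Modulate feature maps with a per-pixel sigmoid gate computed
    from a guidance map, suppressing responses outside the region the
    guidance highlights (w and b would normally be learned)."""
    gate = sigmoid(w * guide + b)   # gate values lie in (0, 1)
    return gate * features

features = np.ones((4, 4))
strong_guide = np.full((4, 4), 10.0)   # confident guidance -> gate ~ 1, features pass
weak_guide = np.full((4, 4), -10.0)    # absent guidance   -> gate ~ 0, features suppressed
```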
{"title":"Robust lung segmentation in Chest X-ray images using modified U-Net with deeper network and residual blocks","authors":"Wiley Tam , Paul Babyn , Javad Alirezaie","doi":"10.1016/j.cmpbup.2025.100211","DOIUrl":"10.1016/j.cmpbup.2025.100211","url":null,"abstract":"<div><div>Lung diseases remain a leading cause of mortality worldwide, as evidenced by statistics from the World Health Organization (WHO). The limited availability of radiologists to interpret Chest X-ray (CXR) images for diagnosing common lung conditions poses a significant challenge, often resulting in delayed diagnosis and treatment. In response, Computer-Aided Diagnostic (CAD) tools can help streamline and expedite the diagnostic process. Recently, deep learning techniques have gained prominence in the automated analysis of CXR images, particularly in segmenting lung regions as a critical preliminary step. This study aims to develop and evaluate a lung segmentation model based on a modified U-Net architecture. The architecture leverages techniques such as transfer learning with DenseNet201 as a feature extractor alongside dilated convolutions and residual blocks. An ablation study was conducted to evaluate these architectural components, along with additional elements like augmented data, alternative backbones, and attention mechanisms. Extensive experiments were performed on two publicly available datasets, the Montgomery County (MC) and Shenzhen Hospital (SH) datasets, to validate the efficacy of these techniques on segmentation performance. Outperforming other state-of-the-art methods on the MC dataset, the proposed model achieved a Jaccard Index (IoU) of 97.77% and a Dice Similarity Coefficient (DSC) of 98.87%. These results represent a significant improvement over the baseline U-Net, with gains of 3.37% and 1.75% in IoU and DSC, respectively. 
These findings highlight the importance of architectural enhancements in deep learning-based lung segmentation models, contributing to more efficient, accurate, and reliable CAD systems for lung disease assessment.</div></div>","PeriodicalId":72670,"journal":{"name":"Computer methods and programs in biomedicine update","volume":"8 ","pages":"Article 100211"},"PeriodicalIF":0.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144780982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
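Two of the architectural components evaluated in the ablation, dilated convolutions and residual (shortcut) connections, can be sketched in one dimension. The kernels and signals below are toy examples, not the model's actual layers.

```python
import numpy as np

def dilated_conv1d(x, w, dilation=1):
    """'Same'-padded 1-D convolution with gaps of `dilation` samples
    between kernel taps, enlarging the receptive field at no extra
    parameter cost."""
    k = len(w)
    pad = dilation * (k // 2)
    xp = np.pad(x, pad)
    return np.array([sum(w[j] * xp[i + j * dilation] for j in range(k))
                     for i in range(len(x))])

def residual_block(x, w, dilation=2):
    """Residual connection: the input is added back onto the
    convolution output, easing gradient flow in deep networks
    (nonlinearity and normalization omitted for brevity)."""
    return x + dilated_conv1d(x, w, dilation)

x = np.ones(8)
identity_kernel = np.array([0.0, 1.0, 0.0])   # convolution returns x unchanged
print(residual_block(x, identity_kernel))     # every element doubles to 2.0
```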
{"title":"Acoustic cues for person identification using cough sounds","authors":"Van-Thuan Tran, Ting-Hao You, Wei-Ho Tsai","doi":"10.1016/j.cmpbup.2025.100195","DOIUrl":"10.1016/j.cmpbup.2025.100195","url":null,"abstract":"<div><h3>Objectives</h3><div>This study presents an improved approach to person identification (PID) using nonverbal vocalizations, focusing specifically on cough sounds as a biometric modality. While recent works have demonstrated the feasibility of cough-based PID (CPID), most report accuracies around 80–90 % and could face limitations in terms of model efficiency, generalization, or robustness. Our objective is to advance CPID performance through compact model design and more effective training strategies.</div></div><div><h3>Methods</h3><div>We collected a custom dataset from 19 subjects and developed a lightweight yet effective deep learning framework for CPID. The proposed architecture, CoughCueNet, is a convolutional recurrent neural network designed to capture both spatial and temporal patterns in cough sounds. The training process incorporates a hybrid loss function that combines supervised contrastive (SC) learning and cross-entropy (CE) loss to enhance feature discrimination. We systematically evaluated multiple acoustic representations, including MFCCs and spectrograms, to identify the most suitable features. We also applied data augmentation for robustness and investigated cross-modal transferability by testing speech-trained models on cough data.</div></div><div><h3>Results</h3><div>Our CPID system achieved a mean identification accuracy of 97.18 %. Training the proposed CoughCueNet using a hybrid SC+CE loss function consistently improved model generalization and robustness. It outperformed the same network and larger-capacity networks (i.e., VGG16 and ResNet50) trained with CE loss alone, which achieved accuracies around 90 %. Among the tested features, MFCCs yielded superior identification performance over spectrograms. 
Experiments with speech-trained models tested on coughs revealed limited cross-vocal transferability, emphasizing the need for cough-specific models.</div></div><div><h3>Conclusion</h3><div>This work advances the state of cough-based PID by demonstrating that high-accuracy identification is achievable using compact models and hybrid training strategies. It establishes cough sounds as a practical and distinctive biometric modality, with promising applications in security, user authentication, and health monitoring, particularly in environments where speech-based systems are less reliable or infeasible.</div></div>","PeriodicalId":72670,"journal":{"name":"Computer methods and programs in biomedicine update","volume":"8 ","pages":"Article 100195"},"PeriodicalIF":0.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144230050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
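The supervised contrastive term of the hybrid SC+CE objective can be sketched in numpy following the standard Khosla et al.-style formulation. The embeddings and labels below are synthetic, and CoughCueNet's exact loss may differ in detail.

```python
import numpy as np

def supcon_loss(feats, labels, tau=0.1):
    """Supervised contrastive loss on L2-normalised embeddings: for
    each anchor, same-class samples are positives to pull close and
    all other samples form the contrast set to push apart."""
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = f @ f.T / tau                        # temperature-scaled similarities
    n = len(labels)
    eye = np.eye(n, dtype=bool)
    loss = 0.0
    for i in range(n):
        pos = (labels == labels[i]) & ~eye[i]  # same label, excluding self
        if not pos.any():
            continue                           # singleton class: no positives
        denom = np.exp(sim[i][~eye[i]]).sum()  # contrast over all non-self
        loss += -np.mean(np.log(np.exp(sim[i][pos]) / denom))
    return loss / n

feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = np.array([0, 0, 1, 1])     # labels match the embedding clusters
shuffled = np.array([0, 1, 0, 1])   # labels cut across the clusters
```

Class-consistent labelings score a lower loss than shuffled ones, which is exactly the pressure that sharpens feature discrimination during training.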
{"title":"Enhancing stroke prediction models: A mixing of data augmentation and transfer learning for small-scale dataset in machine learning","authors":"Imam Tahyudin , Ade Nurhopipah , Ades Tikaningsih , Puji Lestari , Yaya Suryana , Edi Winarko , Eko Winarto , Nazwan Haza , Hidetaka Nambo","doi":"10.1016/j.cmpbup.2025.100198","DOIUrl":"10.1016/j.cmpbup.2025.100198","url":null,"abstract":"<div><div>Machine learning is a powerful technique for analysing datasets and making data-driven recommendations. However, the performance of machine learning in recognising patterns generally scales with the size of the dataset. In some domains, such as medicine, collecting each additional data instance requires substantial effort and budget. Therefore, additional data acquisition techniques are needed to increase data size and improve model quality.</div><div>This study applied Data Augmentation and Transfer Learning to solve small-scale dataset problems in analyzing stroke patient information from the Banyumas Regional General Hospital (RSUD Banyumas). This information is used to predict the patient's status at hospital discharge. The research compared the prediction accuracy of three solutions: Data Augmentation, Transfer Learning, and the mixing of both methods. The classification models employed in this study were four algorithms: Random Forest, Support Vector Machine, Gradient Boosting, and Extreme Gradient Boosting. We implemented the Synthetic Minority Over-sampling Technique for Nominal and Continuous (SMOTE-NC) to generate the artificial dataset. In the Transfer Learning process, we used a benchmark stroke dataset with a different target from ours, so we labelled it based on the nearest neighbours of the original dataset. Applying Data Augmentation proved effective, leading to better performance than using only the original dataset. 
However, the Transfer Learning technique alone did not give satisfying results for XGBoost and SVM. Mixing Data Augmentation and Transfer Learning provided the best performance: the Random Forest model achieved accuracy and recall of 0.813, precision of 0.853497, and an F1-score of 0.826628. This research can contribute significantly to developing better classification models, so that physicians can obtain more accurate information and treat stroke cases more effectively and efficiently.</div></div>","PeriodicalId":72670,"journal":{"name":"Computer methods and programs in biomedicine update","volume":"8 ","pages":"Article 100198"},"PeriodicalIF":0.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144500858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
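The numeric half of SMOTE-style augmentation is linear interpolation between minority-class samples; a minimal sketch follows. This simplification picks a random partner rather than one of the k nearest neighbours, and SMOTE-NC's handling of nominal features (neighbour majority vote) is omitted.

```python
import numpy as np

rng = np.random.default_rng(42)

def smote_numeric(X, n_new):
    """Create synthetic samples on the line segment between two
    randomly chosen minority-class samples (numeric features only;
    real SMOTE restricts the partner to the k nearest neighbours)."""
    samples = []
    for _ in range(n_new):
        i, j = rng.choice(len(X), size=2, replace=False)
        lam = rng.random()                          # position along the segment
        samples.append(X[i] + lam * (X[j] - X[i]))  # convex combination
    return np.array(samples)

minority = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
synthetic = smote_numeric(minority, 5)
```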
{"title":"Resectograms: Planning liver surgery with real-time occlusion-free visualization of virtual resections","authors":"Ruoyan Meng , Davit Aghayan , Egidijus Pelanis , Bjørn Edwin , Faouzi Alaya Cheikh , Rafael Palomar","doi":"10.1016/j.cmpbup.2025.100186","DOIUrl":"10.1016/j.cmpbup.2025.100186","url":null,"abstract":"<div><h3>Background and Objective:</h3><div>Visualization of virtual resections plays a central role in computer-assisted liver surgery planning. However, the intricate liver anatomical information often results in occlusions and visualization information clutter, which can lead to inaccuracies in virtual resections. To overcome these challenges, we introduce <em>Resectograms</em>, which are planar (2D) representations of virtual resections enabling the visualization of information associated with the surgical plan.</div></div><div><h3>Methods:</h3><div>Resectograms are computed in real-time and displayed as additional 2D views showing anatomical, functional, and risk-associated information extracted from the 3D virtual resection as this is modified during planning, offering surgeons an occlusion-free visualization of the virtual resection during surgery planning. We explored three flattening methods for generating these 2D views: fixed-shape, Least Squares Conformal Maps, and As-Rigid-As-Possible. Additionally, we optimized GPU memory usage by downsampling texture objects, ensuring errors remain within acceptable limits as defined by surgeons.</div></div><div><h3>Results:</h3><div>We evaluated Resectograms with experienced surgeons (n = 4, 9–15 years of experience) and assessed 2D flattening methods with computer and biomedical scientists (n = 11) through visual experiments. Surgeons found Resectograms valuable for enhancing surgical planning effectiveness and accuracy. 
Among flattening methods, the Least Squares Conformal Maps and As-Rigid-As-Possible techniques demonstrated similarly low distortion levels, superior to the fixed-shape approach. Our analysis showed that texture object downsampling is effective for liver and tumor segmentations, but less so for vessel segmentations.</div></div><div><h3>Conclusions:</h3><div>This paper presents Resectograms, a novel method for visualizing liver virtual resection plans in 2D, offering an intuitive, occlusion-free representation computable in real-time. Resectograms incorporate multiple information layers, providing comprehensive data for liver surgery planning. We enhanced the visualization through improved 3D-to-2D orientation mapping and distortion-minimizing parameterization algorithms. This research contributes to advancing liver surgery planning tools by offering a more accessible and informative visualization method. The code repository for this work is available at: <span><span>https://github.com/ALive-research/Slicer-Liver</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":72670,"journal":{"name":"Computer methods and programs in biomedicine update","volume":"7 ","pages":"Article 100186"},"PeriodicalIF":0.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143518926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
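Flattening quality is commonly judged by per-triangle distortion of the parameterization; a minimal area-stretch metric can be sketched as follows. This is a generic measure for intuition, not the distortion analysis or visual-experiment protocol of the paper.

```python
import numpy as np

def tri_area(p, q, r):
    """Area of a triangle in 3-D from the cross product of two edges."""
    return 0.5 * np.linalg.norm(np.cross(q - p, r - p))

def area_stretch(tri3d, tri2d):
    """Ratio of flattened (2-D) to original (3-D) triangle area; a
    ratio of 1.0 means the flattening preserves this triangle's area."""
    tri2d = [np.append(p, 0.0) for p in tri2d]  # embed the plane in 3-D
    return tri_area(*tri2d) / tri_area(*tri3d)

# An isometric flattening of an already-planar triangle has ratio 1:
t3 = [np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
t2 = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
```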
{"title":"GLAAM and GLAAI: Pioneering attention models for robust automated cataract detection","authors":"Deepak Kumar , Chaman Verma , Zoltán Illés","doi":"10.1016/j.cmpbup.2025.100182","DOIUrl":"10.1016/j.cmpbup.2025.100182","url":null,"abstract":"<div><h3>Background and Objective:</h3><div>Early detection of eye diseases, especially cataracts, is essential for preventing vision impairment. Accurate and cost-effective cataract diagnosis often requires advanced methods. This study proposes novel deep learning models that integrate global and local attention mechanisms into MobileNet and InceptionV3 architectures to improve cataract detection from fundus images.</div></div><div><h3>Methods:</h3><div>Two deep learning models, Global–Local Attention Augmented MobileNet (GLAAM) and Global–Local Attention Augmented InceptionV3 (GLAAI), were developed to enhance the analysis of fundus images. The models incorporate a combined attention mechanism to effectively capture deteriorated regions in retinal images. Data augmentation techniques were employed to prevent overfitting during training and testing on two cataract datasets. Additionally, Grad-CAM visualizations were used to increase interpretability by highlighting key regions influencing predictions.</div></div><div><h3>Results:</h3><div>The GLAAM model achieved a balanced accuracy of 97.08%, an average precision of 97.11%, and an F1-score of 97.12% on the retinal dataset. Grad-CAM visualizations confirmed the models’ ability to identify crucial cataract-related regions in fundus images.</div></div><div><h3>Conclusion:</h3><div>This study demonstrates a significant advancement in cataract diagnosis using deep learning, with GLAAM and GLAAI models exhibiting strong diagnostic performance. 
These models have the potential to enhance diagnostic tools and improve patient care by offering a cost-effective and accurate solution for cataract detection, suitable for integration into clinical settings.</div></div>","PeriodicalId":72670,"journal":{"name":"Computer methods and programs in biomedicine update","volume":"7 ","pages":"Article 100182"},"PeriodicalIF":0.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143474173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
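The Grad-CAM visualizations used for interpretability follow a standard recipe: weight each feature channel by its globally averaged gradient, sum the weighted maps, and keep the positive part. A minimal sketch with synthetic activations and gradients (the real inputs come from the network's last convolutional layer):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heat map: each channel is weighted by its globally
    averaged gradient, the weighted maps are summed over channels,
    and negative evidence is clipped away with a ReLU."""
    weights = gradients.mean(axis=(1, 2))             # one weight per channel
    cam = np.tensordot(weights, activations, axes=1)  # weighted channel sum
    return np.maximum(cam, 0.0)

rng = np.random.default_rng(7)
acts = rng.random((4, 7, 7))    # C x H x W feature maps (stand-ins)
grads = rng.random((4, 7, 7))   # gradients of the class score w.r.t. the maps
cam = grad_cam(acts, grads)     # 7 x 7 non-negative heat map
```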