Annapareddy V. N. Reddy, Pradeep Kumar Mallick, Sachin Kumar, Debahuti Mishra, P. Ashok Reddy, Sambasivarao Chindam
Arabian Journal for Science and Engineering, vol. 50, no. 19, pp. 15235-15279. Published 2025-08-26. DOI: 10.1007/s13369-025-10548-5. https://link.springer.com/article/10.1007/s13369-025-10548-5
Meerkat-Optimized SENet Approach: Advancements in Retinal Fundus Image Augmentation and Classification
This manuscript studies retinal fundus image classification using a range of machine learning and deep learning (DL) techniques. It focuses on transformer-based architectures, originally designed for natural language processing, which capture long-range dependencies within images and so help model complex patterns. To address the persistent shortage of training data, the work introduces a stacked augmentation approach: DL-based techniques refine images at the pixel level to produce augmented counterparts, and the augmented images are then stacked along the third dimension. The manuscript reports that this improves model accuracy while significantly reducing the sample size, which speeds up training. It also introduces the Meerkat optimizer, a cooperative multi-agent optimization technique inspired by meerkat social behavior, to improve the classification accuracy of the squeeze-and-excitation network (SENet). The optimizer is reported to navigate the solution space efficiently and to yield robust model configurations, and comparative evaluations against traditional optimization techniques favor the Meerkat-optimized SENet. More broadly, the study examines the behavior of various transformer networks in retinal fundus image classification, including pyramid vision transformer, bottleneck transformer, convolutional vision transformer, Swin transformer, ViT, spatial transformer network (STNet), and SENet. An analysis of the augmentation results shows consistent performance improvement across these networks when they are coupled with DL-based augmentation. SENet stands out, learning and generalizing well across the augmentation scenarios and datasets considered. Optimizing SENet's decision variables with the Meerkat optimizer, namely the squeeze type (\(S\)), excitation operator (\(E\)), and reduction ratio (\(r\)), gives detailed insight into the network's behavior and illustrates the adaptability and efficiency of the optimization strategy.
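To make the two central ideas of the abstract concrete, the following is a minimal sketch, not the authors' implementation. It stacks augmented views of an image along a third dimension and then applies a squeeze-and-excitation step whose squeeze type, excitation gate, and reduction ratio are configurable. Everything specific here is an assumption made for illustration: the function names, the use of simple flips as stand-ins for DL-based augmentation, the candidate options ("avg"/"max" for S, "sigmoid"/"hard_sigmoid" for E), and the fixed weights. The abstract does not list the options the paper actually searches over, and the Meerkat search itself is not shown.

```python
import math


def stack_views(views):
    """Stack K same-sized H x W images along a third axis, giving H x W x K."""
    height, width = len(views[0]), len(views[0][0])
    return [[[v[y][x] for v in views] for x in range(width)] for y in range(height)]


def squeeze(x, kind="avg"):
    """Reduce an H x W x C array to one descriptor per channel (hypothetical S options)."""
    channels = len(x[0][0])
    per_channel = [[px[k] for row in x for px in row] for k in range(channels)]
    if kind == "avg":
        return [sum(vals) / len(vals) for vals in per_channel]
    if kind == "max":
        return [max(vals) for vals in per_channel]
    raise ValueError(f"unknown squeeze type: {kind}")


def gate(z, kind="sigmoid"):
    """Map a pre-activation to a weight in [0, 1] (hypothetical E options)."""
    if kind == "sigmoid":
        return 1.0 / (1.0 + math.exp(-z))
    if kind == "hard_sigmoid":
        return min(1.0, max(0.0, 0.2 * z + 0.5))
    raise ValueError(f"unknown excitation gate: {kind}")


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def se_block(x, w1, w2, squeeze_kind="avg", gate_kind="sigmoid"):
    """Squeeze-and-excitation on x (H x W x C).

    w1 has shape (C // r) x C and w2 has shape C x (C // r), so the
    reduction ratio r is set by the width of the bottleneck.
    """
    descriptor = squeeze(x, squeeze_kind)
    hidden = [max(0.0, h) for h in matvec(w1, descriptor)]  # bottleneck + ReLU
    weights = [gate(z, gate_kind) for z in matvec(w2, hidden)]
    scaled = [[[px[k] * weights[k] for k in range(len(weights))] for px in row] for row in x]
    return scaled, weights


# Toy 2 x 2 image and four views; flips stand in for DL-based augmentation.
image = [[0.0, 1.0], [2.0, 3.0]]
views = [
    image,
    [row[::-1] for row in image],        # horizontal flip
    image[::-1],                         # vertical flip
    [row[::-1] for row in image[::-1]],  # 180-degree rotation
]
stacked = stack_views(views)  # one 2 x 2 x 4 sample instead of four 2 x 2 samples

# C = 4 channels and r = 2, so the bottleneck has C // r = 2 units.
# The weights are fixed by hand here; in practice they would be learned.
w1 = [[0.25, 0.25, 0.25, 0.25], [0.5, -0.5, 0.5, -0.5]]
w2 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
scaled, weights = se_block(stacked, w1, w2, squeeze_kind="avg", gate_kind="sigmoid")
```

In this sketch the four views become four channels of a single sample, which is one way to read "reducing the sample size", and the SE step then learns how much weight each stacked view should receive. A search procedure such as the one the abstract describes would treat `squeeze_kind`, `gate_kind`, and the bottleneck width as decision variables and score each configuration by validation accuracy.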
About the journal:
King Fahd University of Petroleum & Minerals (KFUPM) partnered with Springer to publish the Arabian Journal for Science and Engineering (AJSE).
AJSE, which has been published by KFUPM since 1975, is a recognized national, regional and international journal that provides a great opportunity for the dissemination of research advances from the Kingdom of Saudi Arabia, MENA and the world.