Shihao Wang, Zhengxing Huang, Xirali Ablat, Alimjan Aysa, Kurban Ubul
Expert Systems with Applications, Volume 287, Article 128221. Published 2025-05-18. DOI: 10.1016/j.eswa.2025.128221. Impact factor: 7.5. JCR: Q1 (Computer Science, Artificial Intelligence). https://www.sciencedirect.com/science/article/pii/S095741742501841X
HyperSegmenter: Reappraising the potential of large kernel CNN architecture in efficient semantic segmentation
Semantic segmentation aims to precisely delineate the semantic content of each pixel in an image, providing the detailed understanding and precise localization that diverse vision tasks rely on. Although the Vision Transformer architecture has significantly advanced the field, these approaches still encounter challenges such as local inductive bias and the high time complexity of self-attention mechanisms. To address these issues, this paper reassesses the convolutional neural network architecture. We introduce an efficient convolutional operator and establish the SCU module as the foundational building block to alleviate constraints observed in current methods. Furthermore, to reduce redundancy within decoder structures, we redesign a 'Sandwich' decoder that integrates the LKD and AKConv modules and is designed specifically for demanding semantic segmentation tasks. Our model, termed HyperSegmenter, aims to improve both efficiency and adaptability. HyperSegmenter comes in four variants (Tiny, Small, Base, and Large) and was evaluated on three benchmark datasets: ADE20K, Cityscapes, and COCO-Stuff. Experiments demonstrate substantial performance gains, with accuracies of 52.23%, 82.54%, and 48.91%, respectively. These results underscore its efficacy and applicability in intricate scenarios.
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.