Semantic segmentation based on enhanced gated pyramid network with lightweight attention module

IF 1 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

AI Communications Pub Date : 2023-12-06 DOI:10.3233/aic-220254

A. Viswanathan, V. S. Kumar, M. Umamaheswari, V. Janarthanan, M. Jaganathan

Citations: 0

Abstract

Semantic segmentation has made tremendous progress in recent years. The development of large datasets and the regression of convolutional models have enabled effective training of very large semantic models. However, higher capacity brings a higher computational cost, which prevents real-time operation. Moreover, because annotations are limited, models may rely heavily on the contexts available in the training data, resulting in poor generalization to previously unseen scenes. To resolve these issues, this paper proposes an Enhanced Gated Pyramid Network (GPNet) with a Lightweight Attention Module (LAM). GPNet is used for semantic feature extraction and is enhanced by a pre-trained dilated DetNet and a Dense Connection Block (DCB). The LAM is applied to automatically rescale the weights of the different feature channels, which can increase the accuracy and effectiveness of the proposed method. The performance of the proposed method is validated in the Google Colab environment on the Cityscapes, CamVid and ADE20K datasets. The experimental results are compared with methods such as GPNet-ResNet-101 and GPNet-ResNet-50 in terms of IoU, precision, accuracy, F1 score and recall. Overall, the proposed method achieves 94.82% pixel accuracy on the Cityscapes dataset.
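The abstract reports IoU, precision, accuracy, F1 score and recall but does not say how they were computed. The sketch below shows the conventional per-class definitions over flattened label maps, in plain Python. The function names (`confusion_counts`, `segmentation_metrics`) and the toy labels are hypothetical and are not taken from the paper.

```python
def confusion_counts(y_true, y_pred, num_classes):
    """Count per-class true positives, false positives and false negatives."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # pixel wrongly assigned to class p
            fn[t] += 1  # pixel of class t that was missed
    return tp, fp, fn


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def segmentation_metrics(y_true, y_pred, num_classes):
    """Pixel accuracy, mean IoU and per-class precision/recall/F1/IoU.

    Assumes y_true and y_pred are equal-length sequences of integer
    class ids in [0, num_classes). This is the textbook definition,
    not necessarily the evaluation protocol used in the paper.
    """
    tp, fp, fn = confusion_counts(y_true, y_pred, num_classes)
    per_class = []
    for c in range(num_classes):
        precision = safe_div(tp[c], tp[c] + fp[c])
        recall = safe_div(tp[c], tp[c] + fn[c])
        f1 = safe_div(2 * precision * recall, precision + recall)
        iou = safe_div(tp[c], tp[c] + fp[c] + fn[c])
        per_class.append(
            {"precision": precision, "recall": recall, "f1": f1, "iou": iou}
        )
    return {
        "pixel_accuracy": safe_div(sum(tp), len(y_true)),
        "mean_iou": safe_div(sum(m["iou"] for m in per_class), num_classes),
        "per_class": per_class,
    }


# Toy example: six pixels, three classes (illustrative labels only).
result = segmentation_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], 3)
print(result["pixel_accuracy"], result["mean_iou"])
```

Note that pixel accuracy and mean IoU can differ substantially on datasets with imbalanced classes, which is why segmentation results are usually reported with both.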

Source journal

AI Communications (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore

2.30

Self-citation rate

12.50%

Articles published

Review time

4.5 months

Journal description: AI Communications is a journal on artificial intelligence (AI) which has a close relationship to EurAI (European Association for Artificial Intelligence, formerly ECCAI). It covers the whole AI community: scientific institutions as well as commercial and industrial companies. AI Communications aims to enhance contacts and information exchange between AI researchers and developers, and to provide supranational information to those concerned with AI and advanced information processing. AI Communications publishes refereed articles concerning scientific and technical AI procedures, provided they are of sufficient interest to a large readership of both scientific and practical background. In addition, it contains high-level background material, both at the technical level and at the level of opinions, policies and news.