Semantic segmentation based on enhanced gated pyramid network with lightweight attention module

IF 1.4 | Zone 4, Computer Science | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
A. Viswanathan, V. S. kumar, M. Umamaheswari, V. Janarthanan, M. Jaganathan
DOI: 10.3233/aic-220254 (https://doi.org/10.3233/aic-220254)
Journal: AI Communications
Published: 2023-12-06
Citations: 0

Abstract

Semantic segmentation based on enhanced gated pyramid network with lightweight attention module
Semantic segmentation has made tremendous progress in recent years. The development of large datasets and the evolution of convolutional models have enabled effective training of very large semantic models. However, higher capacity brings a higher computational cost, which prevents real-time operation. In addition, because annotations are limited, models may rely heavily on the contexts available in the training data, resulting in poor generalization to previously unseen scenes. To address these issues, this paper proposes an Enhanced Gated Pyramid Network (GPNet) with a Lightweight Attention Module (LAM). GPNet is used for semantic feature extraction and is enhanced with a pre-trained dilated DetNet and a Dense Connection Block (DCB). The LAM adaptively rescales the weights of the different feature channels, which improves the accuracy and effectiveness of the proposed method. The method is validated in the Google Colab environment on the Cityscapes, CamVid and ADE20K datasets. The experimental results are compared with methods such as GPNet-ResNet-101 and GPNet-ResNet-50 in terms of IoU, precision, accuracy, F1 score and recall. Overall, the proposed method achieves 94.82% pixel accuracy on the Cityscapes dataset.
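The abstract reports IoU, precision, recall, F1 score and pixel accuracy but does not say how they were computed. The sketch below shows one common way to derive all five from a confusion matrix over flattened label maps. It is an illustration under stated assumptions, not the authors' evaluation code: the function names (confusion_matrix, per_class_metrics, pixel_accuracy) and the toy labels are hypothetical, and the paper's exact protocol (ignored classes, averaging over classes) is not given in the abstract.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows index the ground-truth class, columns the prediction."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def _safe_div(a: float, b: float) -> float:
    """Return a / b, or 0.0 when the denominator is zero (class absent)."""
    return a / b if b else 0.0


def per_class_metrics(cm: List[List[int]], cls: int) -> Dict[str, float]:
    """IoU, precision, recall and F1 for one class, from the confusion matrix."""
    tp = cm[cls][cls]
    fp = sum(row[cls] for row in cm) - tp   # predicted as cls, labelled otherwise
    fn = sum(cm[cls]) - tp                  # labelled cls, predicted otherwise
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    return {
        "iou": _safe_div(tp, tp + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": _safe_div(2 * precision * recall, precision + recall),
    }


def pixel_accuracy(cm: List[List[int]]) -> float:
    """Fraction of all pixels whose predicted class matches the label."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return _safe_div(correct, total)


# Toy example (made-up labels): six pixels, three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(pixel_accuracy(cm))
print(per_class_metrics(cm, cls=1))
```

Mean IoU, if needed, would be the average of the per-class "iou" values over the classes present in the dataset.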
Source journal
AI Communications (Engineering and Technology, Computer Science: Artificial Intelligence)
CiteScore: 2.30
Self-citation rate: 12.50%
Articles published: 34
Review time: 4.5 months
About the journal: AI Communications is a journal on artificial intelligence (AI) which has a close relationship to EurAI (European Association for Artificial Intelligence, formerly ECCAI). It covers the whole AI community: scientific institutions as well as commercial and industrial companies. AI Communications aims to enhance contacts and information exchange between AI researchers and developers, and to provide supranational information to those concerned with AI and advanced information processing. AI Communications publishes refereed articles concerning scientific and technical AI procedures, provided they are of sufficient interest to a large readership of both scientific and practical background. In addition, it contains high-level background material, both at the technical level and at the level of opinions, policies and news.