Semantic Image Segmentation: Two Decades of Research

Impact factor: 9.3; JCR quartile: Q2; Category: Computer Science, Interdisciplinary Applications

Foundations and Trends in Computer Graphics and Vision. Pub Date: 2023-02-13. DOI: 10.1561/0600000095

G. Csurka, Riccardo Volpi, Boris Chidlovskii

Publication type: Journal Article. Volume (as listed): 8 1. Pages: 1-162. Open access: no.

Citations: 11

Abstract

Semantic image segmentation (SiS) plays a fundamental role in a broad variety of computer vision applications, providing key information for the global understanding of an image. This survey summarizes two decades of research in the field of SiS: we review solutions starting from early historical methods, followed by an overview of more recent deep learning methods, including the latest trend of using transformers. We complement the review by discussing particular cases of weak supervision and auxiliary machine learning techniques that can be used to improve semantic segmentation, such as curriculum, incremental, or self-supervised learning. State-of-the-art SiS models rely on a large number of annotated samples, which are more expensive to obtain than labels for tasks such as image classification. Since unlabeled data is significantly cheaper to obtain, it is not surprising that Unsupervised Domain Adaptation (UDA) has achieved broad success within the semantic segmentation community. Therefore, a second core contribution of this book is to summarize five years of a rapidly growing field, Domain Adaptation for Semantic Image Segmentation (DASiS), which combines the importance of semantic segmentation itself with the critical need to adapt segmentation models to new environments. In addition to providing a comprehensive survey of DASiS techniques, we also cover newer trends such as multi-domain learning, domain generalization, domain incremental learning, test-time adaptation, and source-free domain adaptation. Finally, we conclude this survey by describing the datasets and benchmarks most widely used in SiS and DASiS, and briefly discuss related tasks such as instance and panoptic image segmentation, as well as applications such as medical image segmentation.


Source journal

Foundations and Trends in Computer Graphics and Vision (Computer Science, Interdisciplinary Applications)

CiteScore: 31.20

Self-citation rate: 0.00%

Publication count: not listed

About the journal: The growth in all aspects of research in the last decade has led to a multitude of new publications and an exponential increase in published research. Finding a way through the excellent existing literature and keeping up to date has become a major, time-consuming problem. Electronic publishing has given researchers instant access to more articles than ever before. But which articles are the essential ones that should be read to understand and keep abreast of developments in any topic? To address this problem, Foundations and Trends® in Computer Graphics and Vision publishes high-quality survey and tutorial monographs of the field. Each issue of Foundations and Trends® in Computer Graphics and Vision comprises a 50-100 page monograph written by research leaders in the field. Monographs that give tutorial coverage of subjects, research retrospectives, and survey papers that offer state-of-the-art reviews all fall within the scope of the journal.