Semantic Segmentation with Perceiver IO
Keong-Hun Choi, J. Ha
2022 22nd International Conference on Control, Automation and Systems (ICCAS), published 2022-11-27
DOI: 10.23919/ICCAS55662.2022.10003862 (https://doi.org/10.23919/ICCAS55662.2022.10003862)
Citations: 0
Abstract
In deep learning, transformers are increasingly replacing convolutional neural networks (CNNs) because of their performance and simple design. In particular, recent studies widely use transformer encoders to extract image features effectively. Even so, models built on existing deep neural network structures have needed an architecture tailored to each input modality and its data format. The Perceiver IO [6] was recently proposed to overcome this limitation. It processes various data formats through a single structure to extract features, and it uses an output query to produce outputs in the desired form. This paper presents a semantic segmentation model that exploits these characteristics of the Perceiver IO. Two types of input configuration are proposed, and experimental results show the feasibility of the proposed method.
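
The encode-then-query idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a single attention head, omits the learned query/key/value projections, latent self-attention layers, and MLPs of the actual Perceiver IO, and uses random values in place of trained weights. All names (`cross_attention`, `segment`, `class_proj`, the sizes) are hypothetical and chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Single-head cross-attention; learned projections are omitted (assumption)."""
    d = len(queries[0])
    transposed = [list(col) for col in zip(*keys_values)]
    scores = matmul(queries, transposed)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, keys_values)


def random_matrix(rows, cols, rng):
    """Stand-in for trained parameters or embedded inputs."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def segment(image_tokens, latents, output_queries, class_proj):
    """Encode inputs into a small latent array, then decode one label per query."""
    latents = cross_attention(latents, image_tokens)    # encode: latents attend to inputs
    decoded = cross_attention(output_queries, latents)  # decode: one output query per pixel
    logits = matmul(decoded, class_proj)                # project to per-class scores
    return [max(range(len(r)), key=r.__getitem__) for r in logits]


rng = random.Random(0)
H, W, D, N_LATENT, N_CLASSES = 4, 4, 8, 4, 3  # hypothetical toy sizes
image_tokens = random_matrix(H * W, D, rng)    # one embedded token per pixel
latents = random_matrix(N_LATENT, D, rng)
output_queries = random_matrix(H * W, D, rng)  # e.g. per-pixel position embeddings
class_proj = random_matrix(D, N_CLASSES, rng)

labels = segment(image_tokens, latents, output_queries, class_proj)
label_map = [labels[r * W:(r + 1) * W] for r in range(H)]  # reshape to H x W
```

The point of the sketch is that the number of outputs is set by the number of output queries rather than by the input layout, so supplying one query per pixel yields a dense label map, which is what semantic segmentation requires.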