A Study about Learning Graph Representation on Farmhouse Apple Quality Images with Graph Transformer

Ji-Hoon Bae, Jinyoung Kim, Ju-Hwan Lee, Gwanghyun Yu, Gyeong-Ju Kwon
Korean Institute of Smart Media. Published 2023-02-28. DOI: 10.30693/smj.2023.12.1.9 (https://doi.org/10.30693/smj.2023.12.1.9)

Abstract

Convolutional neural network (CNN) based systems are being developed to overcome the shortage of human labor in apple quality classification at farmhouses. However, because a CNN accepts only images of a fixed size, preprocessing such as resampling may be required, and oversampling in particular loses information from the original image through quality degradation and blurring. To minimize this problem, this paper generates an image-patch-based graph from the original image and proposes a random-walk-based positional encoding method so that a graph transformer model can be applied. The method uses the random walk algorithm to continuously learn positional embeddings for patches that carry no positional information, and it finds the optimal graph structure by aggregating useful node information through the self-attention of the graph transformer. As a result, it is robust and performs well even on new graphs with a random node order and on arbitrary graph structures that depend on where the object lies in the image. In experiments on five apple quality datasets, accuracy was 1.3% to 4.7% higher than that of other GNN models, and the model had 3.59M parameters, only about 15% of the 23.52M parameters of the ResNet18 model. The reduced amount of computation therefore yields fast inference, which demonstrates the effectiveness of the method.
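
The abstract does not say how patches are connected or how the random-walk encoding is computed, so the following is only a minimal sketch under stated assumptions. It assumes non-overlapping square patches joined to their 4-neighbours, and it uses one common formulation of random-walk positional encoding in which node i receives the return probabilities [P_ii, (P^2)_ii, ..., (P^k)_ii] with P = D^{-1}A. The function names `patch_grid_graph` and `random_walk_pe` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def patch_grid_graph(height: int, width: int, patch: int) -> Tuple[int, List[Edge]]:
    """Return (node count, undirected edges) for a grid of image patches.

    Assumption: partial patches at the right/bottom border are dropped.
    """
    rows, cols = height // patch, width // patch
    edges: List[Edge] = []
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            if c + 1 < cols:          # link to the patch on the right
                edges.append((node, node + 1))
            if r + 1 < rows:          # link to the patch below
                edges.append((node, node + cols))
    return rows * cols, edges


def random_walk_pe(num_nodes: int, edges: List[Edge], steps: int) -> List[List[float]]:
    """For each node, list the probability of returning to it after 1..steps hops."""
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1.0
    # Row-normalise the adjacency matrix: P = D^{-1} A.
    trans = []
    for row in adj:
        deg = sum(row)
        trans.append([x / deg if deg else 0.0 for x in row])
    encodings: List[List[float]] = [[] for _ in range(num_nodes)]
    power = [row[:] for row in trans]  # holds P^t, starting at t = 1
    for _ in range(steps):
        for i in range(num_nodes):
            encodings[i].append(power[i][i])
        # Advance to the next power: P^(t+1) = P^t @ P.
        power = [
            [sum(power[i][k] * trans[k][j] for k in range(num_nodes))
             for j in range(num_nodes)]
            for i in range(num_nodes)
        ]
    return encodings


if __name__ == "__main__":
    # Images of different sizes yield graphs of different sizes, with no resampling.
    for h, w in [(64, 96), (128, 128)]:
        n, e = patch_grid_graph(h, w, patch=32)
        pe = random_walk_pe(n, e, steps=4)
        print(h, w, n, len(e), pe[0])
```

Because the encoding depends only on the graph structure around each node, it does not change when the nodes are renumbered, which fits the abstract's claim of robustness to random node order. In a full model these vectors would typically be projected and added to the patch features before self-attention; that step is omitted here.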