Tuna defect classification and grading using Twins transformer

IF 5.3; Zone 2, Agricultural and Forestry Sciences; JCR Q1, ENGINEERING, CHEMICAL

Journal of Food Engineering, Pub Date: 2025-02-18, DOI: 10.1016/j.jfoodeng.2025.112535

Punnarai Siricharoen, Supanut Tangsinmankong, Seree Yengsakulpaisal, Natthanan Bhukan, Wisawapan Soingoen, Yutthana Lila, Saranya Jongaroontaprangsee, Stefan Mairhofer

Volume 395, Article 112535. Not open access. Full text: https://www.sciencedirect.com/science/article/pii/S0260877425000706

Citations: 0

Abstract

Ensuring the quality and safety of food products is of paramount importance within the food processing industry. Particularly in the seafood sector, the detection and classification of different quality defects in processed tuna loins pose a significant challenge, usually requiring visual assessment by seasoned experts. This research proposes a technical solution for tuna quality inspection that uses computer vision techniques to identify and localize different types of defects relative to what is considered the “standard” of a cleaned product, while additionally assessing the severity level of the defects affecting each individual loin. Image data of tuna defects are acquired under industrial conditions and form two datasets: a 4-common-defect dataset (TunaDefect-4) and a 6-extended-defect dataset (TunaDefect-6), which includes two additional defect types that are less common but technically more challenging. The quality grading process comprises 3 main steps. (1) Initially, preprocessing normalizes the image input and augments the image dataset. (2) Then, the semantic segmentation model Twins-PCPVT-L, a pyramid vision transformer with self-attention and conditional positional encoding, is employed for the TunaDefect-4 dataset. For TunaDefect-6, Twins-SVT-L, which amends the former model with locally-grouped self-attention and global sub-sampled attention, is used. Twins-PCPVT-L applied to TunaDefect-4 achieves a mean pixel accuracy (mPA) of 93.96% and a mean IoU (mIoU) of 80.4%, while Twins-SVT-L on TunaDefect-6 achieves an mPA of 83.82% and an mIoU of 66.96%. (3) Lastly, the semantically segmented images are graded by severity on a scale from level 0 to 4, where level 0 represents a fully cleaned loin and level 4, the highest severity level, is assigned to loins completely covered by various defects. The accuracy of severity grading is 84% for TunaDefect-4 and 76.6% for TunaDefect-6. Both models run within a total inference and processing time of approximately 0.20 s, faster than the conveyor's transport time. A web application prototype has been developed for tuna quality classification and grading and is hosted on the Google Cloud Platform (GCP). The application responds in a timely manner and can be used as a complementary identification and grading tool, with the potential to be integrated as an inline processing solution to further provide practicality to the industry.
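
The two segmentation metrics reported above, mPA and mIoU, can both be derived from a per-class confusion matrix. The following is a minimal sketch in plain Python, not the authors' evaluation code. The toy label lists are invented for illustration. The `severity_level` function is a purely hypothetical coverage-based rule: the abstract states only that level 0 is a fully cleaned loin and level 4 a completely covered one, so the assumption that class 0 means "clean" and the 25/50/75% thresholds are illustrative choices, not values from the paper.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Rows index ground-truth classes, columns index predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_pixel_accuracy(cm: List[List[int]]) -> float:
    """Mean over classes of (correctly labelled pixels) / (ground-truth pixels)."""
    accs = []
    for c, row in enumerate(cm):
        total = sum(row)
        if total > 0:  # skip classes absent from the ground truth
            accs.append(row[c] / total)
    return sum(accs) / len(accs) if accs else 0.0


def mean_iou(cm: List[List[int]]) -> float:
    """Mean over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        union = tp + fp + fn
        if union > 0:  # skip classes that never occur at all
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


def severity_level(pred_labels: Sequence[int], clean_class: int = 0,
                   thresholds: Sequence[float] = (0.25, 0.50, 0.75)) -> int:
    """HYPOTHETICAL grading rule: map the defect-covered fraction to level 0-4.

    Assumes every label other than `clean_class` is a defect; the thresholds
    are illustrative and do not come from the paper.
    """
    if not pred_labels:
        raise ValueError("pred_labels must not be empty")
    defect = sum(1 for label in pred_labels if label != clean_class)
    coverage = defect / len(pred_labels)
    if coverage == 0:
        return 0  # fully cleaned loin
    for i, limit in enumerate(thresholds):
        if coverage <= limit:
            return i + 1
    return 4  # mostly or completely covered by defects


# Toy example: 10 flattened pixels, 3 classes (0 = clean, 1 and 2 = defects).
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print("mPA:", mean_pixel_accuracy(cm))
print("mIoU:", mean_iou(cm))
print("severity:", severity_level(y_pred))
```

Because mIoU penalizes false positives as well as false negatives, it is typically lower than mPA on the same predictions, which is consistent with the gap between the two figures reported for each dataset.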

Source journal

Journal of Food Engineering (Engineering and Technology: Chemical Engineering)

CiteScore: 11.80

Self-citation rate: 5.50%

Articles published: 275

Review time: 24 days

About the journal: The journal publishes original research and review papers on any subject at the interface between food and engineering, particularly those of relevance to industry, including: engineering properties of foods, food physics and physical chemistry; processing, measurement, control, packaging, storage and distribution; engineering aspects of the design and production of novel foods and of food service and catering; design and operation of food processes, plant and equipment; economics of food engineering, including the economics of alternative processes. Accounts of food engineering achievements are of particular value.