Oriented R-CNN With Disentangled Representations for Product Packaging Detection

IF 2.1 | Zone 4, Engineering and Technology | JCR Q3, ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Photonics Journal | Pub Date: 2024-08-26 | DOI: 10.1109/JPHOT.2024.3450295

Jiangyi Pan;Jianjun Yang;Yinhao Liu;Yijie Lv

Volume 16, Issue 5, pp. 1-11. https://ieeexplore.ieee.org/document/10648834/

Citations: 0

Abstract

Object detection is a vital computer vision task with applications such as face detection, autonomous driving, and industrial production. In recent years, the rise of deep neural networks has brought significant progress in object detection accuracy. However, although state-of-the-art methods have been tested on public datasets, a considerable gap remains when they are applied to real-world scenarios. This is because industrial object detection involves many unknown types of damaged samples, the scale varies greatly across types, and the position changes are complex. Many previous works focus on improving rotated object detection; this paper instead combines prior knowledge from remote sensing and industrial scenes, which makes the research more general. To address the shortage of packaging datasets, we established the Carton Packing Tape (CPT) dataset, a large-scale collection of images containing only cartons. Specifically, we collected a large number of images of packaged cartons from a real packaging production line and annotated them manually with detection boxes. We observe that the contextual clues required by different object detection tasks are inconsistent. Furthermore, targets in varying backgrounds require different receptive fields, which can be adjusted dynamically using different convolutional kernels. The features attended to by these receptive fields of different scales should share a unified disentangled representation. Based on these insights, we propose a pioneering object detection method tailored for industrial environments, termed oriented R-CNN with disentangled representations (ORDR). The experimental results indicate that the proposed method outperforms some of the available state-of-the-art detection techniques.
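The abstract's remark that different convolutional kernels yield different receptive fields can be illustrated with the standard receptive-field recurrence for a stack of convolution layers. The sketch below is only an illustration of that general relationship, not the ORDR architecture; the function name `receptive_field` and the layer configurations are hypothetical choices made for this example.

```python
def receptive_field(layers):
    """Receptive field (in input pixels, along one axis) of stacked conv layers.

    `layers` is a list of (kernel_size, stride, dilation) tuples ordered from
    input to output. This is the textbook recurrence, not code from the paper.
    """
    rf = 1    # a single input pixel sees itself
    jump = 1  # distance in input pixels between adjacent outputs of the current layer
    for kernel, stride, dilation in layers:
        # Each layer widens the field by (kernel - 1) dilated steps,
        # measured in input pixels via the accumulated jump.
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical branches: small kernels give a narrow context,
# larger and dilated kernels give a wider one.
narrow_branch = [(3, 1, 1), (3, 1, 1)]
wide_branch = [(3, 1, 1), (5, 1, 1), (7, 1, 3)]
strided_branch = [(3, 2, 1), (3, 1, 1)]

print(receptive_field(narrow_branch))   # 5
print(receptive_field(wide_branch))     # 25
print(receptive_field(strided_branch))  # 7
```

A detector that chooses or weights such branches per target would, in effect, adjust its receptive field dynamically; how ORDR actually does this is not specified in the abstract.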


Source journal

IEEE Photonics Journal ENGINEERING, ELECTRICAL & ELECTRONIC-OPTICS

CiteScore

4.50

Self-citation rate

8.30%

Articles published

489

Review time

1.4 months

About the journal: Breakthroughs in the generation of light and in its control and utilization have given rise to the field of Photonics, a rapidly expanding area of science and technology with major technological and economic impact. Photonics integrates quantum electronics and optics to accelerate progress in the generation of novel photon sources and in their utilization in emerging applications at the micro and nano scales spanning from the far-infrared/THz to the X-ray region of the electromagnetic spectrum. IEEE Photonics Journal is an online-only journal dedicated to the rapid disclosure of top-quality peer-reviewed research at the forefront of all areas of photonics. Contributions addressing issues ranging from fundamental understanding to emerging technologies and applications are within the scope of the Journal. The Journal includes topics in: Photon sources from far infrared to X-rays; Photonics materials and engineered photonic structures; Integrated optics and optoelectronics; Ultrafast, attosecond, high field and short wavelength photonics; Biophotonics, including DNA photonics; Nanophotonics; Magnetophotonics; Fundamentals of light propagation and interaction, nonlinear effects; Optical data storage; Fiber optics and optical communications devices, systems, and technologies; Micro Opto Electro Mechanical Systems (MOEMS); Microwave photonics; Optical Sensors.