Optimizing the loss function for bounding box regression through scale smoothing
Authors: Ying-Jun Lei, Bo-Yu Wang, Yu-Tong Yang
Journal: Ain Shams Engineering Journal, Volume 15, Issue 11, Article 103046
Publication date: 2024-11-01
Publication type: Journal Article
DOI: 10.1016/j.asej.2024.103046
URL: https://www.sciencedirect.com/science/article/pii/S2090447924004210
Impact factor: 6.0 (JCR Q1, Engineering, Multidisciplinary)
Citations: 0
Abstract
Deep learning is widely used for target detection in machine vision. However, the regression loss functions currently used to train detection networks suffer from slow convergence and imprecise localization, which hinders fast and accurate visual detection. To address this, the study proposes the smoothing adaptive intersection over union loss (SAIoU Loss), which adapts bounding box regression through scale smoothing. Based on an analysis of the bounding box regression process, SAIoU Loss incorporates a center-of-mass distance penalty term to speed up box distance regression in the pre-training phase. It also integrates a corner point distance penalty term with adaptive weights to refine the shape similarity of predicted boxes throughout regression. Experimental results show that SAIoU Loss achieves 39.6 mAP in target detection model training on PASCAL VOC, a 3.39% improvement. It also records the highest result of 26.7% in medium-sized target detection, a 9.43% improvement over IoU. On the VisDrone 2019 dataset, SAIoU Loss reaches a detection accuracy of 14.8 mAP, 1.3 mAP higher than the baseline. The SAIoU loss proposed in this study realizes efficient and highly accurate target detection.
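The abstract does not state the SAIoU formula, so the following is only a rough sketch of the general pattern it describes: an IoU loss combined with a center distance penalty and a weighted corner point distance penalty. The function names `iou` and `sketch_loss`, the normalization by the enclosing-box diagonal, and the choice of the IoU itself as the adaptive weight are all assumptions made for illustration, not the authors' definitions.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def sketch_loss(pred, target):
    """Illustrative IoU-based loss with two distance penalties.

    This is NOT the SAIoU formula from the paper; it only mirrors the
    ingredients named in the abstract under the assumptions stated above.
    """
    ax1, ay1, ax2, ay2 = pred
    bx1, by1, bx2, by2 = target

    # Squared diagonal of the smallest box enclosing both boxes,
    # used to make the distance penalties scale-invariant (assumption).
    ex1, ey1 = min(ax1, bx1), min(ay1, by1)
    ex2, ey2 = max(ax2, bx2), max(ay2, by2)
    diag2 = max((ex2 - ex1) ** 2 + (ey2 - ey1) ** 2, 1e-12)

    # Center distance penalty: squared distance between box centers.
    dcx = (ax1 + ax2) / 2 - (bx1 + bx2) / 2
    dcy = (ay1 + ay2) / 2 - (by1 + by2) / 2
    center_pen = (dcx ** 2 + dcy ** 2) / diag2

    # Corner distance penalty: squared distances between the two
    # top-left corners and between the two bottom-right corners.
    corner = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2 \
        + (ax2 - bx2) ** 2 + (ay2 - by2) ** 2
    corner_pen = corner / (2 * diag2)

    overlap = iou(pred, target)
    # Hypothetical adaptive weight: the corner (shape) term matters more
    # once the boxes already overlap well.
    weight = overlap
    return 1.0 - overlap + center_pen + weight * corner_pen


if __name__ == "__main__":
    print(sketch_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
    print(sketch_loss((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
    print(sketch_loss((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint boxes
```

In this sketch, disjoint boxes still receive a gradient signal through the center term even though their IoU is zero, which is the usual motivation for adding distance penalties to a plain IoU loss.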
About the journal:
Ain Shams Engineering Journal is an international journal devoted to the publication of peer-reviewed, original, high-quality research papers and review papers on both traditional topics and those of emerging science and technology. It welcomes work in areas of theoretical and fundamental interest as well as work concerning industrial applications, emerging instrumental techniques, and practical applications to aspects of human endeavor, such as the preservation of the environment, health, and waste disposal. The overall focus is on original and rigorous scientific research results that have generic significance.
Ain Shams Engineering Journal focuses on aspects of mechanical engineering, electrical engineering, civil engineering, chemical engineering, petroleum engineering, environmental engineering, and architectural and urban planning engineering. Papers that integrate knowledge from other disciplines with engineering are especially welcome, for example nanotechnology, material sciences, and computational methods, as well as applied basic sciences such as engineering mathematics, physics, and chemistry.