A teacher-student framework leveraging large vision model for data pre-annotation and YOLO for tunnel lining multiple defects instance segmentation

IF 10.4; Zone 1, Computer Science; JCR Q1, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Journal of Industrial Information Integration. Pub Date: 2025-02-07. DOI: 10.1016/j.jii.2025.100790

Hanlong Yang, Lujie Wang, Yue Pan, Jin-Jian Chen

Journal of Industrial Information Integration, Volume 44, Article 100790. Journal Article. https://www.sciencedirect.com/science/article/pii/S2452414X25000147

Citations: 0

Abstract

To achieve accurate and efficient instance segmentation of multiple defects in tunnel linings, this paper proposes a simple yet powerful Teacher-Student Framework (TeSF) that combines an emerging Large Vision Model (LVM) with the You Only Look Once v5 (YOLO v5) model. TeSF integrates a pre-trained LVM in the Teacher Module to reduce data annotation effort. The Student Module introduces a novel top-down architecture that combines YOLO v5 for top-level Classification & Localization with a Segment Head for down-level Segmentation, resulting in YOLO-SH. The Teacher Module acts as a data engine for automatic learning in the Student Module through a well-designed loss function. TeSF is tested on images collected from Shanghai metro tunnels to automatically recognize five types of tunnel surface defects. The experimental results indicate that: (1) The LVM-based data annotation procedure in the Teacher Module is more effective than the traditional manual method. (2) A medium-sized YOLO v5 backbone gives the best balance between computational efficiency and segmentation accuracy, yielding mask mAP@0.5 values of 0.644 and 0.694 at an inference time of 6.2 ms/image. (3) The top-down Student Module with YOLO-SH v5m outperforms state-of-the-art models in instance segmentation, with improvements of no less than 8.2% and 6.3% in box mAP@0.5 and mask mAP@0.5, respectively. In short, the novelty of TeSF lies in using the pre-trained LVM for streamlined data annotation, coupled with YOLO-SH for more cost-effective and precise detection of multiple defects in tunnels. TeSF can also be extended to the analysis of 3D scanner images from in-service tunnel environments.
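The abstract gives no implementation details, so the following is only a rough illustration of the data flow it describes: a teacher-produced mask becomes a training target, and a top-down student first localizes a defect with a box and then segments inside that box. It is a minimal pure-Python sketch, not the paper's method. All function names (`mask_to_box`, `crop`, `paste`, `mask_iou`) are hypothetical, and deriving box labels from teacher masks is an assumption made for illustration.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]            # binary mask as rows of 0/1 pixels
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Tightest box around the nonzero pixels of a (hypothetical) teacher mask."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None  # empty mask: no defect instance to localize
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return cols[0], rows[0], cols[-1], rows[-1]


def crop(mask: Mask, box: Box) -> Mask:
    """Box-local segmentation target: the mask restricted to the box."""
    x0, y0, x1, y1 = box
    return [row[x0:x1 + 1] for row in mask[y0:y1 + 1]]


def paste(local: Mask, box: Box, height: int, width: int) -> Mask:
    """Place a box-local mask back into full-image coordinates."""
    x0, y0, _, _ = box
    full = [[0] * width for _ in range(height)]
    for dy, row in enumerate(local):
        for dx, value in enumerate(row):
            full[y0 + dy][x0 + dx] = value
    return full


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two equally sized binary masks."""
    pairs = [(p, q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    inter = sum(1 for p, q in pairs if p and q)
    union = sum(1 for p, q in pairs if p or q)
    return inter / union if union else 0.0


# Toy example: a 4x5 image with one defect instance.
teacher_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
box = mask_to_box(teacher_mask)        # localization target
local_target = crop(teacher_mask, box) # segmentation target inside the box
restored = paste(local_target, box, height=4, width=5)
```

In this toy round trip, cropping to the box and pasting back reproduces the original mask, which is the property a box-then-segment pipeline relies on when the box fully encloses the instance.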


Source journal

Journal of Industrial Information Integration (Decision Sciences: Information Systems and Management)

CiteScore: 22.30

Self-citation rate: 13.40%

Articles published: 100

Journal description: The Journal of Industrial Information Integration focuses on the industry's transition towards industrial integration and informatization, covering not only hardware and software but also information integration. It serves as a platform for promoting advances in industrial information integration, addressing challenges, issues, and solutions in an interdisciplinary forum for researchers, practitioners, and policy makers. The journal welcomes papers on foundational, technical, and practical aspects of industrial information integration, emphasizing the complex and cross-disciplinary topics that arise in industrial integration. Techniques from mathematical science, computer science, computer engineering, electrical and electronic engineering, manufacturing engineering, and engineering management are crucial in this context.