Shyamala Devi M, Yuvaraj Natarajan, Sri Preethaa K.R, Priya S
Thermal imagery temperature gradient analysis through deformable transformer residual regression network for crack severity estimation
Applications in Engineering Science, vol. 24, Article 100265. Published 2025-09-22. DOI: 10.1016/j.apples.2025.100265
https://www.sciencedirect.com/science/article/pii/S2666496825000639
Journal metrics: impact factor 2.1; JCR Q2 (Engineering, Multidisciplinary)
Citation count: 0
Abstract
Structural crack severity estimation is critical for preventive maintenance, yet conventional approaches often fail to capture subsurface defects from thermal imagery because of noise and irrelevant metadata. This paper proposes a novel Vision Deformable Transformer Normalized Residual Regression Network (ViTNResNet18) for temperature gradient analysis and crack severity estimation. The thermal crack images used in this work are sourced from the Mendeley Crack900 dataset. The ViTNResNet18 pipeline begins by automatically cropping each image to remove FLIR logos and temperature scales, keeping the focus on relevant thermal data. The model's main novelty is Gradient Thermal Filtering (GTF), which combines gradient magnitude, direction, thermal flow, Gabor frequency, and thermal clusters into a single fused image to enhance crack feature representation. A Heat Dispersion Profile (HDP) is then generated to extract thermal texture and gradient descriptors. The core ViTNResNet18 replaces standard ResNet18 blocks with Normalized Residual Blocks (NRB) to stabilize local feature extraction, and a ViT and Convolutional Neural Network (CNN) fusion module is introduced after average pooling to capture global thermal dependencies. Unlike conventional ViT models, ViTNResNet18 replaces fixed linear projections with an offset-predictor Multi-Layer Perceptron (MLP) and Deformable Positional Embedding (DPE), allowing adaptive focus on temperature gradient variations. The extracted heat dispersion profile is embedded along with the transformer features and passed through an MLP regression head to directly estimate crack width and depth. Experimental results show that the proposed method achieves a crack severity prediction accuracy of 99.60%, significantly outperforming traditional models.
ViTNResNet18 thus offers a scalable solution for structural crack severity estimation, improving the accuracy of defect detection and quantification and contributing to resilient infrastructure.
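
The abstract names gradient magnitude and direction as two of the inputs fused by GTF but does not specify how they are computed. As a rough illustration only, the sketch below derives both from a 2-D temperature map using central differences in NumPy. The function name `gradient_channels`, the use of `np.gradient`, and the channel-stacking layout are assumptions made for this example, not the authors' implementation, and the remaining GTF inputs (thermal flow, Gabor frequency, thermal clusters) are omitted.

```python
import numpy as np


def gradient_channels(thermal: np.ndarray) -> np.ndarray:
    """Stack gradient magnitude and direction for a 2-D temperature map.

    Hypothetical helper: the paper does not say how GTF computes these.
    """
    # np.gradient returns the derivative along axis 0 (rows, y) first,
    # then along axis 1 (columns, x).
    dy, dx = np.gradient(thermal.astype(np.float64))
    magnitude = np.hypot(dx, dy)      # strength of the temperature change
    direction = np.arctan2(dy, dx)    # angle in radians, in [-pi, pi]
    # Last axis holds the two channels: [..., 0] magnitude, [..., 1] direction.
    return np.stack([magnitude, direction], axis=-1)


# Synthetic 4x5 map: temperature rises by 2 units per column, constant per row.
thermal = np.tile(2.0 * np.arange(5), (4, 1))
channels = gradient_channels(thermal)
```

On this synthetic map the magnitude channel is uniform and the direction channel points along the column axis, which is a quick sanity check before applying the same function to a real thermal image.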