Journal of Real-Time Image Processing: latest articles (page 6)

YOLO-FGD: a fast lightweight PCB defect method based on FasterNet and the Gather-and-Distribute mechanism

IF 3, Zone 4 (Computer Science)

Journal of Real-Time Image Processing Pub Date : 2024-07-03 DOI: 10.1007/s11554-024-01504-x

Changxin Qin, Zhongyu Zhou

Abstract: With the rapid expansion of the electronics industry, the demand for high-quality printed circuit boards (PCBs) has surged. However, existing PCB defect detection methods suffer from various limitations, such as slow speeds, low accuracy, and restricted detection scope, often leading to false positives and negatives. To overcome these challenges, this paper presents YOLO-FGD, a novel detection model. YOLO-FGD replaces YOLOv5’s backbone network with FasterNet, significantly accelerating feature extraction. The Neck section adopts the Gather-and-Distribute mechanism, which enhances multiscale feature fusion for small targets through convolution and self-attention mechanisms. Integration of the C3_Faster feature extraction module effectively reduces the number of parameters and the number of FLOPs, accelerating the computations. Experiments on the PCB-DATASETS dataset show promising results: the mean average precision at an IoU threshold of 0.5 (mAP50) reaches 98.8%, mAP50–95 reaches 57.2%, the computational load is reduced to 11.5 GFLOPs, and the model size is only 12.6 MB, meeting lightweight standards. These findings underscore the effectiveness of YOLO-FGD in efficiently detecting PCB defects, providing robust support for electronic manufacturing quality control.
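The two accuracy figures quoted in this abstract differ only in how strict the box-overlap test is. The sketch below is an illustration and not code from the paper: it assumes the common COCO-style convention in which mAP50 uses a single IoU threshold of 0.50 and mAP50–95 averages over ten thresholds from 0.50 to 0.95. The function names and the example boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
THRESHOLDS = [t / 100 for t in range(50, 100, 5)]


def thresholds_matched(pred, gt):
    """Thresholds at which this prediction would count as a true positive."""
    score = iou(pred, gt)
    return [t for t in THRESHOLDS if score >= t]


# Hypothetical boxes: the prediction overlaps the ground truth with IoU 0.72.
prediction = (0, 0, 10, 10)
ground_truth = (0, 0, 9, 8)
matched = thresholds_matched(prediction, ground_truth)
```

A detection like this one counts as correct at the 0.50 threshold but fails the stricter thresholds above its IoU, which is one reason an mAP50–95 figure is typically well below the mAP50 figure for the same model.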

Citations: 0

TeleStroke: real-time stroke detection with federated learning and YOLOv8 on edge devices

IF 3, Zone 4 (Computer Science)

Journal of Real-Time Image Processing Pub Date : 2024-06-26 DOI: 10.1007/s11554-024-01500-1

Abdussalam Elhanashi, Pierpaolo Dini, Sergio Saponara, Qinghe Zheng

Abstract: Stroke, a life-threatening medical condition, necessitates immediate intervention for optimal outcomes. Timely diagnosis and treatment play a crucial role in reducing mortality and minimizing long-term disabilities associated with strokes. This study presents a novel approach to meet these critical needs by proposing a real-time stroke detection system based on deep learning (DL) that uses federated learning (FL) to enhance accuracy and privacy preservation. The primary objective of this research is to develop an efficient and accurate model capable of discerning between stroke and non-stroke cases in real time, helping healthcare professionals make well-informed decisions. Traditional stroke detection methods relying on manual interpretation of medical images are time-consuming and prone to human error. DL techniques have shown promise in automating this process, yet challenges persist due to the need for extensive and diverse datasets and privacy concerns. To address these challenges, our methodology involves using and assessing YOLOv8 models on comprehensive datasets comprising both stroke and non-stroke cases, based on the facial paralysis of the individuals in the images. This training process empowers the model to grasp intricate patterns and features associated with strokes, thereby enhancing its diagnostic accuracy. In addition, federated learning, a decentralized training approach, is employed to bolster privacy while preserving model performance. This approach enables the model to learn from data distributed across various clients without compromising sensitive patient information. The proposed methodology has been implemented on NVIDIA platforms, utilizing their advanced GPU capabilities to enable real-time processing and analysis. This optimized model has the potential to revolutionize stroke diagnosis and patient care, promising to save lives and elevate the quality of healthcare services in the neurology field.
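The abstract describes federated learning only at a high level and does not say which aggregation rule is used. As a minimal sketch, assuming the widely used FedAvg rule (a sample-count-weighted average of client parameters), the server-side step could look like this. The function name and the toy numbers are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Weighted average of per-client parameter vectors (FedAvg-style).

    client_weights: list of equal-length parameter lists, one per client.
    client_sizes: number of local training samples held by each client.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical example: two clients, two parameters each.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [100, 300]
global_weights = federated_average(clients, sizes)
```

Only the parameter vectors leave each client in this scheme; the images themselves stay local, which is the privacy property the abstract refers to.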

Citations: 0

Towards real-time video analysis of flooded areas: redundancy-based accelerator for object detection models

IF 3, Zone 4 (Computer Science)

Journal of Real-Time Image Processing Pub Date : 2024-06-25 DOI: 10.1007/s11554-024-01490-0

Shubhasree AV, Praveen Sankaran, Raghu C.V

Abstract: The state of Kerala in India has seen multiple instances of intense cyclones in recent years, resulting in heavy flooding. One of the biggest challenges faced by rescuers is access to flooded areas and buildings during rescue operations. In such scenarios, unmanned aerial vehicles (UAVs) can deliver reliable aerial visual data to aid planning and operations during rescue. Object detectors based on deep learning methods provide an effective solution to automate the process of detecting relevant information from image/video data. These models are complex and resource-hungry, leading to severe speed constraints during field operations. The pixel displacement algorithm (PDA), a portable and effective technique, is developed in this work to speed up object detection models on devices with limited resources, such as edge devices. This method can be integrated with all object detection models to reduce inference time. The proposed method is combined with multiple object detection models in this work to show its effectiveness. The YOLOv4 model combined with the proposed method outperformed the AP50 of the YOLOv4-tiny model by 6% while maintaining the same processing time. This approach gave an almost 10× speed improvement on Jetson Nano at an accuracy cost of 3% when compared to YOLOv4. Further, a model to predict maximum pixel shift with respect to frame skip is proposed using parameters such as the altitude and velocity of the UAV and the tilt of the camera. Accurate prediction of pixel shift leads to a reduced search area and thus reduced inference time. The effectiveness of the proposed model was tested against annotated locations, and it was found that the method was able to predict the search area for each test video segment with a high degree of accuracy.

Citations: 0

Safety helmet detection based on improved YOLOv7-tiny with multiple feature enhancement

IF 3, Zone 4 (Computer Science)

Journal of Real-Time Image Processing Pub Date : 2024-06-25 DOI: 10.1007/s11554-024-01501-0

Shuqiang Wang, Peiyang Wu, Qingqing Wu

Abstract: Safety helmets are vital protective gear for construction workers, effectively reducing head injuries and safeguarding lives. By identifying safety helmet usage, workers’ unsafe behaviors can be detected and corrected in a timely manner, reducing the possibility of accidents. Target detection methods based on computer vision can achieve fast and accurate detection of whether workers are wearing safety helmets. In this study, we propose a real-time construction-site helmet detection algorithm that improves YOLOv7-tiny to address the problems associated with automatically identifying construction-site helmets. First, the Efficient Multi-scale Attention (EMA) module is introduced at the trunk to capture detailed information, so that training focuses the model on recognizing helmet-related target features. Second, the detection head is replaced with a self-attentive Dynamic Head (DyHead) for stronger feature representation. Finally, Wise-IoU (WIoU) with a dynamic nonmonotonic focusing mechanism is used as the loss function to improve the model’s ability to handle mutual occlusion between workers and enhance the detection performance. The experimental results show that the improved YOLOv7-tiny model improves mAP@0.5, precision, and recall by 3.3%, 1.5%, and 5.6%, respectively, while remaining lightweight; this enables more accurate detection at a suitable detection speed and better matches the needs of automated on-site detection.

Citations: 0

A novel pipelined architecture of entropy filter

IF 3, Zone 4 (Computer Science)

Journal of Real-Time Image Processing Pub Date : 2024-06-23 DOI: 10.1007/s11554-024-01498-6

Dat Ngo, Bongsoon Kang

Citations: 0

Rtsds: a real-time and efficient method for detecting surface defects in strip steel

IF 3, Zone 4 (Computer Science)

Journal of Real-Time Image Processing Pub Date : 2024-06-19 DOI: 10.1007/s11554-024-01497-7

Qingtian Zeng, Daibai Wei, Minghao Zou

Abstract: To address the issues of varying defect sizes, inconsistent data quality, and real-time detection challenges in steel defect detection, we propose a real-time efficient steel defect detection network (RTSD). This model employs a multi-scale feature extraction module (MSC3) and a mid-sized object detector (MidObj) to comprehensively capture texture features of defects across different scales. We incorporate a coordinate attention module (CA) and replace the spatial pyramid pooling structure (SPPF) to enhance defect localization capabilities. Additionally, we introduce the Wise-IoU (WIoU) loss function to balance attention to defects of varying quality. To address the real-time detection issue, we use Taylor channel pruning to reduce model complexity and employ channel-wise knowledge distillation instead of fine-tuning to mitigate the negative impacts of pruning. Experimental results show that on the NEU-DET data set, the average precision of RTSD reaches 83.5%. The model has 5.9M parameters, a computational cost of 7.9 GFLOPs, and a size of 11.9M, with an inference speed of up to 247.6 FPS. This demonstrates that our method can enhance performance while significantly reducing model complexity and computational overhead, offering a highly practical solution for industrial applications.
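The abstract names Taylor channel pruning without giving the scoring formula. A minimal sketch, assuming the first-order Taylor criterion from the general pruning literature (score each channel by the absolute mean of activation times gradient, then drop the lowest-scoring channels), is shown below. It is not the authors' implementation; the function names and toy values are hypothetical.

```python
def taylor_channel_importance(activations, gradients):
    """First-order Taylor score per channel for one sample.

    activations, gradients: lists of channels, each a flat list of values
    (the spatial positions of that channel's feature map).
    """
    scores = []
    for act, grad in zip(activations, gradients):
        mean_product = sum(a * g for a, g in zip(act, grad)) / len(act)
        scores.append(abs(mean_product))
    return scores


def channels_to_prune(scores, count):
    """Indices of the `count` lowest-scoring channels, in ascending order."""
    order = sorted(range(len(scores)), key=lambda c: scores[c])
    return sorted(order[:count])


# Hypothetical feature maps: three channels, two spatial positions each.
acts = [[1.0, 2.0], [0.5, 0.5], [3.0, 1.0]]
grads = [[0.1, -0.1], [0.0, 0.4], [1.0, 1.0]]
scores = taylor_channel_importance(acts, grads)
pruned = channels_to_prune(scores, 1)
```

In practice the scores would be accumulated over many batches before pruning, and the abstract states that the pruned network is then recovered with channel-wise knowledge distillation rather than plain fine-tuning.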

Citations: 0

A generic deep learning architecture optimization method for edge device based on start-up latency reduction

IF 3, Zone 4 (Computer Science)

Journal of Real-Time Image Processing Pub Date : 2024-06-18 DOI: 10.1007/s11554-024-01496-8

Qi Li, Hengyi Li, Lin Meng

Abstract: In the promising Artificial Intelligence of Things technology, deep learning algorithms are implemented on edge devices to process data locally. However, high-performance deep learning algorithms are accompanied by increased computation and parameter storage costs, leading to difficulties in implementing huge deep learning algorithms on memory- and power-constrained edge devices, such as smartphones and drones. Thus various compression methods have been proposed, such as channel pruning. According to the analysis of low-level operations on edge devices, existing channel pruning methods have limited effect on latency optimization. Due to data processing operations, the pruned residual blocks still result in significant latency, which hinders real-time processing of CNNs on edge devices. Hence, we propose a generic deep learning architecture optimization method to achieve further acceleration on edge devices. The network is optimized in two stages, Global Constraint and Start-up Latency Reduction, and pruning of both channels and residual blocks is achieved. Optimized networks are evaluated on desktop CPU, FPGA, ARM CPU, and PULP platforms. The experimental results show that the latency is reduced by up to 70.40%, which is 13.63% higher than applying channel pruning alone, achieving real-time processing on the edge device.

Citations: 0

Deep learning based insulator fault detection algorithm for power transmission lines

IF 3, Zone 4 (Computer Science)

Journal of Real-Time Image Processing Pub Date : 2024-06-18 DOI: 10.1007/s11554-024-01495-9

Han Wang, Qing Yang, Binlin Zhang, Dexin Gao

Abstract: To address the low accuracy of small-target insulator fault detection caused by the complex backgrounds of transmission lines, a deep learning-based insulator fault detection algorithm for transmission lines is proposed. First, aerial images of insulators are collected using UAVs in different scenarios to establish insulator fault datasets. Then, to improve the detection efficiency of the target detection algorithm, several improvements are made on the basis of the YOLOv9 algorithm. The improved algorithm enhances feature extraction for insulator faults at a small computational cost by adding the GAM attention mechanism. To improve the detection of small insulator-fault targets, the generalized efficient layer aggregation network (GELAN) module is improved and a new SC-GELAN module is proposed. The original loss function is replaced by the effective intersection-over-union (EIOU) loss function to minimize the difference between the aspect ratio of the predicted box and the ground-truth box, thereby accelerating the convergence of the model. Finally, the proposed algorithm is trained and tested alongside other target detection algorithms on the established insulator fault dataset. The experimental results and analysis show that the proposed algorithm maintains detection speed while achieving higher detection accuracy, making it more suitable for UAV-based fault detection of insulators on transmission lines.
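The abstract does not write out the EIOU loss. As a sketch, assuming the commonly published form (one minus IoU, plus a center-distance penalty, plus separate width and height penalties, each normalized by the smallest enclosing box), a single-box version could look like this. The function name and example boxes are hypothetical, and both boxes are assumed to have positive area.

```python
def eiou_loss(pred, gt):
    """EIoU-style loss for two boxes given as (x1, y1, x2, y2)."""
    pw, ph = pred[2] - pred[0], pred[3] - pred[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]

    # IoU term.
    inter_w = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    inter_h = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = inter_w * inter_h
    union = pw * ph + gw * gh - inter
    iou = inter / union

    # Width and height of the smallest box enclosing both boxes.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])

    # Squared distance between box centers, normalized by the enclosing diagonal.
    dx = (pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch)

    # Direct penalties on the width and height differences.
    width_term = (pw - gw) ** 2 / (cw * cw)
    height_term = (ph - gh) ** 2 / (ch * ch)

    return 1.0 - iou + center_term + width_term + height_term


# Hypothetical boxes: same width, prediction half as tall as the ground truth.
loss = eiou_loss((0, 0, 2, 2), (0, 0, 2, 4))
```

For these boxes the IoU term contributes 0.5, the center term 0.05, and the height term 0.25, while the width term is zero because the widths already agree. Penalizing width and height separately is what lets the loss keep pulling on the mismatched side only.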

Citations: 0

ARF-YOLOv8: a novel real-time object detection model for UAV-captured images detection

IF 3, Zone 4 (Computer Science)

Journal of Real-Time Image Processing Pub Date : 2024-06-04 DOI: 10.1007/s11554-024-01483-z

YaLin Zeng, DongJin Guo, WeiKai He, Tian Zhang, ZhongTao Liu

Abstract: There are several difficulties in the task of object detection for Unmanned Aerial Vehicle (UAV) photography images, including the small size of objects, densely distributed objects, and diverse perspectives from which the objects are captured. To tackle these challenges, we propose a real-time algorithm named adjusting overall receptive field enhancement YOLOv8 (ARF-YOLOv8) for object detection in UAV-captured images. Our approach begins with a comprehensive restructuring of the YOLOv8 network architecture. The primary objectives are to mitigate the loss of shallow-level information and establish an optimal model receptive field. Subsequently, we design a bibranch fusion attention module based on Coordinate Attention, which is seamlessly integrated into the detection network. This module combines features processed by the Coordinate Attention module with shallow-level features, facilitating the extraction of multi-level feature information. Furthermore, recognizing the influence of target size on bounding box loss, we refine the bounding box loss function CIoU Loss employed in YOLOv8. Extensive experimentation conducted on the VisDrone2019 dataset provides empirical evidence supporting the superior performance of ARF-YOLOv8. In comparison to YOLOv8, our method demonstrates a noteworthy 6.86% increase in mAP (0.5:0.95) while maintaining similar detection speeds. The code is available at https://github.com/sbzeng/ARF-YOLOv8-for-uav/tree/main.

Citations: 0

Fcd-cnn: FPGA-based CU depth decision for HEVC intra encoder using CNN

IF 3, Zone 4 (Computer Science)

Journal of Real-Time Image Processing Pub Date : 2024-06-02 DOI: 10.1007/s11554-024-01487-9

Hossein Dehnavi, Mohammad Dehnavi, Sajad Haghzad Klidbary

Abstract: Video compression for storage and transmission has always been a focal point for researchers in the field of image processing. Their efforts aim to reduce the data volume required for video representation while maintaining its quality. HEVC is one of the efficient standards for video compression, receiving special attention due to the increasing demand for high-resolution videos. The main step in video compression involves dividing the coding unit (CU) blocks into smaller blocks that have a uniform texture. In traditional methods, the Discrete Cosine Transform (DCT) is applied, followed by the use of RDO for decision-making on partitioning. This paper presents a novel convolutional neural network (CNN) and its hardware implementation as an alternative to DCT, aimed at speeding up partitioning and reducing the hardware resources required. The proposed hardware utilizes an efficient and lightweight CNN to partition CUs with low hardware resources in real-time applications. This CNN is trained for different Quantization Parameters (QPs) and block sizes to prevent overfitting. Furthermore, the system’s input size is fixed at 16×16, and other input sizes are scaled to this dimension. Loop unrolling, data reuse, and resource sharing are applied in hardware implementation to save resources. The hardware architecture is fixed for all block sizes and QPs, and only the coefficients of the CNN are changed. In terms of compression quality, the proposed hardware achieves a 4.42% BD-BR and −0.19 BD-PSNR compared to HM16.5. The proposed system can process a 64×64 CU at 150 MHz in 4914 clock cycles. The hardware resources utilized by the proposed system include 13,141 LUTs, 15,885 Flip-flops, 51 BRAMs, and 74 DSPs.
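The cycle count and clock frequency quoted above can be turned into a rough throughput estimate. The sketch below is back-of-the-envelope arithmetic and not a figure reported in the abstract: it assumes that 64×64 CUs are processed strictly one after another with no overlap, and it uses a 1920×1080 frame purely as an illustrative resolution. The function names are hypothetical.

```python
import math

CLOCK_HZ = 150e6       # 150 MHz, from the abstract
CYCLES_PER_CU = 4914   # clock cycles per 64x64 CU, from the abstract


def cu_latency_seconds(cycles=CYCLES_PER_CU, clock_hz=CLOCK_HZ):
    """Time to process one 64x64 CU."""
    return cycles / clock_hz


def frames_per_second(width, height, cu_size=64):
    """Upper-bound frame rate if CUs are handled sequentially (assumption)."""
    cus_per_frame = math.ceil(width / cu_size) * math.ceil(height / cu_size)
    return 1.0 / (cus_per_frame * cu_latency_seconds())


latency_us = cu_latency_seconds() * 1e6    # about 32.76 microseconds per CU
fps_1080p = frames_per_second(1920, 1080)  # 30 x 17 = 510 CUs per frame
```

Under these assumptions one CU takes about 32.76 µs, and a 1080p frame of 510 CUs takes roughly 16.7 ms, which is just under 60 frames per second for the depth-decision stage alone.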

Citations: 0