A holistic multimodal approach for real-time anomaly detection and classification in large-scale photovoltaic plants

Impact factor: 9.6. JCR: Q1 (Computer Science, Artificial Intelligence)

Energy and AI. Pub Date: 2025-05-09. DOI: 10.1016/j.egyai.2025.100525

Zoubir Barraz, Imane Sebari, Hicham Oufettoul, Kenza Ait el kadi, Nassim Lamrini, Ibtihal Ait Abdelmoula

Energy and AI, volume 21, Article 100525. https://www.sciencedirect.com/science/article/pii/S2666546825000576

Citations: 0

Abstract


This paper presents a holistic multimodal approach for real-time anomaly detection and classification in large-scale photovoltaic plants. The approach encompasses segmentation, geolocation, and classification of individual photovoltaic modules. A fine-tuned YOLOv7 model was trained to segment individual modules in both modalities: RGB and IR images. The localization of individual solar panels relies on photogrammetric measurements to facilitate maintenance operations. The localization process also links the extracted images of the same panel through their geographical coordinates and preprocesses them as input to the multimodal model. The study also focuses on optimizing pre-trained models using Bayesian search to improve them and fine-tune them on our dataset. The dataset was collected from different systems and technologies within our research platform; it was curated into 1841 images and classified into five anomaly classes. Grad-CAM, an explainable AI tool, is used to compare multimodal and single-modality models. Finally, the ONNX format was used to optimize the model further for real-time deployment. The improved ConvNeXt-Tiny model performed well in both modalities, with 99% precision, recall, and F1-score for binary classification and 85% for multi-class classification. In terms of latency, the segmentation models have inference times of 14 ms and 12 ms for RGB and IR images, respectively, and 24 ms for detection and classification. The proposed holistic approach includes a built-in feedback loop to ensure the model's robustness against domain shifts in the production environment.
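The abstract does not say how images of the same panel are linked beyond the use of geographical coordinates. As an illustration only, the sketch below pairs RGB and IR panel crops by nearest geographic distance. The `PanelCrop` type, the `link_panels` function, the greedy one-to-one matching, and the 0.5 m threshold are all assumptions made for this example, not the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PanelCrop:
    """Hypothetical record: one segmented panel crop with its geolocated centre."""
    crop_id: str
    lat: float  # degrees
    lon: float  # degrees


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    earth_radius_m = 6_371_000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def link_panels(rgb, ir, max_dist_m=0.5):
    """Greedily pair RGB and IR crops, closest pairs first, one-to-one.

    max_dist_m is an assumed tolerance; crops farther apart stay unpaired.
    Returns a list of (rgb_crop_id, ir_crop_id) tuples.
    """
    candidates = sorted(
        (haversine_m(a.lat, a.lon, b.lat, b.lon), i, j)
        for i, a in enumerate(rgb)
        for j, b in enumerate(ir)
    )
    used_rgb, used_ir, pairs = set(), set(), []
    for dist, i, j in candidates:
        if dist > max_dist_m:
            break  # all remaining candidates are even farther apart
        if i in used_rgb or j in used_ir:
            continue  # each crop is linked at most once
        used_rgb.add(i)
        used_ir.add(j)
        pairs.append((rgb[i].crop_id, ir[j].crop_id))
    return pairs
```

Each linked pair could then be preprocessed together as one multimodal input, and crops left unpaired could fall back to a single-modality model. A real pipeline would likely use projected coordinates and a spatial index rather than the all-pairs comparison used here.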


Source journal: Energy and AI (Engineering: Engineering (miscellaneous))

CiteScore: 16.50

Self-citation rate: 0.00%

Review time: 56 days