MSMD-YOLO: Multi-scale and multi-directional Mamba scanning infrared image object detection based on YOLO

IF 3.4 · Zone 3 (Physics and Astrophysics) · JCR Q2 (Instruments & Instrumentation)
Boheng Tian, Zhiming Lu, Chen Zhang, Haiyan Li, Pengfei Yu
Infrared Physics & Technology, vol. 150, Article 106011. Published 2025-07-22. DOI: 10.1016/j.infrared.2025.106011. Publisher page: https://www.sciencedirect.com/science/article/pii/S1350449525003044

Abstract

Infrared (IR) imaging operates independently of lighting and weather conditions and can penetrate clouds and soot, which gives it unique advantages for object detection. However, detecting objects of varying scales remains a significant challenge because of differences in object size, distance, resolution, and scene complexity. To address these challenges, we propose a multi-directional, multi-scale infrared object detection method with local feature enhancement, based on YOLOv7. The proposed model introduces a Mamba module with a selective mechanism and multi-scale feature branching to capture object details at different scales. The S-ELAN module integrates multi-directional scanning with a deep convolutional structure to enhance multi-scale feature extraction. In addition, the local feature enhancement module expands the receptive field with dilated convolution and refines the feature representation with the CBAM attention mechanism, which strengthens the model's semantic understanding of objects. Experimental results on a self-constructed multi-scale infrared object dataset show that the proposed model handles the complexities of detecting objects across various scales. Specifically, on the MSIR dataset the model reaches an mAP0.5 of 96.8%, which is 4.4% higher than the baseline model, YOLOv7. On the public FLIR dataset, it achieves an mAP0.5 of 86.6%, outperforming YOLOv7 by 4.0%. These results indicate significant performance improvements over prevalent object detection algorithms and highlight the model's effectiveness and strong generalization ability in infrared object detection. The code is available at https://github.com/ELF233/MS-MD-YOLO.
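
The abstract does not specify which scan directions the Mamba-based modules use. As a minimal sketch only, assuming the four-way scheme common in vision state-space models (row-major, column-major, and their reverses), multi-directional scanning can be pictured as flattening a 2D feature map into several 1D sequences, processing each with a sequence model, and merging the results back per position. The names `four_way_scan` and `merge_scans` are hypothetical, and the sequence model itself is omitted.

```python
from typing import List, Sequence


def four_way_scan(grid: Sequence[Sequence[float]]) -> List[List[float]]:
    """Flatten a 2D grid into four 1D sequences, one per assumed scan direction."""
    rows = [list(r) for r in grid]
    height = len(rows)
    width = len(rows[0]) if rows else 0
    # Left-to-right, top-to-bottom.
    row_major = [v for r in rows for v in r]
    # Top-to-bottom, left-to-right.
    col_major = [rows[i][j] for j in range(width) for i in range(height)]
    return [row_major, row_major[::-1], col_major, col_major[::-1]]


def merge_scans(scans: List[List[float]], height: int, width: int) -> List[List[float]]:
    """Map each scanned sequence back to grid positions and average the four values."""
    fwd_row, bwd_row, fwd_col, bwd_col = scans
    n = height * width
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            r = i * width + j   # position of (i, j) in row-major order
            c = j * height + i  # position of (i, j) in column-major order
            total = fwd_row[r] + bwd_row[n - 1 - r] + fwd_col[c] + bwd_col[n - 1 - c]
            out[i][j] = total / 4
    return out


if __name__ == "__main__":
    feature_map = [[1, 2, 3], [4, 5, 6]]
    scans = four_way_scan(feature_map)
    for name, seq in zip(["row", "row-rev", "col", "col-rev"], scans):
        print(name, seq)
    # With no sequence model applied, merging simply restores the input.
    print(merge_scans(scans, 2, 3))
```

In an actual model, each of the four sequences would pass through a selective state-space layer before merging, so every position aggregates context arriving from several directions. The averaging used here is one illustrative merge choice, not necessarily the one used in MSMD-YOLO.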
Source journal metrics: CiteScore 5.70; self-citation rate 12.10%; articles published 400; review time 67 days.
About the journal: The journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. "Infrared" is defined as covering the near, mid and far infrared (terahertz) regions from 0.75 µm (750 nm) to 1 mm (300 GHz). Submissions in the 300 GHz to 100 GHz region may be accepted at the editors' discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region. Its core topics can be summarized as the generation, propagation and detection of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine. Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.