Multimodal Fusion Image Stabilization Algorithm for Bio-Inspired Flapping-Wing Aircraft.

Impact Factor: 3.9 · JCR Q1 (Engineering, Multidisciplinary) · CAS Region 3 (Medicine)
Zhikai Wang, Sen Wang, Yiwen Hu, Yangfan Zhou, Na Li, Xiaofeng Zhang
{"title":"Multimodal Fusion Image Stabilization Algorithm for Bio-Inspired Flapping-Wing Aircraft.","authors":"Zhikai Wang, Sen Wang, Yiwen Hu, Yangfan Zhou, Na Li, Xiaofeng Zhang","doi":"10.3390/biomimetics10070448","DOIUrl":null,"url":null,"abstract":"<p><p>This paper presents FWStab, a specialized video stabilization dataset tailored for flapping-wing platforms. The dataset encompasses five typical flight scenarios, featuring 48 video clips with intense dynamic jitter. The corresponding Inertial Measurement Unit (IMU) sensor data are synchronously collected, which jointly provide reliable support for multimodal modeling. Based on this, to address the issue of poor image acquisition quality due to severe vibrations in aerial vehicles, this paper proposes a multi-modal signal fusion video stabilization framework. This framework effectively integrates image features and inertial sensor features to predict smooth and stable camera poses. During the video stabilization process, the true camera motion originally estimated based on sensors is warped to the smooth trajectory predicted by the network, thereby optimizing the inter-frame stability. This approach maintains the global rigidity of scene motion, avoids visual artifacts caused by traditional dense optical flow-based spatiotemporal warping, and rectifies rolling shutter-induced distortions. Furthermore, the network is trained in an unsupervised manner by leveraging a joint loss function that integrates camera pose smoothness and optical flow residuals. When coupled with a multi-stage training strategy, this framework demonstrates remarkable stabilization adaptability across a wide range of scenarios. 
The entire framework employs Long Short-Term Memory (LSTM) to model the temporal characteristics of camera trajectories, enabling high-precision prediction of smooth trajectories.</p>","PeriodicalId":8907,"journal":{"name":"Biomimetics","volume":"10 7","pages":""},"PeriodicalIF":3.9000,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12292680/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biomimetics","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.3390/biomimetics10070448","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, MULTIDISCIPLINARY","Score":null,"Total":0}
引用次数: 0

Abstract

This paper presents FWStab, a video stabilization dataset tailored to flapping-wing platforms. The dataset covers five typical flight scenarios and comprises 48 video clips exhibiting intense dynamic jitter; the corresponding Inertial Measurement Unit (IMU) sensor data are collected synchronously, together providing reliable support for multimodal modeling. Building on this dataset, and to address the poor image quality caused by severe airframe vibration, the paper proposes a multimodal signal-fusion video stabilization framework that integrates image features with inertial sensor features to predict smooth, stable camera poses. During stabilization, the true camera motion estimated from the sensors is warped onto the smooth trajectory predicted by the network, improving inter-frame stability. This approach preserves the global rigidity of scene motion, avoids the visual artifacts introduced by traditional dense optical-flow spatiotemporal warping, and corrects rolling-shutter distortion. The network is trained in an unsupervised manner with a joint loss that combines camera-pose smoothness and optical-flow residuals; coupled with a multi-stage training strategy, the framework adapts well across a wide range of scenarios. Throughout, the framework employs Long Short-Term Memory (LSTM) networks to model the temporal characteristics of camera trajectories, enabling high-precision prediction of smooth trajectories.
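The core idea in the abstract — estimate the real camera trajectory from sensors, predict a smooth trajectory, then warp each frame from the real pose onto the smooth one — can be sketched numerically. The snippet below is a minimal illustration, not the paper's implementation: a moving-average low-pass filter stands in for the LSTM trajectory predictor, camera motion is modeled as pure rotation (Euler angles), and the intrinsic matrix `K` is a made-up example; the per-frame stabilizing warp is the homography H = K · R_smooth · R_realᵀ · K⁻¹.

```python
import numpy as np

def low_pass_trajectory(angles, window=9):
    """Stand-in for the paper's LSTM predictor: moving-average
    low-pass filter over per-frame camera Euler angles (N x 3)."""
    pad = window // 2
    padded = np.pad(angles, ((pad, pad), (0, 0)), mode="edge")
    kernel = np.ones(window) / window
    return np.stack(
        [np.convolve(padded[:, i], kernel, mode="valid") for i in range(3)],
        axis=1,
    )

def euler_to_matrix(rx, ry, rz):
    """Rotation matrix from Euler angles, Rz @ Ry @ Rx convention."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def stabilizing_homographies(real_angles, smooth_angles, K):
    """Per-frame warp H = K @ R_smooth @ R_real.T @ K^-1 mapping each
    frame from its real (jittery) pose onto the smooth trajectory."""
    K_inv = np.linalg.inv(K)
    Hs = []
    for r, s in zip(real_angles, smooth_angles):
        R_corr = euler_to_matrix(*s) @ euler_to_matrix(*r).T
        Hs.append(K @ R_corr @ K_inv)
    return np.stack(Hs)

# Synthetic jittery trajectory: slow pan plus high-frequency flapping jitter.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 120)
real = np.stack([0.2 * t, 0.05 * np.sin(40 * t), np.zeros_like(t)], axis=1)
real += 0.02 * rng.standard_normal(real.shape)

smooth = low_pass_trajectory(real, window=15)
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])  # example intrinsics
H = stabilizing_homographies(real, smooth, K)
```

Each `H[i]` would then be applied to frame `i` (e.g. with a perspective warp) to render it as seen from the smooth pose. In the paper the smooth trajectory comes from an LSTM trained with the joint smoothness/optical-flow-residual loss rather than from a fixed filter, but the warping step is the same in spirit.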

Source journal: Biomimetics (Biochemistry, Genetics and Molecular Biology – Biotechnology)
CiteScore: 3.50
Self-citation rate: 11.10%
Articles per year: 189
Review time: 11 weeks