A self-supervised multimodal framework for 1D physiological data fusion in remote health monitoring
Manuel Lage Cañellas, Constantino Álvarez Casado, Le Nguyen, Miguel Bordallo López
Information Fusion, vol. 124, Article 103397 (2025). DOI: 10.1016/j.inffus.2025.103397
Citations: 0
Abstract
The growth of labeled data for remote healthcare analysis lags far behind the rapid expansion of raw data, creating a significant bottleneck. To address this, we propose a multimodal self-supervised learning (SSL) framework for 1D signals that leverages unlabeled physiological data. Our architecture fuses heart and respiration waveforms from three sensors (mmWave radar, RGB camera, and depth camera) while processing and augmenting each modality separately. It then uses contrastive learning to extract robust features from the data. This architecture enables effective downstream task training with reduced labeled data, even in scenarios where certain sensors or modalities are unavailable. We validate our approach on the OMuSense-23 multimodal biometric dataset and evaluate its performance on tasks such as breathing pattern recognition and physiological classification. Our results show that the models perform comparably to fully supervised methods when using large amounts of labeled data and outperform them when using only a small percentage. In particular, with 1% of the labels, the model achieves 64% accuracy in breathing pattern classification, compared to 24% with a fully supervised approach. This work highlights the scalability and adaptability of self-supervised learning for physiological monitoring, making it particularly valuable for healthcare and well-being applications with limited labels or sensor availability. The code is publicly available at: https://gitlab.com/manulainen/ssl-physiological.
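To make the contrastive pipeline concrete, below is a minimal PyTorch sketch of the general recipe the abstract describes: one encoder per 1D sensor stream, random augmentations producing two views of each signal, an InfoNCE-style contrastive loss per modality, and fusion of the learned embeddings by concatenation for downstream tasks. This is not the authors' released implementation (see the GitLab link above); the encoder architecture, augmentation parameters, and names (Encoder1D, augment, nt_xent) are illustrative assumptions.

```python
# Hypothetical sketch of multimodal contrastive SSL on 1D signals;
# architecture and hyperparameters are illustrative, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

def augment(x, jitter_std=0.01, scale_range=0.1):
    """Random jitter + magnitude scaling for a batch of 1D signals (B, 1, T)."""
    noise = torch.randn_like(x) * jitter_std
    scale = 1.0 + (torch.rand(x.size(0), 1, 1, device=x.device) - 0.5) * 2 * scale_range
    return (x + noise) * scale

class Encoder1D(nn.Module):
    """Small 1D CNN encoder producing an L2-normalized embedding."""
    def __init__(self, emb_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(64, emb_dim)

    def forward(self, x):
        z = self.net(x).squeeze(-1)       # (B, 64)
        return F.normalize(self.head(z), dim=-1)

def nt_xent(z1, z2, tau=0.1):
    """Simplified InfoNCE between two views: matching rows are positives."""
    logits = z1 @ z2.t() / tau            # (B, B) cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

# One modality-specific encoder per sensor stream.
encoders = nn.ModuleDict({m: Encoder1D() for m in ["radar", "rgb", "depth"]})
opt = torch.optim.Adam(encoders.parameters(), lr=1e-3)

# Dummy unlabeled 1D waveforms standing in for the three sensor streams.
batch = {m: torch.randn(16, 1, 512) for m in encoders}
loss = sum(nt_xent(encoders[m](augment(x)), encoders[m](augment(x)))
           for m, x in batch.items())
opt.zero_grad(); loss.backward(); opt.step()

# Downstream: fuse the (frozen or fine-tuned) embeddings by concatenation.
fused = torch.cat([encoders[m](x) for m, x in batch.items()], dim=-1)  # (16, 384)
```

Keeping a separate encoder and augmentation path per modality, as sketched here, is what lets a model of this kind degrade gracefully when a sensor is missing: any subset of the per-modality embeddings can still be concatenated and fed to the downstream classifier.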
About the journal
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.