A self-supervised multimodal framework for 1D physiological data fusion in remote health monitoring
Manuel Lage Cañellas, Constantino Álvarez Casado, Le Nguyen, Miguel Bordallo López
Information Fusion, vol. 124, Article 103397 (2025). DOI: 10.1016/j.inffus.2025.103397
Citations: 0
Abstract
The growth of labeled data for remote healthcare analysis lags far behind the rapid expansion of raw data, creating a significant bottleneck. To address this, we propose a multimodal self-supervised learning (SSL) framework for 1D signals that leverages unlabeled physiological data. Our architecture fuses heart and respiration waveforms from three sensors (mmWave radar, RGB camera, and depth camera) while processing and augmenting each modality separately. It then uses contrastive learning to extract robust features from the data. This architecture enables effective downstream task training with reduced labeled data, even in scenarios where certain sensors or modalities are unavailable. We validate our approach on the OMuSense-23 multimodal biometric dataset and evaluate its performance on tasks such as breathing pattern recognition and physiological classification. Our results show that the models perform comparably to fully supervised methods when using large amounts of labeled data and outperform them when using only a small percentage. In particular, with 1% of the labels, the model achieves 64% accuracy in breathing pattern classification, compared to 24% with a fully supervised approach. This work highlights the scalability and adaptability of self-supervised learning for physiological monitoring, making it particularly valuable for healthcare and well-being applications with limited labels or sensor availability. The code is publicly available at: https://gitlab.com/manulainen/ssl-physiological.
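To make the contrastive pipeline concrete, below is a minimal PyTorch sketch of the general recipe the abstract describes: one encoder per 1D sensor stream, random augmentations producing two views of each signal, an InfoNCE-style contrastive loss per modality, and fusion of the learned embeddings by concatenation for downstream tasks. This is not the authors' released implementation (see the GitLab link above); the encoder architecture, augmentation parameters, and names (Encoder1D, augment, nt_xent) are illustrative assumptions.

```python
# Hypothetical sketch of multimodal contrastive SSL on 1D signals;
# architecture and hyperparameters are illustrative, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

def augment(x, jitter_std=0.01, scale_range=0.1):
    """Random jitter + magnitude scaling for a batch of 1D signals (B, 1, T)."""
    noise = torch.randn_like(x) * jitter_std
    scale = 1.0 + (torch.rand(x.size(0), 1, 1, device=x.device) - 0.5) * 2 * scale_range
    return (x + noise) * scale

class Encoder1D(nn.Module):
    """Small 1D CNN encoder producing an L2-normalized embedding."""
    def __init__(self, emb_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(64, emb_dim)

    def forward(self, x):
        z = self.net(x).squeeze(-1)       # (B, 64)
        return F.normalize(self.head(z), dim=-1)

def nt_xent(z1, z2, tau=0.1):
    """Simplified InfoNCE between two views: matching rows are positives."""
    logits = z1 @ z2.t() / tau            # (B, B) cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

# One modality-specific encoder per sensor stream.
encoders = nn.ModuleDict({m: Encoder1D() for m in ["radar", "rgb", "depth"]})
opt = torch.optim.Adam(encoders.parameters(), lr=1e-3)

# Dummy unlabeled 1D waveforms standing in for the three sensor streams.
batch = {m: torch.randn(16, 1, 512) for m in encoders}
loss = sum(nt_xent(encoders[m](augment(x)), encoders[m](augment(x)))
           for m, x in batch.items())
opt.zero_grad(); loss.backward(); opt.step()

# Downstream: fuse the (frozen or fine-tuned) embeddings by concatenation.
fused = torch.cat([encoders[m](x) for m, x in batch.items()], dim=-1)  # (16, 384)
```

Keeping a separate encoder and augmentation path per modality, as sketched here, is what lets a model of this kind degrade gracefully when a sensor is missing: any subset of the per-modality embeddings can still be concatenated and fed to the downstream classifier.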
About the journal
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.