Generalized multisensor wearable signal fusion for emotion recognition from noisy and incomplete data

Q2 Health Professions

Smart Health, Pub Date: 2025-03-24, DOI: 10.1016/j.smhl.2025.100571

Vamsi Kumar Naidu Pallapothula, Sidharth Anand, Sreyasee Das Bhattacharjee, Junsong Yuan

Smart Health, volume 36, Article 100571. https://www.sciencedirect.com/science/article/pii/S2352648325000327

Citations: 0

Abstract

Continual real-time monitoring of users’ health via noninvasive wearable devices (e.g., smartwatch, smartphone) demonstrates significant potential to enhance human well-being in everyday life. However, because the sensors differ in sampling rate, noise sensitivity, and data type, the inherent heterogeneity of the signals received from multiple sensors makes biosignal-based emotion recognition both complex and time-consuming. Optimally fusing multimode information (where each sensor produces a unique mode-specific input signal) to ensure reliable inference performance remains difficult, and the particular challenges in this problem setting are primarily threefold: (1) data availability is limited due to several unique person/device-specific properties and the high cost of labeling; (2) the signals acquired from wearable devices are often noisy, or may be lossy, due to users’ personal lifestyle choices or environmental interference; (3) due to intra-individual and inter-individual signal variability, achieving model generalizability is difficult. To this end, we propose a general-purpose multisensor fusion network, GM-FuseNet, that can seamlessly integrate and transform multi-sensor signal information for a variety of tasks. Unlike the majority of existing works, which rely on the fundamental assumption that full multi-mode query information is present during inference, GM-FuseNet’s first-level preface multimodal transformer module is explicitly designed to enhance both unimodal and multimodal performance when only partial modality information is available. We also utilize a multimodal temporal correlation loss that aligns the unimode signals pairwise in the temporal domain and encourages the model to learn the temporal correlation across multiple sensor-specific signals. Extensive evaluation on two public datasets, WESAD and CASE, shows that GM-FuseNet outperforms state-of-the-art supervised or self-supervised models by 1–4% while delivering consistently robust generalization throughout. Additionally, with a further 2–4% improvement in accuracy and F1-score, GM-FuseNet shows significant promise in handling a variety of test environments, including missing and noisy multisensor query signals.
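The abstract does not give the form of the multimodal temporal correlation loss, so the following is only a minimal illustrative sketch of the general idea of pairwise temporal alignment, not the authors' formulation. It assumes each modality has already been reduced to one scalar feature per time step and resampled to a common length, uses Pearson correlation as a stand-in similarity measure, and marks a missing sensor with `None`. The function names `pearson` and `temporal_correlation_loss` and the modality keys are hypothetical.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must share a common length (resample first)")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    # A constant sequence has no defined correlation; treat it as uncorrelated.
    return cov / denom if denom > 0 else 0.0


def temporal_correlation_loss(signals):
    """Average (1 - correlation) over all pairs of available modalities.

    `signals` maps a (hypothetical) modality name to a list of per-time-step
    features, or to None when that sensor is missing. Missing modalities are
    skipped, so the loss is still defined for partial inputs.
    """
    available = [s for s in signals.values() if s is not None]
    pairs = list(combinations(available, 2))
    if not pairs:
        # Fewer than two modalities present: nothing to align.
        return 0.0
    return sum(1.0 - pearson(a, b) for a, b in pairs) / len(pairs)


# Illustrative usage with made-up values; "temp" is a missing sensor.
example = {"eda": [1.0, 2.0, 3.0, 4.0], "bvp": [2.0, 4.0, 6.0, 8.0], "temp": None}
loss = temporal_correlation_loss(example)
```

Under these assumptions the loss is small when the available signals move together over time and grows as they diverge. A trainable version would compute the same quantity on learned embeddings inside an automatic-differentiation framework.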




Source journal

Smart Health Computer Science-Computer Science Applications

CiteScore

6.50

Self-citation rate

0.00%

Publication count