Remote heart rate measurement based on video color magnification and spatiotemporal self-attention

IF 4.9, CAS Zone 2 (Medicine), JCR Q1 (ENGINEERING, BIOMEDICAL)

Biomedical Signal Processing and Control, Pub Date: 2025-02-21, DOI: 10.1016/j.bspc.2025.107677

Ning Sun, Peixian He, Jixin Liu, Lei Chai, Cong Wu, Xiujuan Liu

Volume 106, Article 107677. Full text: https://www.sciencedirect.com/science/article/pii/S1746809425001880

Citations: 0

Abstract


Remote heart rate measurement based on video color magnification and spatiotemporal self-attention

Remote photoplethysmography (rPPG) for heart rate measurement has garnered significant attention due to its non-contact advantages. The challenge in video-based remote heart rate measurement lies in accurately capturing subtle changes in facial color. We propose an end-to-end deep learning model named Video Color Magnification and Spatiotemporal Feature Extraction Network (VS-Net). VS-Net comprises three main modules: video color magnification, spatiotemporal self-attention feature extraction, and contrastive learning. The video color magnification module, implemented using a deep neural network, initially magnifies subtle facial color changes in the input video. The magnified color features are then fed into the spatiotemporal self-attention feature extraction module. This module utilizes a multi-head self-attention mechanism along with convolutional neural networks to locally and globally model information exchange across magnified video frames, capturing long-term dependencies and extracting spatiotemporal features. Additionally, the model incorporates a contrastive learning module designed to improve weak signal detection in facial videos. By generating positive and negative samples based on video frequency resampling, the model captures similarities and differences among input samples, thereby learning more robust semantic feature representations. Comprehensive experiments were conducted on three public datasets: UBFC-RPPG, PURE, and MAHNOB-HCI. The results demonstrate that VS-Net effectively extracts rPPG signals from facial videos and outperforms state-of-the-art methods in heart rate measurement.
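The abstract does not say how VS-Net resamples video or how heart rate is read from the recovered rPPG signal, so the following is only a minimal illustrative sketch, not the authors' implementation. It assumes a 1-D synthetic pulse trace in place of a facial video, a plausible heart-rate band of 0.7 to 3.0 Hz, and a spectral-peak estimator; the function names `estimate_hr_bpm` and `resample_in_time` are hypothetical. It shows why temporal resampling can yield contrastive samples: played back at the original frame rate, a clip resampled by a factor carries a pulse frequency scaled by that factor, so it can serve as a negative for the original clip.

```python
import numpy as np


def estimate_hr_bpm(signal, fs, lo_hz=0.7, hi_hz=3.0):
    """Return the dominant frequency inside [lo_hz, hi_hz] in beats per minute.

    The band limits are an assumption (roughly 42 to 180 bpm).
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                      # remove the DC component
    n_fft = 8 * len(x)                    # zero-pad for a finer frequency grid
    windowed = x * np.hanning(len(x))     # Hann window reduces spectral leakage
    power = np.abs(np.fft.rfft(windowed, n=n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    return 60.0 * freqs[band][np.argmax(power[band])]


def resample_in_time(signal, factor):
    """Resample a 1-D trace by linear interpolation.

    factor > 1 speeds the trace up, factor < 1 slows it down. When the
    result is interpreted at the original frame rate, every frequency in
    the trace is scaled by `factor`.
    """
    x = np.asarray(signal, dtype=float)
    n_out = int(round(len(x) / factor))
    positions = np.arange(n_out) * factor
    positions = positions[positions <= len(x) - 1]  # stay inside the trace
    return np.interp(positions, np.arange(len(x)), x)


# Synthetic stand-in for an rPPG trace: 72 bpm (1.2 Hz) pulse plus noise.
fs = 30.0                                  # assumed camera frame rate in Hz
t = np.arange(0, 20, 1.0 / fs)
rng = np.random.default_rng(0)
pulse = np.sin(2 * np.pi * 1.2 * t) + 0.1 * rng.standard_normal(t.size)

anchor_hr = estimate_hr_bpm(pulse, fs)               # close to 72 bpm
negative = resample_in_time(pulse, factor=1.5)       # frequency-shifted copy
negative_hr = estimate_hr_bpm(negative, fs)          # close to 108 bpm

print(f"anchor: {anchor_hr:.1f} bpm, resampled negative: {negative_hr:.1f} bpm")
```

In a contrastive setup of this kind, two clips sharing the same pulse frequency would be treated as a positive pair, while the resampled clip would be pushed away in feature space; the actual pairing rule and loss used by VS-Net are not given in the abstract.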


Source journal

Biomedical Signal Processing and Control (Engineering and Technology, Engineering: Biomedical)

CiteScore: 9.80

Self-citation rate: 13.70%

Articles published: 822

Review time: 4 months

Journal description: Biomedical Signal Processing and Control aims to provide a cross-disciplinary international forum for the interchange of information on research in the measurement and analysis of signals and images in clinical medicine and the biological sciences. Emphasis is placed on contributions dealing with practical, applications-led research on the use of methods and devices in clinical diagnosis, patient monitoring and management. Biomedical Signal Processing and Control reflects the main areas in which these methods are being used and developed at the interface of engineering and clinical science. The scope of the journal is defined to include relevant review papers, technical notes, short communications and letters. Tutorial papers and special issues will also be published.