Syn-rPPG: Improving unsupervised remote photoplethysmography extraction with synthesized videos using generative models

Impact Factor: 8.0 · Zone 2 (Computer Science) · JCR Q1: AUTOMATION & CONTROL SYSTEMS

Engineering Applications of Artificial Intelligence · Pub Date: 2025-03-12 · DOI: 10.1016/j.engappai.2025.110504

Tianqi Liu, Hanguang Xiao, Yisha Sun, Kun Zuo, Qihang Zhang, Zhipeng Li, Feizhong Zhou

Volume 149 · Article 110504 · Journal Article · Full text: https://www.sciencedirect.com/science/article/pii/S0952197625005044

Citations: 0

Abstract

Remote photoplethysmography (rPPG) is a non-contact technology used to capture cardiac activity from the face, providing measurements of physiological parameters. Current unsupervised methods for rPPG tasks often focus on contrastive learning, which highlights relationships between samples but struggles with a lack of diverse training data, particularly in terms of varying skin colors and motion types. This limits model effectiveness in complex real-world scenarios. Generative models offer a potential solution by creating synthetic samples to enrich the training data. In this study, we explore the impact of using synthetic videos generated by style transfer and motion transfer techniques to enhance unsupervised rPPG tasks. We generate two types of synthetic videos: skin color synthetic videos and motion synthetic videos. These address the key challenges in rPPG, namely skin color variations and motion artifacts. Our analysis shows that these synthetic videos provide valuable physiological information, improving the performance and robustness of unsupervised models. Additionally, we propose a novel lightweight rPPG network, Style-Aware rPPG Fusion Net (SAFNet), based on an encoder–decoder structure, which is optimized for joint training with synthetic videos. By incorporating a feature fusion approach, SAFNet captures richer spatiotemporal information, resulting in superior performance and robustness. Extensive experiments on four public benchmark datasets demonstrate that our method achieves excellent results, particularly in challenging conditions, proving the effectiveness of using synthetic data to enhance remote physiological measurements.
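The abstract mentions physiological parameters derived from the rPPG signal without saying how they are computed. As a generic illustration only, and not the paper's method or SAFNet, the sketch below estimates heart rate from an already extracted pulse waveform by picking the strongest spectral peak in an assumed heart-rate band. The function name `estimate_heart_rate_bpm`, the 0.7 to 3.0 Hz band, and the synthetic test signal are all assumptions made for this example.

```python
import numpy as np


def estimate_heart_rate_bpm(signal, fps, low_hz=0.7, high_hz=3.0):
    """Return the dominant frequency of `signal` inside [low_hz, high_hz], in beats per minute.

    Hypothetical helper for illustration; the band limits are an assumed
    plausible heart-rate range, not values taken from the paper.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                        # remove the DC offset
    windowed = x * np.hanning(len(x))       # taper the ends to reduce spectral leakage
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    band = (freqs >= low_hz) & (freqs <= high_hz)
    if not band.any():
        raise ValueError("signal is too short to resolve the heart-rate band")
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz


# Synthetic stand-in for an rPPG waveform: a 1.2 Hz pulse (72 beats per minute)
# sampled at 30 frames per second for 10 seconds, plus Gaussian noise.
rng = np.random.default_rng(0)
fps = 30
t = np.arange(300) / fps
pulse = np.sin(2 * np.pi * 1.2 * t) + 0.3 * rng.standard_normal(t.size)
bpm = estimate_heart_rate_bpm(pulse, fps)
print(f"estimated heart rate: {bpm:.1f} bpm")
```

With a 10-second window the frequency resolution is 0.1 Hz, or 6 beats per minute, so a longer window or spectral interpolation would be needed for a finer estimate.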


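Source journal: Engineering Applications of Artificial Intelligence (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 9.60

Self-citation rate: 10.00%

Articles published: 505

Review time: 68 days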

About the journal: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.