Ning Sun, Ningbin Wang, Jixin Liu, Lei Chai, Haian Sun
Computers & Electrical Engineering, vol. 127, Article 110557. Published 2025-07-16. Publication type: Journal Article. Impact factor: 4.0; JCR: Q1 (Computer Science, Hardware & Architecture).
DOI: 10.1016/j.compeleceng.2025.110557
URL: https://www.sciencedirect.com/science/article/pii/S0045790625005002
Remote heart rate measurement based on spatial–temporal self-attention
Remote photoplethysmography (rPPG), a non-contact technique for measuring heart rate, has gained significant traction due to its convenience and non-invasive nature. However, rPPG signals are inherently weak and vary across facial regions, and accurate heart rate estimation requires analysis of long video sequences (exceeding 100 frames). To address these challenges, this paper proposes a novel deep neural network (DNN) based on a self-attention mechanism, named the spatial–temporal self-attention network (STSA-Net). This convolution-free DNN primarily adopts a transformer architecture. First, differential processing is applied to the input video to accentuate frame-to-frame variations and amplify subtle rPPG signals. The differential frames are then passed through a spatial self-attention encoding module, which models spatial dependencies within facial regions so that the network can focus on informative areas while suppressing irrelevant noise. Following spatial encoding, the features are processed by a temporal self-attention module, which captures long-term dependencies across frames. The proposed method is evaluated through ablation studies, intra-database evaluations, and cross-database comparisons on three benchmark databases: UBFC-RPPG, PURE, and MAHNOB-HCI. The model performs on par with state-of-the-art methods for remote heart rate measurement, achieving a mean absolute error (MAE) of 0.4 bpm on PURE, 0.48 bpm on UBFC, and 3.75 bpm on MAHNOB-HCI in intra-database experiments, and 1.36 bpm (UBFC → PURE) and 1.27 bpm (PURE → UBFC) in cross-database experiments. Additionally, an outlier in the PURE database is identified, and its cause and impact are analyzed.
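The pipeline described in the abstract (frame differencing, spatial self-attention within each frame, temporal self-attention across frames, MAE as the metric) can be illustrated with a minimal NumPy sketch. This is not the authors' STSA-Net implementation: the function names, the patch size, the feature width of 16, the mean pooling over patches, and the random projection weights are all assumptions made for illustration. Multi-head attention, positional embeddings, normalization, feed-forward layers, and the heart rate regression head are omitted.

```python
import numpy as np

def frame_differences(video):
    """Differences of consecutive frames; video has shape (T, H, W, C)."""
    video = np.asarray(video, dtype=np.float64)
    return video[1:] - video[:-1]

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over (..., N, D) tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def to_patches(frames, patch):
    """Split (T, H, W, C) frames into (T, N, patch*patch*C) flattened patches.

    Assumes H and W are divisible by the (hypothetical) patch size.
    """
    t, h, w, c = frames.shape
    x = frames.reshape(t, h // patch, patch, w // patch, patch, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)
    return x.reshape(t, (h // patch) * (w // patch), patch * patch * c)

def mean_absolute_error(pred_bpm, true_bpm):
    """MAE between predicted and reference heart rates, in bpm."""
    pred = np.asarray(pred_bpm, dtype=np.float64)
    true = np.asarray(true_bpm, dtype=np.float64)
    return float(np.mean(np.abs(pred - true)))

rng = np.random.default_rng(0)
video = rng.random((9, 8, 8, 3))        # toy clip: 9 frames of 8x8 RGB
diff = frame_differences(video)         # (8, 8, 8, 3) differential frames
tokens = to_patches(diff, patch=4)      # (8, 4, 48): 4 patches per frame

# Spatial step: attention among the patches of each frame (random weights).
d_in, d_model = tokens.shape[-1], 16
w_spatial = [rng.standard_normal((d_in, d_model)) * 0.1 for _ in range(3)]
spatial = self_attention(tokens, *w_spatial)      # (8, 4, 16)

# Temporal step: pool patches per frame, then attend across frames.
per_frame = spatial.mean(axis=1)                  # (8, 16)
w_temporal = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(3)]
temporal = self_attention(per_frame, *w_temporal) # (8, 16)
```

In this sketch the spatial step lets each patch weight the other patches of the same frame, and the temporal step lets each frame weight every other frame in the clip, which is the general mechanism by which self-attention can relate frames that are far apart. For the metric, `mean_absolute_error([70.0, 72.0], [71.0, 72.0])` evaluates to 0.5 bpm.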
About the journal:
The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency.
Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.