Generalizable deep learning for photoplethysmography-based blood pressure estimation - A benchmarking study
Mohammad Moulaeifard, Peter H Charlton, Nils Strodthoff
Machine Learning. Health, vol. 1, no. 1, 010501 (2025)
DOI: 10.1088/3049-477X/ae01a8
Published: 2025-12-01 (Epub 2025-09-15)
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12435175/pdf/
Citations: 0
Abstract
Photoplethysmography (PPG)-based blood pressure (BP) estimation represents a promising alternative to cuff-based BP measurements. Recently, an increasing number of deep learning (DL) models have been proposed to infer BP from the raw PPG waveform. However, these models have been predominantly evaluated on in-distribution (ID) test sets, which immediately raises the question of the generalizability of these models to external datasets. To investigate this question, we trained five DL models on the recently released PulseDB dataset, provided ID benchmarking results on this dataset, and then assessed their out-of-distribution (OOD) performance on several external datasets. The best model (XResNet1d101) achieved ID mean absolute errors (MAEs) of 9.0 and 5.8 mmHg for systolic and diastolic BP, respectively, on PulseDB with subject-specific calibration, and 13.9 and 8.5 mmHg, respectively, without calibration. The equivalent MAEs on external test datasets without calibration ranged from 10.0 to 18.6 mmHg (SBP) and 5.9 to 10.3 mmHg (DBP). Our results indicate that performance is strongly influenced by the differences in BP distributions between datasets. We investigated a simple way of improving performance through sample-based domain adaptation and put forward recommendations for training models with good generalization properties. With this work, we hope to educate more researchers about the importance and challenges of OOD generalization.
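The mean absolute errors (MAEs) quoted above are the standard per-sample metric for BP estimation. As a minimal sketch (the function name and toy values below are illustrative, not taken from the paper's code), MAE in mmHg between reference and estimated BP values can be computed as:

```python
def mae(reference, estimate):
    """Mean absolute error (mmHg) between paired reference and estimated BP values."""
    diffs = [abs(r - e) for r, e in zip(reference, estimate)]
    return sum(diffs) / len(diffs)

# Toy example: reference vs. estimated systolic BP in mmHg
sbp_ref = [120.0, 135.0, 110.0, 150.0]
sbp_est = [125.0, 130.0, 118.0, 141.0]
print(mae(sbp_ref, sbp_est))  # 6.75
```

In the study's setup, this metric would be computed separately for systolic and diastolic BP, once on the in-distribution PulseDB test set and once on each external (out-of-distribution) dataset.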