Vedha Penmetcha, Lekaashree Rambabu, Brandon G Smith, Orla Mantle, Thomas Edmiston, Laura Hobbs, Shobhana Nagraj, Peter H Charlton, Tom Bashford
Evaluating Diversity in Open Photoplethysmography Datasets: Protocol for a Systematic Review.
JMIR Research Protocols, volume 14, e73040. Published 2025-10-01. DOI: 10.2196/73040 (https://doi.org/10.2196/73040). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12488168/pdf/
Abstract
Background: Photoplethysmography (PPG) is an optical method for measuring blood volume changes in microcirculation through noninvasive photodetection. It has become a widespread and essential clinical tool, used in pulse oximeters and wearable devices. However, technical aspects of PPG make it susceptible to intrinsic bias, with the potential to adversely affect particular patient and consumer populations. Developments in PPG technology, increasingly driven by openly accessible datasets as opposed to de novo experimentation, have the potential to help monitor an array of physiological variables. However, some populations may be underrepresented in PPG datasets. We describe a protocol for a systematic review to assess the biases within open access PPG datasets.
Objective: This review aims to evaluate the reporting patterns and structure of openly accessible PPG datasets. We will describe the biosignals measured and the demographic variables included in these datasets to clarify which PPG data parameters are being used to develop medical devices. This will allow us to identify current gaps and areas for improvement to reduce bias in medical device development.
Methods: This review will be reported in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. We will include primary studies that mention PPG and specifically reference openly accessible datasets since 2000. The datasets must contain PPG waveform data collected from humans, together with physiological parameters such as heart rate, blood pressure, or respiratory rate. Searches will be conducted in literature databases and data repositories, including MEDLINE (Ovid), IEEE Xplore, Scopus, and PhysioNet. Studies will be evaluated in accordance with the Standing Together Initiative recommendations, which call for health care technologies to be supported by representative data. Biosignal and demographic variables will be extracted from the PPG datasets, with steps taken to harmonize and store this information. Statistical analysis will include descriptive statistics and the chi-square test for comparisons. Additional statistical analyses will be performed after data extraction is completed and the level of heterogeneity is characterized.
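As an illustration of the chi-square comparison named in the Methods, the following minimal sketch computes the Pearson chi-square statistic for a contingency table of counts. It is not the authors' analysis code: the function name `chi_square_independence`, the grouping of datasets, and every count are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic for an r x c table of observed counts.

    Returns (statistic, degrees_of_freedom, expected_counts).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Hypothetical counts, invented for illustration only.
# Rows: two hypothetical groups of datasets (group A, group B).
# Columns: participant ethnicity reported (yes, no).
observed = [
    [4, 16],
    [6, 14],
]

statistic, dof, expected = chi_square_independence(observed)
print(f"chi-square = {statistic:.3f}, dof = {dof}")
```

The statistic is then compared against the chi-square distribution with `dof` degrees of freedom to obtain a p value. In practice a library routine such as `scipy.stats.chi2_contingency` would typically be used instead; note that it applies Yates' continuity correction to 2 x 2 tables by default, so its statistic can differ from the uncorrected value computed here.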
Results: We will analyze the diversity and structure of PPG datasets, including statistical analysis of their demographic and biosignal variables. Using statistical tests suited to comparisons of nominal variables, we will evaluate the frequencies of characteristics such as the devices used, biosignals collected, clinical parameters, demographic characteristics, and geographic information. This systematic review is expected to be completed by September 2025. Screening and review of the articles are currently underway.
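The frequency evaluation described in the Results could be tabulated along the following lines. This is a sketch under stated assumptions, not the review's extraction pipeline: the record layout, the dataset names, and the variables listed are all hypothetical placeholders.

```python
from collections import Counter

# Hypothetical extraction records; names and reported variables are invented.
datasets = [
    {"name": "dataset_a", "reported": {"age", "sex"}},
    {"name": "dataset_b", "reported": {"age", "sex", "ethnicity"}},
    {"name": "dataset_c", "reported": {"age"}},
]


def reporting_frequencies(records):
    """Map each variable to (number of datasets reporting it, proportion)."""
    counts = Counter(var for rec in records for var in rec["reported"])
    n = len(records)
    return {var: (k, k / n) for var, k in counts.items()}


freqs = reporting_frequencies(datasets)
for var, (count, proportion) in sorted(freqs.items()):
    print(f"{var}: {count}/{len(datasets)} datasets ({proportion:.0%})")
```

Counts of this kind are nominal data, which is why they can feed a contingency-table test such as the chi-square test.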
Conclusions: This review will provide insight into potential gaps in existing open access PPG datasets. It will inform future data collection and the design of openly available PPG datasets used to train medical devices, including wearables, helping to avoid perpetuating biases and supporting application in diverse clinical settings.