Ritika R Chaturvedi, Marco Angrisani, Wendy M Troxel, Monika Jain, Tania Gutsche, Eva Ortega, Adrien Boch, Citina Liang, Shiyang Sima, Aziz Mezlini, Eric J Daza, Miad Boodaghidizaji, Sze-Chuan Suen, Alok R Chaturvedi, Hossein Ghasemkhani, Arezoo M Ardekani, Arie Kapteyn
PNAS Nexus, volume 4, issue 10, pgaf295. Published 2025-10-07. DOI: https://doi.org/10.1093/pnasnexus/pgaf295. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12501968/pdf/
American Life in Realtime: Benchmark, publicly available person-generated health data for equity in precision health.
Person-generated health data (PGHD) from smartphones and wearables are invaluable for precision health, a field that promotes health equity through tailored disease prevention, detection, and intervention strategies. However, pervasive convenience sampling in extant PGHD research introduces selection biases that systematically underrepresent disadvantaged groups, limit model generalizability, and risk exacerbating health disparities. Benchmark PGHD (representative, validated, longitudinal, and frequently repeated) are urgently needed to support model equity. To address this fieldwide limitation, we established American Life in Realtime (ALiR), a longitudinal population health study involving PGHD collected from a probability-based, nationally representative cohort using study-provided Fitbits and (as needed) 4G tablets. As a result, ALiR's 1,038 participants are broadly representative across comprehensive sociodemographic, behavioral, and health-related US population norms, overcoming disparities in established convenience samples (e.g., NIH's All of Us; AoU). Only two sources of differential enrollment remained: older age (odds ratio [OR]: 1.27, 99% CI: 1.12-1.45) during consent and lower education (OR: 0.86, 99% CI: 0.79-0.94) during enrollment, though oversampling individuals without bachelor's degrees sufficiently counterbalanced the latter. An illustrative coronavirus disease 2019 classification model (chosen for global significance, known disparities in experience and outcomes, and methodological relevance) trained using ALiR performed equivalently when tested in sample (area under the curve [AUC] = 0.84, 95% CI: 0.79-0.89) and out of sample on AoU (AUC = 0.83, 95% CI: 0.78-0.89) overall, and in historically underserved subgroups (AUC = 0.82-1.0). Conversely, an identically trained classification model using AoU underperformed by 35% out of sample on ALiR (overall AUC = 0.68, 95% CI: 0.61-0.75 vs. AUC = 0.93, 95% CI: 0.91-0.96 in sample), with worse performance in older female and non-White subgroups (by 22-40%). Our results suggest that probability sampling and hardware provisioning enabled cohort inclusivity and generalizable model performance, supporting ALiR's benchmarking potential for equitable recruitment, PGHD collection, and precision health application.
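The abstract reports AUCs with 95% confidence intervals for in-sample and out-of-sample tests but does not say how those intervals were estimated. The following is a minimal sketch of one common approach, assuming a rank-based AUC and a percentile bootstrap; it is not necessarily the authors' method. The function names `auc` and `bootstrap_ci` are hypothetical, and the labels and scores are synthetic, not ALiR or AoU data.

```python
import random


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Synthetic example (assumption): 50 cases and 50 controls whose scores
# are drawn from normal distributions one standard deviation apart.
rng = random.Random(42)
labels = [1] * 50 + [0] * 50
scores = [rng.gauss(1.0, 1.0) for _ in range(50)] + [rng.gauss(0.0, 1.0) for _ in range(50)]

point = auc(labels, scores)
lo, hi = bootstrap_ci(labels, scores)
print(f"AUC = {point:.2f}, 95% CI: {lo:.2f}-{hi:.2f}")
```

Comparing a model across cohorts, as the abstract does, would mean calling `auc` once on held-out cases from the training cohort and once on the external cohort. Overlapping intervals, such as the 0.84 and 0.83 reported for the ALiR-trained model, are consistent with similar performance in both settings.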