American Life in Realtime: Benchmark, publicly available person-generated health data for equity in precision health.

Impact factor (IF): 3.8; JCR quartile: Q2; category: MULTIDISCIPLINARY SCIENCES

PNAS nexus; Pub Date: 2025-10-07; eCollection Date: 2025-10-01; DOI: 10.1093/pnasnexus/pgaf295

Ritika R Chaturvedi, Marco Angrisani, Wendy M Troxel, Monika Jain, Tania Gutsche, Eva Ortega, Adrien Boch, Citina Liang, Shiyang Sima, Aziz Mezlini, Eric J Daza, Miad Boodaghidizaji, Sze-Chuan Suen, Alok R Chaturvedi, Hossein Ghasemkhani, Arezoo M Ardekani, Arie Kapteyn

PNAS nexus, volume 4, issue 10, article pgaf295. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12501968/pdf/

Citations: 0

Abstract


American Life in Realtime: Benchmark, publicly available person-generated health data for equity in precision health.

Person-generated health data (PGHD) from smartphones/wearables are invaluable for precision health, a field promoting health equity through tailored disease prevention, detection, and intervention strategies. However, pervasive convenience sampling in extant PGHD research introduces selection biases that systematically underrepresent disadvantaged groups, limit model generalizability, and risk exacerbating health disparities. Benchmark PGHD (representative, validated, longitudinal, and frequently repeated) are urgently needed to support model equity. To address this fieldwide limitation, we established American Life in Realtime (ALiR), a longitudinal population health study involving PGHD collected from a probability-based, nationally representative cohort using study-provided Fitbits and (as needed) 4G tablets. As a result, ALiR's 1,038 participants are broadly representative across comprehensive sociodemographic, behavioral, and health-related US population norms, overcoming disparities in established convenience samples (e.g., NIH's All of Us; AoU). Only two sources of differential enrollment remained: older age (odds ratio [OR]: 1.27, 99% CI: 1.12-1.45) during consent and lower education (OR: 0.86, 99% CI: 0.79-0.94) during enrollment, though oversampling individuals without bachelor's degrees sufficiently counterbalanced the latter. An illustrative coronavirus disease 2019 classification model (chosen for global significance, known disparities in experience and outcomes, and methodological relevance) trained using ALiR performed equivalently when tested in sample (area under the curve [AUC] = 0.84, 95% CI: 0.79-0.89) and out of sample on AoU (AUC = 0.83, 95% CI: 0.78-0.89) overall, and in historically underserved subgroups (AUC = 0.82-1.0). Conversely, an identically trained classification model using AoU underperformed by 35% out of sample on ALiR (overall AUC = 0.68, 95% CI: 0.61-0.75 vs. AUC = 0.93, 95% CI: 0.91-0.96 in sample), with worse performance in older female and non-White subgroups (by 22-40%). Our results suggest that probability sampling and hardware provisioning enabled cohort inclusivity and generalizable model performance, supporting ALiR's benchmarking potential for equitable recruitment, PGHD collection, and precision health application.
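
As a rough illustration of the in-sample versus out-of-sample comparison reported in the abstract, the sketch below computes an AUC and a percentile-bootstrap 95% CI on simulated classifier scores. The abstract does not say how its intervals were obtained, so the bootstrap is an assumption, and the names `auc`, `bootstrap_ci`, and `make_cohort` as well as the synthetic cohorts are hypothetical and are not the authors' code or data.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed CI method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class, so AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def make_cohort(n, separation, seed):
    """Hypothetical cohort: scores of positives are shifted upward by `separation`."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n)]
    scores = [rng.gauss(separation * y, 1.0) for y in labels]
    return labels, scores


if __name__ == "__main__":
    # Simulated stand-ins for "in sample" and "out of sample" evaluations.
    for name, separation in [("in sample", 1.5), ("out of sample", 0.7)]:
        y, s = make_cohort(200, separation, seed=42)
        lo, hi = bootstrap_ci(y, s)
        print(f"{name}: AUC = {auc(y, s):.2f}, 95% CI: {lo:.2f}-{hi:.2f}")
```

A smaller `separation` in the second simulated cohort stands in for a model whose scores transfer poorly to a differently sampled population; the numbers it prints are synthetic and are not comparable to those in the abstract.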


Source journal

PNAS nexus

CiteScore

1.80

Self-citation rate

0.00%

Number of publications