Hypercubes to identify geomarkers of rapid cystic fibrosis lung disease progression
Yizi Cheng, Cole Brokamp, Erika Rasnick Manning, Elizabeth L Kramer, Patrick H Ryan, Rhonda D Szczesniak, Emrah Gecili
BMC Medical Informatics and Decision Making, volume 25(1), article 304. Published 2025-08-13. DOI: 10.1186/s12911-025-03097-2 (https://doi.org/10.1186/s12911-025-03097-2)
Abstract
Background: Prior research has shown that place-based environmental exposures and community characteristics, known as geomarkers, are associated with accelerated lung function decline and increased mortality in individuals with cystic fibrosis (CF). Although geomarkers have been linked to pulmonary outcomes in other respiratory diseases, it is unknown which have the greatest predictive power for rapid lung function decline in CF.
Methods: We adapted an existing statistical procedure that arranges candidate variables in a k-dimensional hypercube, which defines the sets of variables for a multi-stage selection process involving complex longitudinal data. We embedded the hypercube within a dynamic prediction model of rapid lung function decline to accommodate complexity in lung function trajectories. This practical approach simultaneously selects a handful of genuinely predictive markers among the candidates and accounts for complex correlations in longitudinal marker data. We applied the method to geomarker and lung-function outcome data from the existing Cystic Fibrosis Patient Registry and Cincinnati Cystic Fibrosis Center datasets.
Results: We applied a three-dimensional 4 × 4 × 4 hypercube to the national and local datasets and selected a subset of geomarkers using p-values from tests of the coefficients that link each geomarker to lung function decline in the dynamic prediction model. Based on the national data analyses, some road density-related geomarkers were selected, including some air pollution-related and greenspace-related variables. Simulations showed that the proposed method selects variables effectively and identifies true predictors robustly, particularly under weak correlation (ρ ≤ 0.6), although performance dipped under stronger correlation (ρ = 0.9).
Conclusions: The proposed method is a useful approach for selecting a small set of truly relevant demographic, clinical, and place-based predictors of rapid lung function decline while accounting for the complex correlations inherent in longitudinal lung-function data. Selection results differed according to the spatial resolution of the geomarkers. Our findings have the potential to improve care decisions for people with CF.
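The hypercube arrangement described in the Methods and Results can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the `p_values_for` callback, the `alpha` threshold, and the `min_hits` retention rule are all hypothetical. In the paper the p-values come from the dynamic prediction model, which is not reproduced here. The sketch places up to 64 candidate variables on a 4 × 4 × 4 grid, forms every axis-aligned line of four variables (so each variable is tested in three different groups), and keeps a variable when it is significant in at least `min_hits` of its groups.

```python
from itertools import product


def hypercube_groups(names, side=4, dims=3):
    """Place variables on a side**dims grid and return every axis-aligned line."""
    if len(names) > side ** dims:
        raise ValueError("too many variables for the hypercube")
    cells = list(product(range(side), repeat=dims))
    position = dict(zip(cells, names))  # cells beyond len(names) stay empty
    groups = []
    for axis in range(dims):
        # Fix every coordinate except `axis`, then sweep along `axis`.
        for fixed in product(range(side), repeat=dims - 1):
            line = []
            for i in range(side):
                cell = fixed[:axis] + (i,) + fixed[axis:]
                if cell in position:
                    line.append(position[cell])
            if line:
                groups.append(line)
    return groups


def select(names, p_values_for, alpha=0.05, min_hits=2, side=4, dims=3):
    """Keep variables significant in at least `min_hits` of their groups.

    `p_values_for(group)` is a hypothetical callback that fits one model to
    the variables in `group` and returns a dict mapping name -> p-value.
    """
    hits = {name: 0 for name in names}
    for group in hypercube_groups(names, side, dims):
        p_values = p_values_for(group)
        for name in group:
            if p_values[name] < alpha:
                hits[name] += 1
    return [name for name in names if hits[name] >= min_hits]
```

With 64 variables this yields 3 × 16 = 48 groups of four, so each fit involves only a few predictors at a time. Passing a callback such as `lambda group: {n: fit(n, group) for n in group}`, where `fit` wraps whatever model supplies the p-values, would complete the selection step.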
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.