Jeremy Rubin, Laura Mariani, Abigail Smith, Jarcy Zee
Glomerular Diseases, volume 3, issue 1, pages 47-55. Published 2023-01-01. DOI: 10.1159/000528847 (https://doi.org/10.1159/000528847). PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/df/0d/gdz-0003-0047.PMC10126734.pdf
Ridge Regression for Functional Form Identification of Continuous Predictors of Clinical Outcomes in Glomerular Disease.
Introduction: Penalized regression models can be used to identify and rank risk factors for poor quality of life or other outcomes. They often assume linear covariate associations, but the true associations may be nonlinear. There is no standard, automated method for determining optimal functional forms (shapes of relationships) between predictors and the outcome in high-dimensional data settings.
Methods: We propose a novel algorithm, ridge regression for functional form identification of continuous predictors (RIPR), that models each continuous covariate with linear, quadratic, quartile, and cubic spline basis components in a ridge regression model to capture potential nonlinear relationships between continuous predictors and outcomes. We used a simulation study to test the performance of RIPR compared to standard and spline ridge regression models. Then, we applied RIPR to identify top predictors of Patient-Reported Outcomes Measurement Information System (PROMIS) adult global mental and physical health scores using demographic and clinical characteristics among N = 107 glomerular disease patients enrolled in the Nephrotic Syndrome Study Network (NEPTUNE).
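The idea of giving one continuous covariate several candidate functional forms inside a single ridge model can be sketched as follows. This is only an illustration, not the authors' implementation. The abstract does not specify the spline construction, knot placement, reference quartile, standardization, or how the penalty is chosen, so the truncated-power spline basis, the quantile knots, the fixed penalty `lam`, the synthetic data, and the helper names `expand_covariate` and `ridge_fit` are all assumptions made for this example.

```python
import numpy as np

def expand_covariate(x, n_knots=3):
    """Build linear, quadratic, quartile, and cubic spline columns for one covariate.

    Assumed construction (not taken from the paper): quartile indicators with the
    lowest quartile as reference, and a truncated-power cubic spline with knots
    at evenly spaced interior quantiles.
    """
    x = np.asarray(x, dtype=float)
    # Quartile membership: 0 = lowest quartile, 3 = highest quartile.
    cuts = np.quantile(x, [0.25, 0.5, 0.75])
    membership = np.digitize(x, cuts)
    quartiles = np.stack([(membership == k).astype(float) for k in (1, 2, 3)], axis=1)
    # Cubic spline pieces: x^3 plus one truncated cubic term per knot.
    knots = np.quantile(x, np.linspace(0, 1, n_knots + 2)[1:-1])
    spline = np.stack([x ** 3] + [np.clip(x - k, 0, None) ** 3 for k in knots], axis=1)
    return np.column_stack([x, x ** 2, quartiles, spline])

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression on standardized columns with a centered outcome."""
    mu, sd = X.mean(axis=0), X.std(axis=0)
    sd[sd == 0] = 1.0  # guard against constant columns
    Z = (X - mu) / sd
    y_mean = y.mean()
    beta = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ (y - y_mean))
    return beta, mu, sd, y_mean

def ridge_predict(X, fit):
    beta, mu, sd, y_mean = fit
    return ((X - mu) / sd) @ beta + y_mean

# Synthetic data with a quadratic truth (hypothetical, for illustration only).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = x ** 2 + rng.normal(scale=0.5, size=200)

X_linear = x.reshape(-1, 1)        # standard ridge: linear term only
X_expanded = expand_covariate(x)   # all candidate functional forms

fit_linear = ridge_fit(X_linear, y)
fit_expanded = ridge_fit(X_expanded, y)

# In-sample error only; a real comparison would use held-out data or cross-validation.
mse_linear = np.mean((y - ridge_predict(X_linear, fit_linear)) ** 2)
mse_expanded = np.mean((y - ridge_predict(X_expanded, fit_expanded)) ** 2)
print("linear-only ridge MSE:", mse_linear)
print("expanded ridge MSE:", mse_expanded)

# Standardized coefficients by absolute size hint at which form carries the signal.
labels = ["linear", "quadratic", "Q2", "Q3", "Q4", "cubic", "knot1", "knot2", "knot3"]
for name, b in sorted(zip(labels, fit_expanded[0]), key=lambda t: -abs(t[1])):
    print(name, b)
```

Because every column is standardized before the penalty is applied, the coefficient magnitudes are on a comparable scale, which is one plausible way to rank candidate forms of a predictor. In practice the penalty would normally be tuned, for example by cross-validation, rather than fixed, and quartile cut points and knots would be estimated on training data only.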
Results: RIPR resulted in better predictive accuracy than the standard and spline ridge regression methods in 56-80% of simulation repetitions under a variety of data characteristics. When applied to PROMIS scores in NEPTUNE, RIPR resulted in the lowest error for predicting physical scores, and the second-lowest error for mental scores. Further, RIPR identified hemoglobin quartiles as an important predictor of physical health that was missed by the other models.
Conclusion: The RIPR algorithm can capture nonlinear functional forms of predictors that are missed by standard ridge regression models. The top predictors of PROMIS scores vary greatly across methods. RIPR should be considered alongside other machine learning models in the prediction of patient-reported outcomes and other continuous outcomes.