A machine learning-based approach for estimating and testing associations with multivariate outcomes.

IF 1.2 | Zone 4, Mathematics | JCR Q4, Mathematical & Computational Biology
David Benkeser, Andrew Mertens, John M Colford, Alan Hubbard, Benjamin F Arnold, Aryeh Stein, Mark J van der Laan
International Journal of Biostatistics, vol. 17, no. 1, pp. 7-21. Journal Article. Published 2020-08-13. DOI: 10.1515/ijb-2019-0061 (https://doi.org/10.1515/ijb-2019-0061)
Citations: 7

Abstract

We propose a method for summarizing the strength of association between a set of variables and a multivariate outcome. Classical summary measures are appropriate when relationships between covariates and outcomes are linear; our approach provides an alternative for situations where more complex relationships may be present. We use machine learning to detect nonlinear relationships and covariate interactions, and we propose a measure of association that captures these relationships. A hypothesis test about the proposed measure can be used to test the strong null hypothesis of no association between a set of variables and a multivariate outcome. Simulations demonstrate that this test has greater power than existing methods against alternatives in which covariates have nonlinear relationships with outcomes. We additionally propose measures of variable importance for groups of variables, which summarize each group's association with the outcome. We demonstrate our methodology using data from a birth cohort study on childhood health and nutrition in the Philippines.
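
The general idea of testing for association with a flexible learner can be illustrated with a small sketch. This is not the authors' estimator or test: the k-nearest-neighbor learner, the use of the largest cross-validated R^2 across outcomes as the summary, the permutation-based p-value, the single covariate, and all function names and simulated data are assumptions chosen only to keep the example short and dependency-free.

```python
import random

def knn_predict(train_x, train_y, x0, k):
    """Average the outcomes of the k training points nearest to x0."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x0))[:k]
    return sum(train_y[i] for i in nearest) / k

def cv_r2(x, y, k=8, folds=5):
    """Cross-validated R^2 of a k-NN regression of y on x."""
    n = len(x)
    sse = 0.0
    for f in range(folds):
        train = [i for i in range(n) if i % folds != f]
        test = [i for i in range(n) if i % folds == f]
        tx = [x[i] for i in train]
        ty = [y[i] for i in train]
        for i in test:
            sse += (y[i] - knn_predict(tx, ty, x[i], k)) ** 2
    mean_y = sum(y) / n
    sst = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - sse / sst

def association(x, outcomes):
    """Illustrative summary (an assumption): best cross-validated R^2 over outcomes."""
    return max(cv_r2(x, y) for y in outcomes)

def permutation_test(x, outcomes, n_perm=49, seed=0):
    """Shuffle the covariate to mimic the null of no association with any outcome."""
    rng = random.Random(seed)
    observed = association(x, outcomes)
    exceed = 0
    for _ in range(n_perm):
        shuffled = x[:]
        rng.shuffle(shuffled)
        if association(shuffled, outcomes) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)

# Simulated data (hypothetical): the first outcome depends on x only through
# x**2, a relationship a linear summary would largely miss; the second is noise.
rng = random.Random(1)
x = [rng.uniform(-2, 2) for _ in range(120)]
y1 = [xi ** 2 + rng.gauss(0, 0.5) for xi in x]
y2 = [rng.gauss(0, 1) for _ in x]

observed, p_value = permutation_test(x, [y1, y2])
print(f"association = {observed:.2f}, permutation p-value = {p_value:.3f}")
```

Shuffling the covariate while leaving the outcomes paired with each other breaks any covariate-outcome link but preserves the dependence among outcomes, which is why it serves here as a stand-in for the strong null of no association.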

Source journal
International Journal of Biostatistics (Mathematical & Computational Biology; Statistics & Probability)
CiteScore: 2.10
Self-citation rate: 8.30%
Articles published: 28
Review time: >12 weeks
About the journal: The International Journal of Biostatistics (IJB) seeks to publish new biostatistical models and methods, new statistical theory, and original applications of statistical methods to important practical problems arising in the biological, medical, public health, and agricultural sciences, with an emphasis on semiparametric methods. Given that many publication outlets exist within biostatistics, IJB offers a venue for research focusing on modern methods, often based on machine learning and other data-adaptive methodologies, and provides a unique reading experience that compels the author to be explicit about the statistical inference problem addressed by the paper. The journal is intended to cover the entire range of biostatistics, from theoretical advances to relevant and sensible translations of a practical problem into a statistical framework. Electronic publication also allows data and software code to be appended, opening the door to reproducible research in which readers can easily replicate the analyses described in a paper. Both original research and review articles will be warmly received, as will articles applying sound statistical methods to practical problems.