非链接单调回归

F. Balabdaoui, Charles R. Doss, C. Durot
{"title":"非链接单调回归","authors":"F. Balabdaoui, Charles R. Doss, C. Durot","doi":"10.3929/ETHZ-B-000501663","DOIUrl":null,"url":null,"abstract":"We consider so-called univariate unlinked (sometimes \"decoupled,\" or \"shuffled\") regression when the unknown regression curve is monotone. In standard monotone regression, one observes a pair $(X,Y)$ where a response $Y$ is linked to a covariate $X$ through the model $Y= m_0(X) + \\epsilon$, with $m_0$ the (unknown) monotone regression function and $\\epsilon$ the unobserved error (assumed to be independent of $X$). In the unlinked regression setting one gets only to observe a vector of realizations from both the response $Y$ and from the covariate $X$ where now $Y \\stackrel{d}{=} m_0(X) + \\epsilon$. There is no (observed) pairing of $X$ and $Y$. Despite this, it is actually still possible to derive a consistent non-parametric estimator of $m_0$ under the assumption of monotonicity of $m_0$ and knowledge of the distribution of the noise $\\epsilon$. In this paper, we establish an upper bound on the rate of convergence of such an estimator under minimal assumption on the distribution of the covariate $X$. We discuss extensions to the case in which the distribution of the noise is unknown. We develop a gradient-descent-based algorithm for its computation, and we demonstrate its use on synthetic data. Finally, we apply our method (in a fully data driven way, without knowledge of the error distribution) on longitudinal data from the US Consumer Expenditure Survey.","PeriodicalId":14794,"journal":{"name":"J. Mach. Learn. Res.","volume":"28 6 1","pages":"172:1-172:60"},"PeriodicalIF":0.0000,"publicationDate":"2020-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Unlinked Monotone Regression\",\"authors\":\"F. Balabdaoui, Charles R. Doss, C. Durot\",\"doi\":\"10.3929/ETHZ-B-000501663\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We consider so-called univariate unlinked (sometimes \\\"decoupled,\\\" or \\\"shuffled\\\") regression when the unknown regression curve is monotone. In standard monotone regression, one observes a pair $(X,Y)$ where a response $Y$ is linked to a covariate $X$ through the model $Y= m_0(X) + \\\\epsilon$, with $m_0$ the (unknown) monotone regression function and $\\\\epsilon$ the unobserved error (assumed to be independent of $X$). In the unlinked regression setting one gets only to observe a vector of realizations from both the response $Y$ and from the covariate $X$ where now $Y \\\\stackrel{d}{=} m_0(X) + \\\\epsilon$. There is no (observed) pairing of $X$ and $Y$. Despite this, it is actually still possible to derive a consistent non-parametric estimator of $m_0$ under the assumption of monotonicity of $m_0$ and knowledge of the distribution of the noise $\\\\epsilon$. In this paper, we establish an upper bound on the rate of convergence of such an estimator under minimal assumption on the distribution of the covariate $X$. We discuss extensions to the case in which the distribution of the noise is unknown. We develop a gradient-descent-based algorithm for its computation, and we demonstrate its use on synthetic data. Finally, we apply our method (in a fully data driven way, without knowledge of the error distribution) on longitudinal data from the US Consumer Expenditure Survey.\",\"PeriodicalId\":14794,\"journal\":{\"name\":\"J. Mach. Learn. Res.\",\"volume\":\"28 6 1\",\"pages\":\"172:1-172:60\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-07-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"J. Mach. Learn. Res.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3929/ETHZ-B-000501663\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"J. Mach. Learn. Res.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3929/ETHZ-B-000501663","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9

摘要

当未知的回归曲线是单调的时,我们考虑所谓的单变量非链接(有时“解耦”或“洗牌”)回归。在标准单调回归中,人们观察到一对$(X,Y)$,其中响应$Y$通过模型$Y= m_0(X) + \epsilon$链接到协变量$X$,其中$m_0$是(未知的)单调回归函数,$\epsilon$是未观察到的误差(假设与$X$无关)。在非链接回归设置中,人们只能从响应$Y$和协变量$X$中观察到实现向量,其中现在$Y \stackrel{d}{=} m_0(X) + \epsilon$。没有(观察到的)$X$和$Y$的配对。尽管如此,在假设$m_0$单调性和知道噪声$\epsilon$分布的情况下,实际上仍然可以推导出$m_0$的一致非参数估计量。本文在协变量$X$分布的最小假设下,建立了这类估计量收敛速率的上界。我们讨论了噪声分布未知情况下的扩展。我们开发了一种基于梯度下降的算法来计算它,并演示了它在合成数据上的应用。最后,我们将我们的方法(以完全数据驱动的方式,不知道误差分布)应用于美国消费者支出调查的纵向数据。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Unlinked Monotone Regression
We consider so-called univariate unlinked (sometimes "decoupled," or "shuffled") regression when the unknown regression curve is monotone. In standard monotone regression, one observes a pair $(X,Y)$ where a response $Y$ is linked to a covariate $X$ through the model $Y= m_0(X) + \epsilon$, with $m_0$ the (unknown) monotone regression function and $\epsilon$ the unobserved error (assumed to be independent of $X$). In the unlinked regression setting one gets only to observe a vector of realizations from both the response $Y$ and from the covariate $X$ where now $Y \stackrel{d}{=} m_0(X) + \epsilon$. There is no (observed) pairing of $X$ and $Y$. Despite this, it is actually still possible to derive a consistent non-parametric estimator of $m_0$ under the assumption of monotonicity of $m_0$ and knowledge of the distribution of the noise $\epsilon$. In this paper, we establish an upper bound on the rate of convergence of such an estimator under minimal assumption on the distribution of the covariate $X$. We discuss extensions to the case in which the distribution of the noise is unknown. We develop a gradient-descent-based algorithm for its computation, and we demonstrate its use on synthetic data. Finally, we apply our method (in a fully data driven way, without knowledge of the error distribution) on longitudinal data from the US Consumer Expenditure Survey.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信