A Generally Efficient Targeted Minimum Loss Based Estimator based on the Highly Adaptive Lasso.
Author: Mark van der Laan
DOI: 10.1515/ijb-2015-0097
Journal: International Journal of Biostatistics, 13(2), published 2017-10-12 (Journal Article)
Impact Factor: 1.2; JCR: Q4 (Mathematical & Computational Biology)
Citations: 52
Abstract
A Generally Efficient Targeted Minimum Loss Based Estimator based on the Highly Adaptive Lasso.
Suppose we observe $n$ independent and identically distributed observations of a finite-dimensional bounded random variable. This article is concerned with the construction of an efficient targeted minimum loss-based estimator (TMLE) of a pathwise differentiable target parameter of the data distribution based on a realistic statistical model. The only smoothness condition we enforce on the statistical model is that the nuisance parameters of the data distribution needed to evaluate the canonical gradient of the pathwise derivative of the target parameter are multivariate real-valued cadlag functions (right-continuous with left-hand limits; G. Neuhaus. On weak convergence of stochastic processes with multidimensional time parameter. Ann Stat 1971;42:1285-1295) with a finite supremum norm and (sectional) variation norm. Each nuisance parameter is defined as a minimizer of the expectation of a loss function over all functions in its parameter space. For each nuisance parameter, we propose a new minimum loss-based estimator that minimizes the loss-specific empirical risk over the functions in its parameter space under the additional constraint that the variation norm of the function is bounded by a set constant. The constant is selected by cross-validation. We show that such an MLE can be represented as the minimizer of the empirical risk over linear combinations of indicator basis functions under the constraint that the sum of the absolute values of the coefficients is bounded by the constant: i.e., the variation norm corresponds to the $L_1$-norm of the vector of coefficients. We refer to this estimator as the highly adaptive lasso (HAL) estimator. We prove that for all models the HAL estimator converges to the true nuisance parameter value at a rate faster than $n^{-1/4}$ with respect to the square root of the loss-based dissimilarity.
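The HAL fit described above — empirical risk minimization over linear combinations of indicator basis functions with an $L_1$ bound on the coefficients — can be sketched in a few lines for univariate regression under squared-error loss. This is an illustrative sketch, not the paper's implementation: the knot placement (one knot per observation) and the use of scikit-learn's LassoCV, which tunes the penalty (the Lagrangian form of the variation-norm bound) by cross-validation, are simplifying choices.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 1, n)
# A step-function truth, the kind of cadlag target HAL handles well.
y = (x > 0.5).astype(float) + rng.normal(0, 0.1, n)

# Indicator basis with one knot per observed point: h_j(x) = 1{x >= x_j}.
knots = np.sort(x)
X = (x[:, None] >= knots[None, :]).astype(float)

# Cross-validated lasso: the L1 penalty on the coefficients plays the
# role of the bound on the (sectional) variation norm of the fit.
fit = LassoCV(cv=5).fit(X, y)
pred = fit.intercept_ + X @ fit.coef_
mse = np.mean((pred - y) ** 2)
variation_norm = np.sum(np.abs(fit.coef_))
```

In higher dimensions the basis grows to indicators of all sections (tensor products over subsets of coordinates), but the same cross-validated lasso machinery applies.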
We also show that if this HAL estimator is included in the library of an ensemble super-learner, then the super-learner will at a minimum achieve the rate of convergence of the HAL estimator, and, by previous results, it will in fact be asymptotically equivalent to the oracle (i.e., in some sense best) estimator in the library. Subsequently, we establish that a one-step TMLE using such a super-learner as the initial estimator for each of the nuisance parameters is asymptotically efficient at any data-generating distribution in the model, under weak structural conditions on the target parameter mapping and the model and a strong positivity assumption (e.g., the canonical gradient is uniformly bounded). We demonstrate our general theorem by constructing such a one-step TMLE of the average causal effect in a nonparametric model and establishing that it is asymptotically efficient.
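As a concrete illustration of the final claim, the one-step TMLE of the average causal effect for a binary outcome can be sketched as follows. Everything here is a simplified stand-in: the data are simulated, the true nuisance values replace the super-learner fits, and the fluctuation parameter is solved by a hand-rolled Newton iteration rather than a packaged logistic regression with offset.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def logit(p):
    return np.log(p / (1.0 - p))

rng = np.random.default_rng(1)
n = 5000
W = rng.normal(size=n)
g_true = expit(0.5 * W)                   # propensity P(A=1 | W)
A = rng.binomial(1, g_true)
Y = rng.binomial(1, expit(W + A))         # binary outcome

# Initial nuisance estimates; here the truth stands in for super-learner fits.
Q1, Q0 = expit(W + 1.0), expit(W)         # Qbar(1, W), Qbar(0, W)
QA = np.where(A == 1, Q1, Q0)
g = g_true

# Targeting step: logistic fluctuation along the clever covariate H,
# whose score spans the efficient influence curve component for Qbar.
H = A / g - (1 - A) / (1 - g)
eps = 0.0
for _ in range(25):                        # Newton-Raphson for the MLE of eps
    p = expit(logit(QA) + eps * H)
    score = np.sum(H * (Y - p))
    info = np.sum(H ** 2 * p * (1 - p))
    eps += score / info

# Updated outcome regressions and the resulting TMLE plug-in estimate.
Q1s = expit(logit(Q1) + eps / g)
Q0s = expit(logit(Q0) - eps / (1 - g))
ate = np.mean(Q1s - Q0s)
true_ate = np.mean(expit(W + 1.0) - expit(W))
```

Because the initial fits are the truth, the fitted eps is near zero and the plug-in estimate sits close to the target; with estimated nuisances the same targeting step removes the plug-in bias, which is what drives the asymptotic efficiency result.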
About the journal:
The International Journal of Biostatistics (IJB) seeks to publish new biostatistical models and methods, new statistical theory, and original applications of statistical methods to important practical problems arising from the biological, medical, public health, and agricultural sciences, with an emphasis on semiparametric methods. Given the many alternative publication venues within biostatistics, IJB offers a home for research focusing on modern methods, often based on machine learning and other data-adaptive methodologies, and provides a unique reading experience that compels the author to be explicit about the statistical inference problem addressed by the paper. The journal is intended to cover the entire range of biostatistics, from theoretical advances to relevant and sensible translations of a practical problem into a statistical framework. Electronic publication also allows data and software code to be appended, opening the door for reproducible research by allowing readers to easily replicate the analyses described in a paper. Both original research and review articles will be warmly received, as will articles applying sound statistical methods to practical problems.