Generalized kernel density estimation with limited data contamination

IF 2.6 · Zone 2 (Mathematics) · JCR Q1 (MATHEMATICS, APPLIED)
Jerome Krief
Journal: Journal of Computational and Applied Mathematics, volume 474, Article 116937
Publication date: 2025-07-25
Publication type: Journal Article
DOI: 10.1016/j.cam.2025.116937
URL: https://www.sciencedirect.com/science/article/pii/S0377042725004510
Citations: 0

Abstract

This paper treats the deconvolution model Y = X + (1 − S)U, where X has Lebesgue density f_X, U has a known distribution, and S has a known Bernoulli distribution with P[S = 1] ∈ (0, 1). The aim is to estimate f_X from observations of Y. Unlike the classic deconvolution model, where P[S = 1] = 0 (Fan 1991, Annals of Statistics), this estimation problem is well-posed, which substantially reduces its difficulty. Existing estimators require the characteristic function of U to be real-valued or else P[S = 1] > 1/2, but the implementation in that case requires selecting three tuning parameters, which is unappealing in applied work. I present an easily implementable nonparametric methodology that removes these restrictions on the distribution of (X, U, S). If U is noisy in a certain sense, or else if P[S = 1] > 1/2, then a target density belonging to the classic Hölder class can be identified as the solution of a well-posed Fredholm integral equation of the second kind. The proposed estimator has a Mean Integrated Square Error that converges at the optimal nonparametric rate attained without data contamination (i.e. when P[S = 1] = 1). Moreover, if the distribution of S is unknown, a feasible estimator is proposed under the assumption that either the first or the second moment of X is known. The Integrated Square Error of the feasible estimator converges in probability at the same rate. A Monte Carlo experiment reveals good finite-sample properties for the proposed estimators when the distribution of U is supersmooth or skewed.
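To make the model concrete, the following minimal sketch simulates Y = X + (1 − S)U and checks a standard identity numerically. It is not the paper's estimator. It assumes, beyond what the abstract states, that X, U and S are mutually independent, in which case the characteristic functions satisfy φ_Y(t) = φ_X(t)·[p + (1 − p)·φ_U(t)] with p = P[S = 1]. The choices X ~ N(0, 1), U ~ Exp(1) (a skewed error), p = 0.7, and the function names are all illustrative assumptions.

```python
import numpy as np


def simulate_contaminated(n, p, rng):
    """Draw n observations from Y = X + (1 - S) * U, with X, U, S independent (assumed)."""
    x = rng.standard_normal(n)       # X ~ N(0, 1): illustrative target distribution
    u = rng.exponential(1.0, n)      # U ~ Exp(1): illustrative skewed contamination
    s = rng.random(n) < p            # S ~ Bernoulli(p); S = 1 means "not contaminated"
    return x + np.where(s, 0.0, u)


def empirical_cf(y, t):
    """Empirical characteristic function of the sample y on the grid t."""
    return np.exp(1j * np.outer(t, y)).mean(axis=1)


def model_cf(t, p):
    """phi_Y(t) = phi_X(t) * (p + (1 - p) * phi_U(t)) for the illustrative X and U."""
    phi_x = np.exp(-0.5 * t**2)      # characteristic function of N(0, 1)
    phi_u = 1.0 / (1.0 - 1j * t)     # characteristic function of Exp(1)
    return phi_x * (p + (1.0 - p) * phi_u)


rng = np.random.default_rng(0)
p = 0.7
t = np.linspace(-3.0, 3.0, 25)
y = simulate_contaminated(100_000, p, rng)

# Gap between the sample and the model characteristic functions (sampling noise only).
max_gap = np.max(np.abs(empirical_cf(y, t) - model_cf(t, p)))

# The contamination factor stays away from zero: |p + (1-p) phi_U(t)| >= 2p - 1.
min_factor = np.min(np.abs(p + (1.0 - p) / (1.0 - 1j * t)))

print(f"max |empirical cf - model cf| = {max_gap:.4f}")
print(f"min |p + (1 - p) phi_U(t)|    = {min_factor:.4f}")
```

Since |φ_U(t)| ≤ 1, the factor p + (1 − p)·φ_U(t) has modulus at least 2p − 1, which is strictly positive when P[S = 1] > 1/2. Dividing by it therefore does not amplify noise without bound, which gives some intuition for why this case is better behaved than classic deconvolution, where one divides by φ_U(t) alone.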
Source journal
CiteScore: 5.40
Self-citation rate: 4.20%
Articles published: 437
Review time: 3.0 months
Journal description: The Journal of Computational and Applied Mathematics publishes original papers of high scientific value in all areas of computational and applied mathematics. The main interest of the Journal is in papers that describe and analyze new computational techniques for solving scientific or engineering problems. Improved analysis of existing methods and algorithms, including their effectiveness and applicability, is also of importance. The computational efficiency (e.g. convergence, stability, accuracy) should be proved and illustrated by nontrivial numerical examples. Papers describing only variants of existing methods, without adding significant new computational properties, are not of interest. The audience consists of applied mathematicians, numerical analysts, computational scientists and engineers.