Measure selection for functional linear model

IF 1.6; Zone 3 (Mathematics); JCR Q3 (Computer Science, Interdisciplinary Applications)

Computational Statistics & Data Analysis. Pub Date: 2025-09-08. DOI: 10.1016/j.csda.2025.108270

Su I Iao, Hans-Georg Müller

Volume 214, Article 108270. Full text: https://www.sciencedirect.com/science/article/pii/S016794732500146X

Citations: 0

Abstract

Advancements in modern science have led to an increased prevalence of functional data, which are usually viewed as elements of the space of square-integrable functions $L^{2}$. Core methods in functional data analysis, such as functional principal component analysis, are typically grounded in the Hilbert structure of $L^{2}$ and rely on inner products based on integrals with respect to the Lebesgue measure over a fixed domain. A more flexible framework is proposed, where the measure can be arbitrary, allowing natural extensions to unbounded domains and prompting the question of optimal measure choice. Specifically, a novel functional linear model is introduced that incorporates a data-adaptive choice of the measure that defines the space, alongside an enhanced functional principal component analysis. Selecting a good measure can improve the model's predictive performance, especially when the underlying processes are not well represented under the default Lebesgue measure. Simulations, as well as applications to COVID-19 data and the National Health and Nutrition Examination Survey data, show that the proposed approach consistently outperforms the conventional functional linear model.
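To make the role of the measure concrete, the following is a minimal sketch, not the authors' procedure, of how an inner product of the form ⟨f, g⟩_μ = ∫ f(t) g(t) dμ(t) changes a discretized functional principal component analysis and the principal-component regression built on it. Everything named here is an assumption for illustration: the measure is taken to have a strictly positive density on the grid, integrals are approximated with the trapezoidal rule, the helper names (`quad_weights`, `inner_product`, `weighted_fpca`, `fit_flm`) and the example density are hypothetical, and the data-adaptive selection of the measure described in the abstract is not reproduced.

```python
import numpy as np


def quad_weights(t, density=None):
    """Trapezoidal quadrature weights on grid t, times the density of mu.

    density=None corresponds to the Lebesgue measure. The density is
    assumed (for this sketch) to be strictly positive on the grid.
    """
    dt = np.diff(t)
    q = np.zeros(len(t), dtype=float)
    q[:-1] += dt / 2
    q[1:] += dt / 2
    if density is not None:
        q = q * density(t)
    return q


def inner_product(f, g, t, density=None):
    """Approximate <f, g>_mu = integral of f(t) g(t) dmu(t)."""
    return float(np.sum(quad_weights(t, density) * f * g))


def weighted_fpca(X, t, density=None, n_components=2):
    """FPCA of curves X (n x m, sampled on grid t) in L2(mu).

    Returns eigenvalues, eigenfunctions (rows, orthonormal in L2(mu))
    and the principal component scores of the centered curves.
    """
    q = quad_weights(t, density)
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / X.shape[0]            # sample covariance on the grid
    s = np.sqrt(q)
    M = s[:, None] * C * s[None, :]       # symmetrized covariance operator
    vals, vecs = np.linalg.eigh(M)
    order = np.argsort(vals)[::-1][:n_components]
    phi = (vecs[:, order] / s[:, None]).T  # back-transform to eigenfunctions
    scores = (Xc * q) @ phi.T              # <X_i - mean, phi_k>_mu
    return vals[order], phi, scores


def fit_flm(scores, y):
    """Least-squares fit of y on an intercept and the FPCA scores."""
    design = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Usage on simulated curves (all settings below are illustrative).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
basis = np.vstack([np.sin(np.pi * t), np.cos(np.pi * t), t])
X = rng.normal(size=(200, 3)) @ basis
y = X[:, 80] + 0.1 * rng.normal(size=200)


def example_density(u):
    """Hypothetical density that puts more mass near the right end."""
    return 0.1 + 2.0 * u


_, _, scores_lebesgue = weighted_fpca(X, t)
_, _, scores_mu = weighted_fpca(X, t, density=example_density)
coef_lebesgue = fit_flm(scores_lebesgue, y)
coef_mu = fit_flm(scores_mu, y)
```

Changing the density changes the eigenfunctions and therefore the scores that enter the regression, which is the mechanism through which the choice of measure can affect predictive performance. Comparing candidate densities by out-of-sample prediction error would be one plausible way to choose among them, though the abstract does not state which criterion the paper uses.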

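Source journal

Computational Statistics & Data Analysis (Mathematics; Computer Science, Interdisciplinary Applications)

CiteScore: 3.70

Self-citation rate: 5.60%

Articles published: 167

Review time: 60 days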

About the journal: Computational Statistics and Data Analysis (CSDA), an Official Publication of the network Computational and Methodological Statistics (CMStatistics) and of the International Association for Statistical Computing (IASC), is an international journal dedicated to the dissemination of methodological research and applications in the areas of computational statistics and data analysis. The journal consists of four refereed sections which are divided into the following subject areas: I) Computational Statistics - Manuscripts dealing with: 1) the explicit impact of computers on statistical methodology (e.g., Bayesian computing, bioinformatics, computer graphics, computer-intensive inferential methods, data exploration, data mining, expert systems, heuristics, knowledge-based systems, machine learning, neural networks, numerical and optimization methods, parallel computing, statistical databases, statistical systems), and 2) the development, evaluation and validation of statistical software and algorithms. Software and algorithms can be submitted with manuscripts and will be stored together with the online article. II) Statistical Methodology for Data Analysis - Manuscripts dealing with novel and original data analytical strategies and methodologies applied in biostatistics (design and analytic methods for clinical trials, epidemiological studies, statistical genetics, or genetic/environmental interactions), chemometrics, classification, data exploration, density estimation, design of experiments, environmetrics, education, image analysis, marketing, model-free data exploration, pattern recognition, psychometrics, statistical physics, image processing, robust procedures. [...] III) Special Applications - [...] IV) Annals of Statistical Data Science [...]