A data-driven approach for estimating functions in a multivariate nonparametric regression model based on B-splines with an application to geoscience
Authors: Mary Edith Savino, Céline Lévy-Leduc
Journal: Applied Mathematical Modelling, vol. 138, Article 115783 (impact factor 4.4; JCR Q1, Engineering, Multidisciplinary)
Published: 2024-10-31
DOI: 10.1016/j.apm.2024.115783
URL: https://www.sciencedirect.com/science/article/pii/S0307904X24005365
Citations: 0
Abstract
In this paper, we outline a novel data-driven method for estimating functions in a multivariate nonparametric regression model, based on adaptive knot selection for B-splines. The underlying idea of our knot selection approach is to apply the generalized lasso, since the knots of a B-spline basis can be seen as the locations where the derivatives of the function to be estimated change. The method is then extended to functions of several variables by processing each dimension independently, which reduces the problem to a univariate setting. The regularization parameters are chosen by means of a criterion based on the Extended Bayesian Information Criterion (EBIC). The nonparametric estimator is obtained by a multivariate B-spline regression using the selected knots. We validate the procedure through numerical experiments that vary the number of observations, the noise level, and the sampling of the observations, in order to investigate its behavior under these conditions. We also apply the method to two distinct classical geochemical cases. In every example considered in this paper, our approach performs better than state-of-the-art methods. Our completely data-driven method is implemented in the glober R package, which is available on the Comprehensive R Archive Network (CRAN).
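The link between knots and changes in derivatives can be illustrated with a small sketch. It is not the authors' method and does not use the glober package: it assumes a noise-free, piecewise-linear toy function on a uniform grid, and it replaces the generalized lasso with a simple threshold on second-order differences. The names `second_differences`, `piecewise_linear`, `candidate_knots`, and the tolerance `tol` are hypothetical and chosen for illustration only.

```python
def second_differences(values):
    """Discrete second-order differences v[i-1] - 2*v[i] + v[i+1]."""
    return [values[i - 1] - 2.0 * values[i] + values[i + 1]
            for i in range(1, len(values) - 1)]


def piecewise_linear(x):
    """Toy target (assumption): continuous, with slope changes at 0.3 and 0.7."""
    if x < 0.3:
        return 2.0 * x
    if x < 0.7:
        return 0.6 - 1.0 * (x - 0.3)
    return 0.2 + 3.0 * (x - 0.7)


def candidate_knots(xs, ys, tol=1e-9):
    """Return the grid points where the second difference is non-negligible.

    On a uniform grid, a linear stretch has zero second difference, so the
    surviving points are exactly where the slope (first derivative) changes.
    """
    d2 = second_differences(ys)
    # d2[i] is centered on xs[i + 1]
    return [xs[i + 1] for i, d in enumerate(d2) if abs(d) > tol]


xs = [i / 10 for i in range(11)]          # uniform grid on [0, 1]
ys = [piecewise_linear(x) for x in xs]    # noise-free observations
knots = candidate_knots(xs, ys)
print(knots)  # [0.3, 0.7]
```

With noisy observations a plain threshold would no longer be reliable. That is where a sparsity-inducing penalty on such differences, as in the generalized lasso mentioned in the abstract, becomes useful, with the regularization level chosen by a criterion such as one based on EBIC.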
About the journal:
Applied Mathematical Modelling focuses on research related to the mathematical modelling of engineering and environmental processes, manufacturing, and industrial systems. A significant emerging area of research activity involves multiphysics processes, and contributions in this area are particularly encouraged.
This influential publication covers a wide spectrum of subjects including heat transfer, fluid mechanics, CFD, and transport phenomena; solid mechanics and mechanics of metals; electromagnets and MHD; reliability modelling and system optimization; finite volume, finite element, and boundary element procedures; modelling of inventory, industrial, manufacturing and logistics systems for viable decision making; civil engineering systems and structures; mineral and energy resources; relevant software engineering issues associated with CAD and CAE; and materials and metallurgical engineering.
Applied Mathematical Modelling is primarily interested in papers developing increased insights into real-world problems through novel mathematical modelling, novel applications or a combination of these. Papers employing existing numerical techniques must demonstrate sufficient novelty in the solution of practical problems. Papers on fuzzy logic in decision-making or purely financial mathematics are normally not considered. Research on fractional differential equations, bifurcation, and numerical methods needs to include practical examples. Population dynamics must solve realistic scenarios. Papers in the area of logistics and business modelling should demonstrate meaningful managerial insight. Submissions with no real-world application will not be considered.