Precise Bayes Regression: Approaching Optimality, Using Multi-Dimensional Space Partitioning Trees
Author: Amin Vahedian
Journal: IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 10, pp. 6107-6119 (Journal Article)
Published: 2025-07-23
DOI: 10.1109/TKDE.2025.3592074
URL: https://ieeexplore.ieee.org/document/11091545/
Journal impact factor: 10.4 (JCR Q1, Computer Science, Artificial Intelligence)
The Conditional Expectation Function (CEF) is an optimal estimator in real space. Artificial Neural Networks (ANN), the current state-of-the-art method, lack interpretability. Estimating the CEF offers a path to both accuracy and interpretability. Previous attempts to estimate the CEF either rely on limiting assumptions, such as independence and distributional form, or perform an expensive nearest-neighbor search. We propose Dynamically Ordered Precise Bayes Regression (DO-PBR), a novel method to estimate the CEF in discrete space. We prove that DO-PBR approaches optimality as the number of samples increases. DO-PBR dynamically learns region-specific importance rankings for the predictors, allowing the importance of a predictor to vary across the space. DO-PBR is fully interpretable and makes no assumptions about independence or distributional form, while requiring minimal parameter setting. In addition, DO-PBR avoids the costly nearest-neighbor search by using a hierarchy of binary trees. Our experiments confirm our theoretical claims on approaching optimality and show that DO-PBR achieves substantially higher accuracy than ANN when given the same amount of time. On average, ANN takes 32 times longer to reach the same level of accuracy as DO-PBR.
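To make "estimating the CEF in discrete space" concrete, the sketch below shows the simplest form of the idea: split each predictor into equal-width bins, then estimate E[Y | X] in each cell by the mean of the targets that fall in that cell. This is a plain cell-averaging estimator, not DO-PBR itself. It has no dynamic ordering of predictors, no region-specific importance rankings, and no tree hierarchy, since the abstract does not specify those details. The function names (`to_cell`, `fit_cell_means`, `predict`), the fixed range `[lo, hi]`, and the `fallback` value for empty cells are all assumptions made for illustration.

```python
from collections import defaultdict


def to_cell(x, n_bins=10, lo=0.0, hi=1.0):
    """Map a real-valued predictor vector to a tuple of bin indices.

    Assumes every predictor lies in [lo, hi]; values outside are clamped
    to the first or last bin.
    """
    width = (hi - lo) / n_bins
    return tuple(min(n_bins - 1, max(0, int((v - lo) / width))) for v in x)


def fit_cell_means(xs, ys, n_bins=10, lo=0.0, hi=1.0):
    """Estimate E[Y | X in cell] as the sample mean of y within each cell."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y in zip(xs, ys):
        cell = to_cell(x, n_bins, lo, hi)
        sums[cell] += y
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


def predict(cell_means, x, fallback, n_bins=10, lo=0.0, hi=1.0):
    """Return the cell mean for x, or `fallback` if the cell was never observed."""
    return cell_means.get(to_cell(x, n_bins, lo, hi), fallback)


# Three one-dimensional samples: two land in the first bin, one in the last.
cell_means = fit_cell_means([(0.05,), (0.07,), (0.95,)], [1.0, 3.0, 10.0])
print(cell_means)                                  # {(0,): 2.0, (9,): 10.0}
print(predict(cell_means, (0.01,), fallback=0.0))  # 2.0
print(predict(cell_means, (0.5,), fallback=0.0))   # 0.0 (empty cell)
```

A dictionary lookup replaces any nearest-neighbor search at prediction time, which is the kind of cost the abstract says DO-PBR avoids. The price of this naive version is that the number of cells grows exponentially with the number of predictors, so most cells stay empty unless the partitioning adapts to the data.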
About the journal:
The IEEE Transactions on Knowledge and Data Engineering covers knowledge and data engineering within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.