Machine Learning最新文献_第7页

Exploiting residual errors in nonlinear online prediction 利用非线性在线预测中的残余误差

IF 7.5 3区计算机科学

Machine Learning Pub Date : 2024-05-29 DOI: 10.1007/s10994-024-06554-7

Emirhan Ilhan, Ahmet B. Koc, Suleyman S. Kozat

{"title":"Exploiting residual errors in nonlinear online prediction","authors":"Emirhan Ilhan, Ahmet B. Koc, Suleyman S. Kozat","doi":"10.1007/s10994-024-06554-7","DOIUrl":"https://doi.org/10.1007/s10994-024-06554-7","url":null,"abstract":"We introduce a novel online (or sequential) nonlinear prediction approach that incorporates the residuals, i.e., prediction errors in the past observations, as additional features for the current data. Including the past error terms in an online prediction algorithm naturally improves prediction performance significantly since this information is essential for an algorithm to adjust itself based on its past errors. These terms are well exploited in many linear statistical models such as ARMA, SES, and Holts-Winters models. However, the past error terms are rarely or in a certain sense not optimally exploited in nonlinear prediction models since training them requires complex nonlinear state-space modeling. To this end, for the first time in the literature, we introduce a nonlinear prediction framework that utilizes not only the current features but also the past error terms as additional features, thereby exploiting the residual state information in the error terms, i.e., the model’s performance on the past samples. Since the new feature vectors contain error terms that change with every update, our algorithm jointly optimizes the model parameters and the feature vectors simultaneously. We achieve this by introducing new update equations that handle the effects resulting from the changes in the feature vectors in an online manner. We use soft decision trees and neural networks as the nonlinear prediction algorithms since these are the most widely used methods in highly publicized competitions. However, as we show, our methods are generic and any algorithm supporting gradient calculations can be straightforwardly used. We show through our experiments on the well-known real-life competition datasets that our method significantly outperforms the state-of-the-art. We also provide the implementation of our approach including the source code to facilitate reproducibility (https://github.com/ahmetberkerkoc/SDT-ARMA).","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"34 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141197672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Meta-learning for heterogeneous treatment effect estimation with closed-form solvers 利用闭式求解器进行异质治疗效果估计的元学习

IF 7.5 3区计算机科学

Machine Learning Pub Date : 2024-05-29 DOI: 10.1007/s10994-024-06546-7

Tomoharu Iwata, Yoichi Chikahara

引用次数: 0

Probabilistic grammars for modeling dynamical systems from coarse, noisy, and partial data 从粗略、嘈杂和部分数据为动力系统建模的概率语法

IF 7.5 3区计算机科学

Machine Learning Pub Date : 2024-05-29 DOI: 10.1007/s10994-024-06522-1

Nina Omejc, Boštjan Gec, Jure Brence, Ljupčo Todorovski, Sašo Džeroski

{"title":"Probabilistic grammars for modeling dynamical systems from coarse, noisy, and partial data","authors":"Nina Omejc, Boštjan Gec, Jure Brence, Ljupčo Todorovski, Sašo Džeroski","doi":"10.1007/s10994-024-06522-1","DOIUrl":"https://doi.org/10.1007/s10994-024-06522-1","url":null,"abstract":"Ordinary differential equations (ODEs) are a widely used formalism for the mathematical modeling of dynamical systems, a task omnipresent in scientific domains. The paper introduces a novel method for inferring ODEs from data, which extends ProGED, a method for equation discovery that allows users to formalize domain-specific knowledge as probabilistic context-free grammars and use it for constraining the space of candidate equations. The extended method can discover ODEs from partial observations of dynamical systems, where only a subset of state variables can be observed. To evaluate the performance of the newly proposed method, we perform a systematic empirical comparison with alternative state-of-the-art methods for equation discovery and system identification from complete and partial observations. The comparison uses Dynobench, a set of ten dynamical systems that extends the standard Strogatz benchmark. We compare the ability of the considered methods to reconstruct the known ODEs from synthetic data simulated at different temporal resolutions. We also consider data with different levels of noise, i.e., signal-to-noise ratios. The improved ProGED compares favourably to state-of-the-art methods for inferring ODEs from data regarding reconstruction abilities and robustness to data coarseness, noise, and completeness.","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"43 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141197670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Evaluating feature attribution methods in the image domain 评估图像领域的特征归属方法

IF 7.5 3区计算机科学

Machine Learning Pub Date : 2024-05-24 DOI: 10.1007/s10994-024-06550-x

Arne Gevaert, Axel-Jan Rousseau, Thijs Becker, Dirk Valkenborg, Tijl De Bie, Yvan Saeys

{"title":"Evaluating feature attribution methods in the image domain","authors":"Arne Gevaert, Axel-Jan Rousseau, Thijs Becker, Dirk Valkenborg, Tijl De Bie, Yvan Saeys","doi":"10.1007/s10994-024-06550-x","DOIUrl":"https://doi.org/10.1007/s10994-024-06550-x","url":null,"abstract":"Feature attribution maps are a popular approach to highlight the most important pixels in an image for a given prediction of a model. Despite a recent growth in popularity and available methods, the objective evaluation of such attribution maps remains an open problem. Building on previous work in this domain, we investigate existing quality metrics and propose new variants of metrics for the evaluation of attribution maps. We confirm a recent finding that different quality metrics seem to measure different underlying properties of attribution maps, and extend this finding to a larger selection of attribution methods, quality metrics, and datasets. We also find that metric results on one dataset do not necessarily generalize to other datasets, and methods with desirable theoretical properties do not necessarily outperform computationally cheaper alternatives in practice. Based on these findings, we propose a general benchmarking approach to help guide the selection of attribution methods for a given use case. Implementations of attribution metrics and our experiments are available online (https://github.com/arnegevaert/benchmark-general-imaging).<h3 data-test=\"abstract-sub-heading\">Graphical abstract</h3>\u0000","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"17 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141153809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Classification with costly features in hierarchical deep sets 在分层深度集合中使用代价高昂的特征进行分类

IF 7.5 3区计算机科学

Machine Learning Pub Date : 2024-05-22 DOI: 10.1007/s10994-024-06565-4

Jaromír Janisch, Tomáš Pevný, Viliam Lisý

引用次数: 0

CoMadOut—a robust outlier detection algorithm based on CoMAD CoMadOut - 基于 CoMAD 的鲁棒离群点检测算法

IF 7.5 3区计算机科学

Machine Learning Pub Date : 2024-05-07 DOI: 10.1007/s10994-024-06521-2

Andreas Lohrer, Daniyal Kazempour, Maximilian Hünemörder, Peer Kröger

{"title":"CoMadOut—a robust outlier detection algorithm based on CoMAD","authors":"Andreas Lohrer, Daniyal Kazempour, Maximilian Hünemörder, Peer Kröger","doi":"10.1007/s10994-024-06521-2","DOIUrl":"https://doi.org/10.1007/s10994-024-06521-2","url":null,"abstract":"Unsupervised learning methods are well established in the area of anomaly detection and achieve state of the art performances on outlier datasets. Outliers play a significant role, since they bear the potential to distort the predictions of a machine learning algorithm on a given dataset. Especially among PCA-based methods, outliers have an additional destructive potential regarding the result: they may not only distort the orientation and translation of the principal components, they also make it more complicated to detect outliers. To address this problem, we propose the robust outlier detection algorithm CoMadOut, which satisfies two required properties: (1) being robust towards outliers and (2) detecting them. Our CoMadOut outlier detection variants using comedian PCA define, dependent on its variant, an inlier region with a robust noise margin by measures of in-distribution (variant CMO) and optimized scores by measures of out-of-distribution (variants CMO*), e.g. kurtosis-weighting by CMO+k. These measures allow distribution based outlier scoring for each principal component, and thus, an appropriate alignment of the degree of outlierness between normal and abnormal instances. Experiments comparing CoMadOut with traditional, deep and other comparable robust outlier detection methods showed that the performance of the introduced CoMadOut approach is competitive to well established methods related to average precision (AP), area under the precision recall curve (AUPRC) and area under the receiver operating characteristic (AUROC) curve. In summary our approach can be seen as a robust alternative for outlier detection tasks.","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"1 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140884304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

SWoTTeD: an extension of tensor decomposition to temporal phenotyping SWoTTeD：将张量分解扩展到时间表型分析

IF 7.5 3区计算机科学

Machine Learning Pub Date : 2024-04-30 DOI: 10.1007/s10994-024-06545-8

Hana Sebia, Thomas Guyet, Etienne Audureau

引用次数: 0

Finite-time error bounds for Greedy-GQ Greedy-GQ 的有限时间误差边界

IF 7.5 3区计算机科学

Machine Learning Pub Date : 2024-04-30 DOI: 10.1007/s10994-024-06542-x

Yue Wang, Yi Zhou, Shaofeng Zou

{"title":"Finite-time error bounds for Greedy-GQ","authors":"Yue Wang, Yi Zhou, Shaofeng Zou","doi":"10.1007/s10994-024-06542-x","DOIUrl":"https://doi.org/10.1007/s10994-024-06542-x","url":null,"abstract":"Greedy-GQ with linear function approximation, originally proposed in Maei et al. (in: Proceedings of the international conference on machine learning (ICML), 2010), is a value-based off-policy algorithm for optimal control in reinforcement learning, and it has a non-linear two timescale structure with non-convex objective function. This paper develops its tightest finite-time error bounds. We show that the Greedy-GQ algorithm converges as fast as (mathcal {O}({1}/{sqrt{T}})) under the i.i.d. setting and (mathcal {O}({log T}/{sqrt{T}})) under the Markovian setting. We further design variant of the vanilla Greedy-GQ algorithm using the nested-loop approach, and show that its sample complexity is (mathcal {O}({log (1/epsilon )epsilon ^{-2}})), which matches with the one of the vanilla Greedy-GQ. Our finite-time error bounds match with the one of the stochastic gradient descent algorithm for general smooth non-convex optimization problems, despite of its additonal challenge in the two time-scale updates. Our finite-sample analysis provides theoretical guidance on choosing step-sizes for faster convergence in practice, and suggests the trade-off between the convergence rate and the quality of the obtained policy. Our techniques provide a general approach for finite-sample analysis of non-convex two timescale value-based reinforcement learning algorithms.","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"41 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140841521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Semantic-enhanced graph neural networks with global context representation 具有全局上下文表示的语义增强图神经网络

IF 7.5 3区计算机科学

Machine Learning Pub Date : 2024-04-29 DOI: 10.1007/s10994-024-06523-0

Youcheng Qian, Xueyan Yin

{"title":"Semantic-enhanced graph neural networks with global context representation","authors":"Youcheng Qian, Xueyan Yin","doi":"10.1007/s10994-024-06523-0","DOIUrl":"https://doi.org/10.1007/s10994-024-06523-0","url":null,"abstract":"Node classification is a crucial task for efficiently analyzing graph-structured data. Related semi-supervised methods have been extensively studied to address the scarcity of labeled data in emerging classes. However, two fundamental weaknesses hinder the performance: lacking the ability to mine latent semantic information between nodes, or ignoring to simultaneously capture local and global coupling dependencies between different nodes. To solve these limitations, we propose a novel semantic-enhanced graph neural networks with global context representation for semi-supervised node classification. Specifically, we first use graph convolution network to learn short-range local dependencies, which not only considers the spatial topological structure relationship between nodes, but also takes into account the semantic correlation between nodes to enhance the representation ability of nodes. Second, an improved Transformer model is introduced to reasoning the long-range global pairwise relationships, which has linear computational complexity and is particularly important for large datasets. Finally, the proposed model shows strong performance on various open datasets, demonstrating the superiority of our solutions.","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"53 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140841106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Explaining Siamese networks in few-shot learning 解释少儿学习中的连体网络

IF 7.5 3区计算机科学

Machine Learning Pub Date : 2024-04-29 DOI: 10.1007/s10994-024-06529-8

Andrea Fedele, Riccardo Guidotti, Dino Pedreschi

{"title":"Explaining Siamese networks in few-shot learning","authors":"Andrea Fedele, Riccardo Guidotti, Dino Pedreschi","doi":"10.1007/s10994-024-06529-8","DOIUrl":"https://doi.org/10.1007/s10994-024-06529-8","url":null,"abstract":"Machine learning models often struggle to generalize accurately when tested on new class distributions that were not present in their training data. This is a significant challenge for real-world applications that require quick adaptation without the need for retraining. To address this issue, few-shot learning frameworks, which includes models such as Siamese Networks, have been proposed. Siamese Networks learn similarity between pairs of records through a metric that can be easily extended to new, unseen classes. However, these systems lack interpretability, which can hinder their use in certain applications. To address this, we propose a data-agnostic method to explain the outcomes of Siamese Networks in the context of few-shot learning. Our explanation method is based on a post-hoc perturbation-based procedure that evaluates the contribution of individual input features to the final outcome. As such, it falls under the category of post-hoc explanation methods. We present two variants, one that considers each input feature independently, and another that evaluates the interplay between features. Additionally, we propose two perturbation procedures to evaluate feature contributions. Qualitative and quantitative results demonstrate that our method is able to identify highly discriminant intra-class and inter-class characteristics, as well as predictive behaviors that lead to misclassification by relying on incorrect features.","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"38 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140841001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0