Machine Learning. Pub Date: 2024-01-16. DOI: 10.1007/s10994-023-06414-w
Title: Understanding imbalanced data: XAI & interpretable ML framework
Abstract: There is a gap between current methods for explaining deep learning models trained on imbalanced image data and the needs of the imbalanced learning community. Existing methods for explaining imbalanced data are geared toward binary classification, single-layer machine learning models, and low-dimensional data. Current eXplainable Artificial Intelligence (XAI) techniques for vision data mainly focus on mapping the predictions of specific instances to inputs, rather than examining global data properties and the complexities of entire classes. There is therefore a need for a framework that is tailored to modern deep networks, accommodates large, high-dimensional, multi-class datasets, and uncovers the data complexities commonly found in imbalanced data. We propose a set of techniques that can be used both by deep learning model users, to identify, visualize, and understand class prototypes, sub-concepts, and outlier instances, and by imbalanced learning algorithm developers, to detect the features and class exemplars that are key to model performance. The components of our framework can be applied sequentially in their entirety or individually, making it fully flexible to the user's specific needs (https://github.com/dd1github/XAI_for_Imbalanced_Learning).
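As a deliberately minimal illustration of the prototype/outlier idea in the abstract above: per-class centroids in an embedding space can stand in for class prototypes, with instances far from their centroid flagged as outlier candidates. The function name, the centroid-as-prototype choice, and the quantile cutoff are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def prototypes_and_outliers(embeddings, labels, outlier_quantile=0.95):
    """For each class, return its centroid (a simple stand-in for a class
    prototype) and the indices of instances whose distance to the centroid
    exceeds the given quantile (candidate outliers)."""
    result = {}
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        pts = embeddings[idx]
        centroid = pts.mean(axis=0)
        dists = np.linalg.norm(pts - centroid, axis=1)
        cutoff = np.quantile(dists, outlier_quantile)
        result[c] = (centroid, idx[dists > cutoff])
    return result

# Toy data: two well-separated classes, plus one far-away point in class 0.
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0, 0.1, (20, 2)),
                 rng.normal(5, 0.1, (20, 2)),
                 [[3.0, 3.0]]])
lab = np.array([0] * 20 + [1] * 20 + [0])
out = prototypes_and_outliers(emb, lab)
print(len(out))            # 2 classes found
print(40 in out[0][1])     # True: the far-away point is flagged as an outlier
```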
Machine Learning. Pub Date: 2024-01-16. DOI: 10.1007/s10994-023-06490-y
Title: On the effects of biased quantum random numbers on the initialization of artificial neural networks
Authors: Raoul Heese, Moritz Wolter, Sascha Mücke, Lukas Franken, Nico Piatkowski
Abstract: Recent advances in practical quantum computing have led to a variety of cloud-based quantum computing platforms that allow researchers to evaluate their algorithms on noisy intermediate-scale quantum devices. A common property of quantum computers is that they can exhibit instances of true randomness, as opposed to the pseudo-randomness obtained from classical systems. Investigating the effects of such true quantum randomness in the context of machine learning is appealing, and recent results vaguely suggest that benefits can indeed be achieved from the use of quantum random numbers. To shed more light on this topic, we empirically study the effects of hardware-biased quantum random numbers on the initialization of artificial neural network weights in numerical experiments. We find no statistically significant difference compared with unbiased quantum random numbers, or with biased and unbiased random numbers from a classical pseudo-random number generator. The quantum random numbers for our experiments were obtained from real quantum hardware.
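A minimal sketch of the kind of effect being tested, under assumptions of my own (16-bit packing, a 55% bias rate, a symmetric uniform init range): a biased bit stream shifts the mean of the resulting weight initialization, which is the property whose downstream impact the study measures.

```python
import numpy as np

def bits_to_uniform(bits, n_bits=16):
    """Pack a 0/1 bit stream into floats in [0, 1): each group of n_bits
    bits becomes one number.  A biased stream (P(1) != 0.5) shifts the
    resulting distribution, which in turn shifts the weight init."""
    bits = bits[: (len(bits) // n_bits) * n_bits].reshape(-1, n_bits)
    weights = 2.0 ** -np.arange(1, n_bits + 1)
    return bits @ weights

rng = np.random.default_rng(42)
unbiased = bits_to_uniform(rng.integers(0, 2, 160000))
biased = bits_to_uniform((rng.random(160000) < 0.55).astype(int))

# Map the uniforms to a symmetric range U(-a, a), as uniform Glorot-style
# init schemes do; the bit-level bias shows up as a nonzero mean.
a = 0.1
w_unbiased = a * (2 * unbiased - 1)
w_biased = a * (2 * biased - 1)
print(round(w_unbiased.mean(), 3))           # close to 0.0
print(w_biased.mean() > w_unbiased.mean())   # True: biased bits shift the mean
```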
Machine Learning. Pub Date: 2024-01-16. DOI: 10.1007/s10994-023-06493-9
Title: OT-Net: a reusable neural optimal transport solver
Authors: Zezeng Li, Shenghao Li, Lianbao Jin, Na Lei, Zhongxuan Luo
Abstract: With the widespread application of optimal transport (OT), its computation has become essential, and various algorithms have emerged. However, existing methods either have low efficiency or cannot represent discontinuous maps. We therefore present OT-Net, a novel reusable neural OT solver that first learns Brenier's height representation via a neural network to obtain its potential, and then obtains the OT map as the gradient of the potential. The algorithm has two merits: (1) when new target samples are added, the OT map can be computed directly, which greatly improves the efficiency and reusability of the map; (2) it can easily represent discontinuous maps, which allows it to match any target distribution with discontinuous support, achieve sharp boundaries, and thus eliminate mode collapse. Moreover, we conduct error analyses of the proposed algorithm and demonstrate the empirical success of our approach in image generation, color transfer, and domain adaptation.
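The fact the solver builds on — by Brenier's theorem, the OT map is the gradient of a convex potential — can be checked numerically on a toy potential. The quadratic potential below is an illustrative stand-in for the learned neural potential, chosen because its gradient has the closed form \(T(x) = Ax\).

```python
import numpy as np

# For the convex quadratic potential u(x) = 0.5 * x @ A @ x with A symmetric
# positive definite, the Brenier map is T(x) = grad u(x) = A @ x.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])

def potential(x):
    return 0.5 * x @ A @ x

def ot_map(x, eps=1e-5):
    """OT map obtained as the numerical gradient of the potential, mirroring
    how a neural solver would differentiate a learned potential."""
    grad = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        grad[i] = (potential(x + e) - potential(x - e)) / (2 * eps)
    return grad

x = np.array([1.0, -2.0])
print(np.allclose(ot_map(x), A @ x, atol=1e-6))  # True
```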
Machine Learning. Pub Date: 2024-01-12. DOI: 10.1007/s10994-023-06470-2
Title: PANACEA: a neural model ensemble for cyber-threat detection
Authors: Malik AL-Essa, Giuseppina Andresini, Annalisa Appice, Donato Malerba
Abstract: Ensemble learning is a strategy commonly used to fuse different base models by creating a model ensemble that is expected to be more accurate on unseen data than the base models. This study describes a new cyber-threat detection method, called PANACEA, that couples ensemble learning with adversarial training in deep learning in order to improve the accuracy of neural models trained on cybersecurity problems. Selecting the base models is one of the main challenges in training accurate ensembles. This study describes a model-ensemble pruning approach based on eXplainable AI (XAI) that increases ensemble diversity and improves ensemble classification accuracy. We build on the idea that identifying base models that give relevance to different input feature sub-spaces may help improve the accuracy of an ensemble trained to recognize the signatures of different cyber-attack patterns. To this end, we use a global XAI technique to measure ensemble model diversity with respect to the effect of the input features on the accuracy of the base neural models combined in the ensemble. Experiments carried out on four benchmark cybersecurity datasets (three network intrusion detection datasets and one malware detection dataset) show the beneficial effects of the proposed combination of adversarial training, ensemble learning, and XAI on the accuracy of multi-class classification of cyber-data achieved by the neural model ensemble.
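One plausible reading of XAI-based ensemble pruning, sketched under assumptions of my own (each base model summarized by a global feature-importance vector, then a greedy max-dissimilarity selection); the paper's actual pruning criterion may differ.

```python
import numpy as np

def prune_by_importance_diversity(importances, k):
    """Greedily pick k base models whose global feature-importance vectors
    are mutually dissimilar (low cosine similarity), so the retained
    ensemble covers different input feature sub-spaces."""
    imp = importances / np.linalg.norm(importances, axis=1, keepdims=True)
    chosen = [0]                       # seed with the first model
    while len(chosen) < k:
        sims = imp @ imp[chosen].T     # cosine sim to already-chosen models
        worst = sims.max(axis=1)       # closest chosen model, per candidate
        worst[chosen] = np.inf         # never re-pick a chosen model
        chosen.append(int(worst.argmin()))
    return sorted(chosen)

# Four toy base models: 0 and 1 rely on the same features, 2 and 3 on others.
imp = np.array([[1.0, 1.0, 0.0, 0.0],
                [0.9, 1.1, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 1.0]])
print(prune_by_importance_diversity(imp, 3))  # [0, 2, 3]: the near-duplicate model 1 is pruned
```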
Machine Learning. Pub Date: 2024-01-12. DOI: 10.1007/s10994-023-06445-3
Title: Hierarchical U-Net with re-parameterization technique for spatio-temporal weather forecasting
Abstract: Due to the considerable computational demands of physics-based numerical weather prediction, especially when modeling fine-grained spatio-temporal atmospheric phenomena, deep learning methods offer an advantageous alternative: they leverage specialized computing devices to accelerate training and significantly reduce computational costs. The application of deep learning methods has thus presented a novel solution in the field of weather forecasting. In this context, we introduce a deep learning-based weather prediction architecture, the Hierarchical U-Net (HU-Net), with re-parameterization techniques. HU-Net comprises two essential components: a feature extraction module and a U-Net module with re-parameterization techniques. The feature extraction module consists of two branches. First, global pattern extraction employs adaptive Fourier neural operators and self-attention, well known for capturing long-term dependencies in the data. Second, local pattern extraction uses convolution operations as fundamental building blocks, which are highly proficient at modeling local correlations. A feature fusion block then dynamically combines the dual-scale information. The U-Net module adopts RepBlock, with re-parameterization techniques, as its fundamental building block, enabling efficient and rapid inference. In extensive experiments on the large-scale weather benchmark dataset WeatherBench at a resolution of 1.40625°, the results demonstrate that our proposed HU-Net outperforms other baseline models in both prediction accuracy and inference time.
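The structural re-parameterization idea behind blocks like RepBlock is, in its RepVGG-style form, that parallel convolution branches used during training can be folded into a single kernel for fast inference. A minimal sketch with one 3x3 + 1x1 branch pair (the exact RepBlock design is not reproduced here):

```python
import numpy as np

def conv2d(x, k):
    """Naive 'valid' 2D cross-correlation."""
    kh, kw = k.shape
    h, w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (x[i:i + kh, j:j + kw] * k).sum()
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 6))
k3 = rng.normal(size=(3, 3))   # 3x3 branch
k1 = rng.normal(size=(1, 1))   # parallel 1x1 branch

# Training-time: two parallel branches, outputs summed.  The 1x1 branch
# runs on the central crop so both branches see the same output grid.
multi_branch = conv2d(x, k3) + conv2d(x[1:-1, 1:-1], k1)

# Inference-time: fold the 1x1 kernel into the centre of the 3x3 kernel,
# leaving a single convolution with identical output.
k_rep = k3.copy()
k_rep[1, 1] += k1[0, 0]
single_branch = conv2d(x, k_rep)

print(np.allclose(multi_branch, single_branch))  # True
```

The merged kernel is exact, not approximate: convolution is linear, so the sum of branch outputs equals convolution with the sum of (aligned) kernels.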
Machine Learning. Pub Date: 2024-01-11. DOI: 10.1007/s10994-023-06419-5
Title: Compositional scene modeling with global object-centric representations
Abstract: The appearance of the same object may vary across scene images due to occlusions between objects. Humans can quickly identify the same object even when occlusions exist, by completing the occluded parts based on the complete canonical image of the object in memory. Achieving this ability is still challenging for existing models, especially in the unsupervised learning setting. Inspired by this human ability, we propose a novel object-centric representation learning method that identifies the same, possibly occluded, object in different scenes by learning global object-centric representations of complete canonical objects without supervision. The representation of each object is divided into an extrinsic part, which characterizes scene-dependent information (i.e., position and size), and an intrinsic part, which characterizes globally invariant information (i.e., appearance and shape). The former is inferred with an improved IC-SBP module. The latter is extracted by combining rectangular and arbitrary-shaped attention, and is used to infer the identity representation via a proposed patch-matching strategy against a set of learnable global object-centric representations of complete canonical objects. In the experiments, three 2D scene datasets are used to verify the proposed method's ability to recognize the identity of the same object in different scenes, and a complex 3D scene dataset and a real-world dataset are used to evaluate scene decomposition performance. Our experimental results demonstrate that the proposed method outperforms the comparison methods in both same-object recognition and scene decomposition.
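A toy sketch of the matching step: intrinsic (appearance/shape) embeddings are assigned to global canonical prototypes, here simply by cosine similarity. The function name and data are illustrative; the paper's patch-matching strategy is more elaborate than a single nearest-prototype lookup.

```python
import numpy as np

def match_identity(intrinsic, prototypes):
    """Assign each object's intrinsic embedding to the closest global
    canonical-object prototype by cosine similarity."""
    a = intrinsic / np.linalg.norm(intrinsic, axis=1, keepdims=True)
    b = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return (a @ b.T).argmax(axis=1)

# Two canonical objects; three observed instances, the last one "occluded"
# (its embedding is a noisy, attenuated copy of prototype 0).
prototypes = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
observed = np.array([[0.9, 0.1, 0.0],
                     [0.1, 0.8, 0.1],
                     [0.5, 0.1, 0.1]])
print(match_identity(observed, prototypes).tolist())  # [0, 1, 0]
```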
Machine Learning. Pub Date: 2024-01-11. DOI: 10.1007/s10994-023-06421-x
Title: Tracking treatment effect heterogeneity in evolving environments
Authors: Tian Qin, Long-Fei Li, Tian-Zuo Wang, Zhi-Hua Zhou
Abstract: Heterogeneous treatment effect (HTE) estimation plays a crucial role in developing personalized treatment plans across various applications. Conventional approaches assume that the observed data are independent and identically distributed (i.i.d.). In some real applications, however, this assumption does not hold: the environment may evolve, which leads to variations in HTE over time. To enable HTE estimation in evolving environments, we introduce and formulate the online HTE estimation problem. We propose an online ensemble-based HTE estimation method called ETHOS, which adapts to unknown evolving environments by ensembling the outputs of multiple base estimators that track environmental changes at different scales. Theoretical analysis reveals that ETHOS achieves an optimal expected dynamic regret of \(O(\sqrt{T(1+P_T)})\), where \(T\) denotes the number of observed examples and \(P_T\) characterizes the intensity of environment changes. This dynamic regret ensures that our method consistently approaches the optimal online estimators as long as the evolution of the environment is moderate. We conducted extensive experiments on three common benchmark datasets with various environment-evolution mechanisms. The results validate the theoretical analysis and the effectiveness of our proposed method.
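The "ensemble of estimators at different scales" idea can be sketched with moving-average base estimators combined by multiplicative weights. ETHOS's actual base estimators and regret-optimal weighting are more involved; treat this as a cartoon of tracking changes at different time scales, with all window sizes and the learning rate chosen for illustration.

```python
import numpy as np

def online_ensemble(stream, windows=(1, 8, 64), eta=2.0):
    """Ensemble of moving-average base estimators with different window
    sizes; weights are updated multiplicatively from each base estimator's
    squared error, so the ensemble follows whichever scale currently fits."""
    w = np.ones(len(windows)) / len(windows)
    history, preds = [], []
    for y in stream:
        base = np.array([np.mean(history[-k:]) if history else 0.0
                         for k in windows])
        preds.append(w @ base)
        losses = (base - y) ** 2
        w *= np.exp(-eta * losses)   # multiplicative (Hedge-style) update
        w /= w.sum()
        history.append(y)
    return np.array(preds)

# A stream whose mean jumps halfway through: the short-window estimator
# adapts fast, and the ensemble follows it after the change.
rng = np.random.default_rng(1)
stream = np.concatenate([rng.normal(0, 0.1, 100), rng.normal(5, 0.1, 100)])
preds = online_ensemble(stream)
print(abs(preds[190] - 5) < 0.5)  # True: adapted to the new regime
```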
Machine Learning. Pub Date: 2024-01-10. DOI: 10.1007/s10994-023-06437-3
Title: NRAT: towards adversarial training with inherent label noise
Abstract: Adversarial training (AT) is widely recognized as the most effective defense against adversarial attacks on deep neural networks, and it is formulated as a min-max optimization. Most AT algorithms are geared toward research-oriented datasets such as MNIST and CIFAR10, where the labels are generally correct. However, noisy labels, e.g., from mislabelling, are inevitable in real-world datasets. In this paper, we investigate AT with inherent label noise, where the training dataset itself contains mislabeled samples. We first show empirically that the performance of AT typically degrades as the label noise rate increases. We then propose a Noisy-Robust Adversarial Training (NRAT) algorithm, which leverages recent advances in learning with noisy labels to enhance the performance of AT in the presence of label noise. For experimental comparison, we consider two essential metrics in AT: (i) the trade-off between natural and robust accuracy, and (ii) robust overfitting. Our experiments show that NRAT's performance is on par with, or better than, state-of-the-art AT methods on both evaluation metrics. Our code is publicly available at https://github.com/TrustAI/NRAT.
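For background, the inner maximization of the AT min-max can be illustrated with one-step FGSM on logistic regression. NRAT's noisy-label-robust loss is not reproduced here; this only shows the adversarial-example step that AT builds on, with toy weights and inputs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, eps):
    """Inner maximisation of the AT min-max, one step (FGSM): perturb x in
    the direction that increases the logistic loss.  For logistic loss
    l = -y*log(p) - (1-y)*log(1-p) with p = sigmoid(w @ x), the gradient
    with respect to x is (p - y) * w."""
    p = sigmoid(x @ w)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

w = np.array([2.0, -1.0])
x = np.array([0.5, 0.2])     # logit = x @ w = 0.8 > 0, predicted class 1
y = 1.0
x_adv = fgsm(x, y, w, eps=0.3)
print(x_adv @ w < x @ w)     # True: the logit moved toward the wrong class
```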
Machine Learning. Pub Date: 2024-01-10. DOI: 10.1007/s10994-023-06428-4
Title: Explaining neural networks without access to training data
Authors: Sascha Marton, Stefan Lüdtke, Christian Bartelt, Andrej Tschalzev, Heiner Stuckenschmidt
Abstract: We consider generating explanations for neural networks in cases where the network's training data is not accessible, for instance due to privacy or safety issues. Recently, Interpretation Nets (\(\mathcal{I}\)-Nets) have been proposed as a sample-free approach to post-hoc, global model interpretability that does not require access to training data. They formulate interpretation as a machine learning task that maps network representations (parameters) to a representation of an interpretable function. In this paper, we extend the \(\mathcal{I}\)-Net framework to standard and soft decision trees as surrogate models. We propose a suitable decision tree representation and design the corresponding \(\mathcal{I}\)-Net output layers. Furthermore, we make \(\mathcal{I}\)-Nets applicable to real-world tasks by considering more realistic distributions when generating the \(\mathcal{I}\)-Net's training data. We empirically evaluate our approach against traditional global, post-hoc interpretability approaches and show that it achieves superior results when the training data is not accessible.
Machine Learning. Pub Date: 2024-01-10. DOI: 10.1007/s10994-023-06411-z
Title: Principled diverse counterfactuals in multilinear models
Abstract: Machine learning (ML) applications have automated numerous real-life tasks, improving both private and public life. However, the black-box nature of many state-of-the-art models poses the challenge of model verification: how can one be sure that the algorithm bases its decisions on the proper criteria, or that it does not discriminate against certain minority groups? In this paper we propose a way to generate diverse counterfactual explanations from multilinear models, a broad class that includes Random Forests as well as Bayesian Networks.
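A brute-force baseline conveys the objective of diverse counterfactuals (flip the prediction with sparse changes, keeping the changed feature sets disjoint as a simple diversity proxy); the paper's principled multilinear algorithm is, of course, not this enumeration, and the model and names below are illustrative.

```python
import itertools

def diverse_counterfactuals(x, predict, n_cf=2):
    """Enumerate feature-flip candidates of a binary input in order of
    sparsity (fewest changed features first) and keep those that change
    the prediction while touching disjoint feature sets."""
    target = not predict(x)
    found, used = [], set()
    for r in range(1, len(x) + 1):
        for flips in itertools.combinations(range(len(x)), r):
            if used & set(flips):
                continue               # enforce diversity: disjoint changes
            cand = list(x)
            for i in flips:
                cand[i] = 1 - cand[i]
            if predict(cand) == target:
                found.append(cand)
                used |= set(flips)
                if len(found) == n_cf:
                    return found
    return found

# Toy multilinear-style scorer: predicts 1 when x0 * (x1 + x2) >= 1.
predict = lambda x: x[0] * (x[1] + x[2]) >= 1
x = [1, 1, 0]                  # predicted 1
cfs = diverse_counterfactuals(x, predict)
print(cfs)  # [[0, 1, 0], [1, 0, 0]]: two disjoint single-feature flips
```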