Machine Learning | Pub Date: 2025-01-01 | Epub Date: 2025-07-15 | DOI: 10.1007/s10994-025-06824-y
Thomas Baldwin-McDonald, Xinxing Shi, Mingxin Shen, Mauricio A Álvarez
Title: Deep latent force models: ODE-based process convolutions for Bayesian deep learning
Abstract: Modelling the behaviour of highly nonlinear dynamical systems with robust uncertainty quantification is a challenging task which typically requires approaches specifically designed to address the problem at hand. We introduce a domain-agnostic model to address this issue, termed the deep latent force model (DLFM): a deep Gaussian process with physics-informed kernels at each layer, derived from ordinary differential equations using the framework of process convolutions. Two distinct formulations of the DLFM are presented, which utilise weight-space and variational inducing points-based Gaussian process approximations, both amenable to doubly stochastic variational inference. We present empirical evidence of the capability of the DLFM to capture the dynamics present in highly nonlinear real-world multi-output time series data. Additionally, we find that the DLFM achieves performance comparable to a range of non-physics-informed probabilistic models on benchmark univariate regression tasks. We also empirically assess the negative impact of the inducing points framework on the extrapolation capabilities of LFM-based models.
Machine Learning 114(8): 192 (2025). Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12263784/pdf/
Machine Learning | Pub Date: 2025-01-01 | Epub Date: 2025-07-24 | DOI: 10.1007/s10994-025-06828-8
Henri Schmidt, Christian Düll
Title: Computing the distance between unbalanced distributions: the flat metric
Abstract: We provide an implementation to compute the flat metric in any dimension. The flat metric, also called the dual bounded Lipschitz distance, generalizes the well-known Wasserstein distance W_1 to the case where the distributions are of unequal total mass. Thus, our implementation adapts very well to mass differences and uses them to distinguish between different distributions. This is of particular interest for unbalanced optimal transport tasks and for the analysis of data distributions where the sample size is important or normalization is not possible. The core of the method is a neural network that determines an optimal test function realizing the distance between two given measures. Special focus was put on achieving comparability of pairwise computed distances from independently trained networks. We tested the quality of the output in several experiments where ground truth was available, as well as with simulated data.
Machine Learning 114(8): 195 (2025). Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12289810/pdf/
Machine Learning | Pub Date: 2025-01-01 | Epub Date: 2025-02-06 | DOI: 10.1007/s10994-024-06643-7
Georgios I Liapis, Sophia Tsoka, Lazaros G Papageorgiou
Title: Interpretable optimisation-based approach for hyper-box classification
Abstract: Data classification is a fundamental research subject within the machine learning community. Researchers seek to improve machine learning algorithms not only in accuracy but also in interpretability. Interpretable algorithms allow humans to easily understand the decisions that a machine learning model makes, which is challenging for black-box models. Mathematical programming-based classification algorithms have attracted considerable attention due to their ability to compete effectively with leading-edge algorithms in terms of both accuracy and interpretability. In particular, the training of a hyper-box classifier can be mathematically formulated as a Mixed Integer Linear Programming (MILP) model, and its predictions combine accuracy and interpretability. In this work, an optimisation-based approach is proposed for multi-class data classification using a hyper-box representation, thus facilitating the extraction of compact IF-THEN rules. The key novelty of our approach lies in the minimisation of the number and length of the generated rules for enhanced interpretability. On a number of real-world datasets, the algorithm is shown to perform favourably against well-known alternatives in terms of prediction accuracy and rule-set simplicity.
Machine Learning 114(3): 51 (2025). Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11861270/pdf/
Machine Learning | Pub Date: 2025-01-01 | Epub Date: 2025-07-15 | DOI: 10.1007/s10994-025-06826-w
Jesse van Remmerden, Zaharah Bukhsh, Yingqian Zhang
Title: Offline reinforcement learning for learning to dispatch for job shop scheduling
Abstract: The Job Shop Scheduling Problem (JSSP) is a complex combinatorial optimization problem. While online Reinforcement Learning (RL) has shown promise by quickly finding acceptable solutions for JSSP, it faces key limitations: it requires extensive training interactions from scratch, leading to sample inefficiency; it cannot leverage existing high-quality solutions from traditional methods such as Constraint Programming (CP); and it requires simulated environments to train in, which are impractical to build for complex scheduling settings. We introduce Offline Learned Dispatching (Offline-LD), an offline reinforcement learning approach for JSSP that addresses these limitations by learning from historical scheduling data. Our approach is motivated by scenarios where historical scheduling data and expert solutions are available, or where online training of RL approaches in simulated environments is impracticable. Offline-LD introduces maskable variants of two Q-learning methods, Maskable Quantile Regression DQN (mQRDQN) and discrete maskable Soft Actor-Critic (d-mSAC), which learn from historical data through Conservative Q-Learning (CQL); we also present a novel entropy-bonus modification for d-mSAC to handle maskable action spaces. Moreover, we introduce a novel reward normalization method for JSSP in an offline RL setting. Our experiments demonstrate that Offline-LD outperforms online RL on both generated and benchmark instances when trained on only 100 solutions generated by CP. Notably, introducing noise to the expert dataset yields comparable or superior results to using the expert dataset with the same number of instances, a promising finding for real-world applications, where data is inherently noisy and imperfect.
Machine Learning 114(8): 191 (2025). Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12263752/pdf/
Machine Learning | Pub Date: 2024-09-18 | DOI: 10.1007/s10994-024-06612-0
Joanna Komorniczak, Paweł Ksieniewicz
Title: On metafeatures’ ability of implicit concept identification
Abstract: Concept drift in data stream processing remains an intriguing challenge and a popular research topic. Methods that actively process data streams usually employ drift detectors, whose performance is often based on monitoring the variability of different stream properties. This publication provides an overview and analysis of the variability of metafeatures describing data streams with concept drifts. Five experiments conducted on synthetic, semi-synthetic, and real-world data streams examine the ability of over 160 metafeatures from 9 categories to recognize concepts in non-stationary data streams. The work reveals distinctions between the considered sources of streams and identifies 17 metafeatures with a high ability for concept identification.
Machine Learning | Pub Date: 2024-09-13 | DOI: 10.1007/s10994-024-06606-y
Tiago Mendes-Neves, Luís Meireles, João Mendes-Moreira
Title: Towards a foundation large events model for soccer
Abstract: This paper introduces the Large Events Model (LEM) for soccer, a novel deep learning framework for generating and analyzing soccer matches. The framework can simulate games from a given game state, with its primary output being the ensuing probabilities and events from multiple simulations, which can provide insights into match dynamics and underlying mechanisms. We discuss the framework’s design, features, and methodologies, including model optimization, data processing, and evaluation techniques. The models within this framework are developed to predict specific aspects of soccer events, such as event type, success likelihood, and further details. In an applied context, we showcase the estimation of xP+, a metric estimating a player’s contribution to the points earned by their team. This work ultimately advances the field of sports event prediction and its practical applications, and emphasizes the potential of this kind of method.
Machine Learning | Pub Date: 2024-09-13 | DOI: 10.1007/s10994-024-06616-w
Gokul Bhusal, Ekaterina Merkurjev, Guo-Wei Wei
Title: Persistent Laplacian-enhanced algorithm for scarcely labeled data classification
Abstract: The success of many machine learning (ML) methods depends crucially on having large amounts of labeled data. However, obtaining enough labeled data can be expensive, time-consuming, and subject to ethical constraints for many applications. One approach that has shown tremendous value in addressing this challenge is semi-supervised learning (SSL); this technique utilizes both labeled and unlabeled data during training, often with much less labeled data than unlabeled data, which is relatively easy and inexpensive to obtain. SSL methods are particularly useful in applications where labeling data is especially expensive, such as medical analysis, natural language processing, or speech recognition. A subset of SSL methods that has achieved great success in various domains integrates graph-based techniques, which are popular due to the vast amount of information provided by the graphical framework. In this work, we propose an algebraic topology-based semi-supervised method called persistent Laplacian-enhanced graph MBO, which integrates persistent spectral graph theory with the classical Merriman–Bence–Osher (MBO) scheme. Specifically, we use a filtration procedure to generate a sequence of chain complexes and associated families of simplicial complexes, from which we construct a family of persistent Laplacians. Overall, it is a very efficient procedure that requires far less labeled data than many ML techniques to perform well, and it can be adapted for both small and large datasets. We evaluate the performance of our method on classification tasks, and the results indicate that the technique outperforms other existing semi-supervised algorithms.
Machine Learning | Pub Date: 2024-09-09 | DOI: 10.1007/s10994-024-06615-x
Solène Vilfroy, Lionel Bombrun, Thierry Urruty, Florence De Grancey, Jean-Philippe Lebrat, Philippe Carré
Title: Conformal prediction for regression models with asymmetrically distributed errors: application to aircraft navigation during landing maneuver
Abstract: Semi-autonomous aircraft navigation is a high-risk domain where confidence in the predictions is required. To this end, this paper introduces conformal prediction strategies for regression problems. While standard approaches use an absolute nonconformity score, we introduce a signed version of the nonconformity score. Experimental results on synthetic data show its benefit for non-centered errors. Moreover, in order to reduce the width of the prediction interval, we introduce an optimization procedure which learns the optimal alpha risks for the lower and upper bounds of the interval; in practice, we show that a line search algorithm can be employed to solve it. This novel adaptive conformal prediction strategy proves well suited to skewed error distributions. In addition, an extension of these conformal prediction strategies is introduced to incorporate numeric and categorical auxiliary variables describing the acquisition context; based on a quantile regression model, it maintains coverage for each metadata value. All these strategies are then applied to a real use case of runway localization from data acquired by an aircraft during the landing maneuver. Extensive experiments on multiple airports show the interest of the proposed conformal prediction strategies, in particular for runways equipped with a very long ramp approach, where asymmetric angular deviation errors are observed.
Machine Learning | Pub Date: 2024-09-06 | DOI: 10.1007/s10994-024-06611-1
Pegah Rahimian, Balazs Mark Mihalyi, Laszlo Toka
Title: In-game soccer outcome prediction with offline reinforcement learning
Abstract: Predicting outcomes in soccer is crucial for various stakeholders, including teams, leagues, bettors, the betting industry, media, and fans. With advancements in computer vision, player tracking data has become abundant, leading to the development of sophisticated soccer analytics models. However, existing models often rely solely on spatiotemporal features derived from player tracking data, which may not fully capture the complexities of in-game dynamics. In this paper, we present an end-to-end system that leverages raw event and tracking data to predict both offensive and defensive actions, along with the optimal decision for each game scenario, based solely on historical game data. Our model incorporates the effectiveness of these actions to accurately predict win probabilities at every minute of the game. Experimental results demonstrate the effectiveness of our approach, achieving an accuracy of 87% in predicting offensive and defensive actions. Furthermore, our in-game outcome prediction model exhibits an error rate of 0.1, outperforming counterpart models and bookmakers’ odds.
{"title":"Evaluating large language models for user stance detection on X (Twitter)","authors":"Margherita Gambini, Caterina Senette, Tiziano Fagni, Maurizio Tesconi","doi":"10.1007/s10994-024-06587-y","DOIUrl":"https://doi.org/10.1007/s10994-024-06587-y","url":null,"abstract":"<p>Current stance detection methods employ topic-aligned data, resulting in many unexplored topics due to insufficient training samples. Large Language Models (LLMs) pre-trained on a vast amount of web data offer a viable solution when training data is unavailable. This work introduces <i>Tweets2Stance - T2S</i>, an unsupervised stance detection framework based on zero-shot classification, i.e. leveraging an LLM pre-trained on Natural Language Inference tasks. T2S detects a five-valued user’s stance on social-political statements by analyzing their X (Twitter) timeline. The Ground Truth of a user’s stance is obtained from Voting Advice Applications (VAAs). Through comprehensive experiments, a T2S’s optimal setting was identified for each election. Linguistic limitations related to the language model are further addressed by integrating state-of-the-art LLMs like GPT-4 and Mixtral into the <i>T2S</i> framework. The <i>T2S</i> framework’s generalization potential is demonstrated by measuring its performance (F1 and MAE scores) across nine datasets. These datasets were built by collecting tweets from competing parties’ Twitter accounts in nine political elections held in different countries from 2019 to 2021. The results, in terms of F1 and MAE scores, outperformed all baselines and approached the best scores for each election. This showcases the ability of T2S, particularly when combined with state-of-the-art LLMs, to generalize across different cultural-political contexts.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"26 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142226584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}