Latest Publications in Machine Learning

Explainable dating of Greek papyri images
IF 7.5 · CAS Tier 3 · Computer Science
Machine Learning Pub Date : 2024-07-11 DOI: 10.1007/s10994-024-06589-w
John Pavlopoulos, Maria Konstantinidou, Elpida Perdiki, Isabelle Marthot-Santaniello, Holger Essler, Georgios Vardakas, Aristidis Likas
{"title":"Explainable dating of greek papyri images","authors":"John Pavlopoulos, Maria Konstantinidou, Elpida Perdiki, Isabelle Marthot-Santaniello, Holger Essler, Georgios Vardakas, Aristidis Likas","doi":"10.1007/s10994-024-06589-w","DOIUrl":"https://doi.org/10.1007/s10994-024-06589-w","url":null,"abstract":"<p>Greek literary papyri, which are unique witnesses of antique literature, do not usually bear a date. They are thus currently dated based on palaeographical methods, with broad approximations which often span more than a century. We created a dataset of 242 images of papyri written in “bookhand” scripts whose date can be securely assigned, and we used it to train algorithms for the task of dating, showing its challenging nature. To address data scarcity, we extended our dataset by segmenting each image into its respective text lines. By using the line-based version of our dataset, we trained a Convolutional Neural Network, equipped with a fragmentation-based augmentation strategy, and we achieved a mean absolute error of 54 years. The results improve further when the task is cast as a multi-class classification problem, predicting the century. Using our network, we computed precise date estimations for papyri whose date is disputed or vaguely defined, employing explainability to understand dating-driving features.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"29 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141612077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Moreau-Yoshida variational transport: a general framework for solving regularized distributional optimization problems
IF 7.5 · CAS Tier 3 · Computer Science
Machine Learning Pub Date : 2024-07-10 DOI: 10.1007/s10994-024-06586-z
Dai Hai Nguyen, Tetsuya Sakurai
{"title":"Moreau-Yoshida variational transport: a general framework for solving regularized distributional optimization problems","authors":"Dai Hai Nguyen, Tetsuya Sakurai","doi":"10.1007/s10994-024-06586-z","DOIUrl":"https://doi.org/10.1007/s10994-024-06586-z","url":null,"abstract":"<p>We address a general optimization problem involving the minimization of a composite objective functional defined over a class of probability distributions. The objective function consists of two components: one assumed to have a variational representation, and the other expressed in terms of the expectation operator of a possibly nonsmooth convex regularizer function. Such a regularized distributional optimization problem widely appears in machine learning and statistics, including proximal Monte-Carlo sampling, Bayesian inference, and generative modeling for regularized estimation and generation. Our proposed method, named Moreau-Yoshida Variational Transport (MYVT), introduces a novel approach to tackle this regularized distributional optimization problem. First, as the name suggests, our method utilizes the Moreau-Yoshida envelope to provide a smooth approximation of the nonsmooth function in the objective. Second, we reformulate the approximate problem as a concave-convex saddle point problem by leveraging the variational representation. Subsequently, we develop an efficient primal–dual algorithm to approximate the saddle point. Furthermore, we provide theoretical analyses and present experimental results to showcase the effectiveness of the proposed method.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"20 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141584981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Permutation-invariant linear classifiers
IF 7.5 · CAS Tier 3 · Computer Science
Machine Learning Pub Date : 2024-07-09 DOI: 10.1007/s10994-024-06561-8
Ludwig Lausser, Robin Szekely, Hans A. Kestler
{"title":"Permutation-invariant linear classifiers","authors":"Ludwig Lausser, Robin Szekely, Hans A. Kestler","doi":"10.1007/s10994-024-06561-8","DOIUrl":"https://doi.org/10.1007/s10994-024-06561-8","url":null,"abstract":"<p>Invariant concept classes form the backbone of classification algorithms immune to specific data transformations, ensuring consistent predictions regardless of these alterations. However, this robustness can come at the cost of limited access to the original sample information, potentially impacting generalization performance. This study introduces an addition to these classes—the permutation-invariant linear classifiers. Distinguished by their structural characteristics, permutation-invariant linear classifiers are unaffected by permutations on feature vectors, a property not guaranteed by other non-constant linear classifiers. The study characterizes this new concept class, highlighting its constant capacity, independent of input dimensionality. In practical assessments using linear support vector machines, the permutation-invariant classifiers exhibit superior performance in permutation experiments on artificial datasets and real mutation profiles. Interestingly, they outperform general linear classifiers not only in permutation experiments but also in permutation-free settings, surpassing unconstrained counterparts. Additionally, findings from real mutation profiles support the significance of tumor mutational burden as a biomarker.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"65 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141574756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Regional bias in monolingual English language models
IF 7.5 · CAS Tier 3 · Computer Science
Machine Learning Pub Date : 2024-07-09 DOI: 10.1007/s10994-024-06555-6
Jiachen Lyu, Katharina Dost, Yun Sing Koh, Jörg Wicker
{"title":"Regional bias in monolingual English language models","authors":"Jiachen Lyu, Katharina Dost, Yun Sing Koh, Jörg Wicker","doi":"10.1007/s10994-024-06555-6","DOIUrl":"https://doi.org/10.1007/s10994-024-06555-6","url":null,"abstract":"<p>In Natural Language Processing (NLP), pre-trained language models (LLMs) are widely employed and refined for various tasks. These models have shown considerable social and geographic biases creating skewed or even unfair representations of certain groups. Research focuses on biases toward L2 (English as a second language) regions but neglects bias within L1 (first language) regions. In this work, we ask if there is regional bias within L1 regions already inherent in pre-trained LLMs and, if so, what the consequences are in terms of downstream model performance. We contribute an investigation framework specifically tailored for low-resource regions, offering a method to identify bias without imposing strict requirements for labeled datasets. Our research reveals subtle geographic variations in the word embeddings of BERT, even in cultures traditionally perceived as similar. These nuanced features, once captured, have the potential to significantly impact downstream tasks. Generally, models exhibit comparable performance on datasets that share similarities, and conversely, performance may diverge when datasets differ in their nuanced features embedded within the language. It is crucial to note that estimating model performance solely based on standard benchmark datasets may not necessarily apply to the datasets with distinct features from the benchmark datasets. Our proposed framework plays a pivotal role in identifying and addressing biases detected in word embeddings, particularly evident in low-resource regions such as New Zealand.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"35 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141574755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Conformal predictions for probabilistically robust scalable machine learning classification
IF 7.5 · CAS Tier 3 · Computer Science
Machine Learning Pub Date : 2024-07-09 DOI: 10.1007/s10994-024-06571-6
Alberto Carlevaro, Teodoro Alamo, Fabrizio Dabbene, Maurizio Mongelli
{"title":"Conformal predictions for probabilistically robust scalable machine learning classification","authors":"Alberto Carlevaro, Teodoro Alamo, Fabrizio Dabbene, Maurizio Mongelli","doi":"10.1007/s10994-024-06571-6","DOIUrl":"https://doi.org/10.1007/s10994-024-06571-6","url":null,"abstract":"<p>Conformal predictions make it possible to define reliable and robust learning algorithms. But they are essentially a method for evaluating whether an algorithm is good enough to be used in practice. To define a reliable learning framework for classification from the very beginning of its design, the concept of scalable classifier was introduced to generalize the concept of classical classifier by linking it to statistical order theory and probabilistic learning theory. In this paper, we analyze the similarities between scalable classifiers and conformal predictions by introducing a new definition of a score function and defining a special set of input variables, the conformal safety set, which can identify patterns in the input space that satisfy the error coverage guarantee, i.e., that the probability of observing the wrong (possibly unsafe) label for points belonging to this set is bounded by a predefined <span>(varepsilon)</span> error level. We demonstrate the practical implications of this framework through an application in cybersecurity for identifying DNS tunneling attacks. Our work contributes to the development of probabilistically robust and reliable machine learning models.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"72 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141574797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Neural discovery of balance-aware polarized communities
IF 7.5 · CAS Tier 3 · Computer Science
Machine Learning Pub Date : 2024-07-09 DOI: 10.1007/s10994-024-06581-4
Francesco Gullo, Domenico Mandaglio, Andrea Tagarelli
{"title":"Neural discovery of balance-aware polarized communities","authors":"Francesco Gullo, Domenico Mandaglio, Andrea Tagarelli","doi":"10.1007/s10994-024-06581-4","DOIUrl":"https://doi.org/10.1007/s10994-024-06581-4","url":null,"abstract":"<p><i>Signed graphs</i> are a model to depict friendly (<i>positive</i>) or antagonistic (<i>negative</i>) interactions (edges) among users (nodes). <span>2-Polarized-Communities</span> (<span>2pc</span>) is a well-established combinatorial-optimization problem whose goal is to find two <i>polarized</i> communities from a signed graph, i.e., two subsets of nodes (disjoint, but not necessarily covering the entire node set) which exhibit a high number of both intra-community positive edges and negative inter-community edges. The state of the art in <span>2pc</span> suffers from the limitations that (<i>i</i>) existing methods rely on a single (optimal) solution to a continuous relaxation of the problem in order to produce the ultimate discrete solution via rounding, and (<i>ii</i>) <span>2pc</span> objective function comes with no control on size balance among communities. In this paper, we provide advances to the <span>2pc</span> problem by addressing both these limitations, with a twofold contribution. First, we devise a novel neural approach that allows for soundly and elegantly explore a variety of suboptimal solutions to the relaxed <span>2pc</span> problem, so as to pick the one that leads to the best discrete solution after rounding. Second, we introduce a generalization of <span>2pc</span> objective function – termed <span>(gamma )</span>-<i>polarity </i>– which fosters size balance among communities, and we incorporate it into the proposed machine-learning framework. Extensive experiments attest high accuracy of our approach, its superiority over the state of the art, and capability of function <span>(gamma )</span>-polarity to discover high-quality size-balanced communities.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"179 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141577849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FairMOE: counterfactually-fair mixture of experts with levels of interpretability
IF 7.5 · CAS Tier 3 · Computer Science
Machine Learning Pub Date : 2024-07-08 DOI: 10.1007/s10994-024-06583-2
Joe Germino, Nuno Moniz, Nitesh V. Chawla
{"title":"FairMOE: counterfactually-fair mixture of experts with levels of interpretability","authors":"Joe Germino, Nuno Moniz, Nitesh V. Chawla","doi":"10.1007/s10994-024-06583-2","DOIUrl":"https://doi.org/10.1007/s10994-024-06583-2","url":null,"abstract":"<p>With the rise of artificial intelligence in our everyday lives, the need for human interpretation of machine learning models’ predictions emerges as a critical issue. Generally, interpretability is viewed as a binary notion with a performance trade-off. Either a model is fully-interpretable but lacks the ability to capture more complex patterns in the data, or it is a black box. In this paper, we argue that this view is severely limiting and that instead interpretability should be viewed as a continuous domain-informed concept. We leverage the well-known Mixture of Experts architecture with user-defined limits on non-interpretability. We extend this idea with a counterfactual fairness module to ensure the selection of consistently <i>fair</i> experts: <b>FairMOE</b>. We perform an extensive experimental evaluation with fairness-related data sets and compare our proposal against state-of-the-art methods. Our results demonstrate that FairMOE is competitive with the leading fairness-aware algorithms in both fairness and predictive measures while providing more consistent performance, competitive scalability, and, most importantly, greater interpretability.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"29 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141574798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Fast linear model trees by PILOT
IF 7.5 · CAS Tier 3 · Computer Science
Machine Learning Pub Date : 2024-07-08 DOI: 10.1007/s10994-024-06590-3
Jakob Raymaekers, Peter J. Rousseeuw, Tim Verdonck, Ruicong Yao
{"title":"Fast linear model trees by PILOT","authors":"Jakob Raymaekers, Peter J. Rousseeuw, Tim Verdonck, Ruicong Yao","doi":"10.1007/s10994-024-06590-3","DOIUrl":"https://doi.org/10.1007/s10994-024-06590-3","url":null,"abstract":"<p>Linear model trees are regression trees that incorporate linear models in the leaf nodes. This preserves the intuitive interpretation of decision trees and at the same time enables them to better capture linear relationships, which is hard for standard decision trees. But most existing methods for fitting linear model trees are time consuming and therefore not scalable to large data sets. In addition, they are more prone to overfitting and extrapolation issues than standard regression trees. In this paper we introduce PILOT, a new algorithm for linear model trees that is fast, regularized, stable and interpretable. PILOT trains in a greedy fashion like classic regression trees, but incorporates an <i>L</i><sup>2</sup> boosting approach and a model selection rule for fitting linear models in the nodes. The abbreviation PILOT stands for PIecewise Linear Organic Tree, where ‘organic’ refers to the fact that no pruning is carried out. PILOT has the same low time and space complexity as CART without its pruning. An empirical study indicates that PILOT tends to outperform standard decision trees and other linear model trees on a variety of data sets. Moreover, we prove its consistency in an additive model setting under weak assumptions. When the data is generated by a linear model, the convergence rate is polynomial.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"10 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141574800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A systematic approach for learning imbalanced data: enhancing zero-inflated models through boosting
IF 7.5 · CAS Tier 3 · Computer Science
Machine Learning Pub Date : 2024-07-08 DOI: 10.1007/s10994-024-06558-3
Yeasung Jeong, Kangbok Lee, Young Woong Park, Sumin Han
{"title":"A systematic approach for learning imbalanced data: enhancing zero-inflated models through boosting","authors":"Yeasung Jeong, Kangbok Lee, Young Woong Park, Sumin Han","doi":"10.1007/s10994-024-06558-3","DOIUrl":"https://doi.org/10.1007/s10994-024-06558-3","url":null,"abstract":"<p>In this paper, we propose systematic approaches for learning imbalanced data based on a two-regime process: regime 0, which generates excess zeros (majority class), and regime 1, which contributes to generating an outcome of one (minority class). The proposed model contains two latent equations: a split probit (logit) equation in the first stage and an ordinary probit (logit) equation in the second stage. Because boosting improves the accuracy of prediction versus using a single classifier, we combined a boosting strategy with the two-regime process. Thus, we developed the zero-inflated probit boost (ZIPBoost) and zero-inflated logit boost (ZILBoost) methods. We show that the weight functions of ZIPBoost have the desired properties for good predictive performance. Like AdaBoost, the weight functions upweight misclassified examples and downweight correctly classified examples. We show that the weight functions of ZILBoost have similar properties to those of LogitBoost. The algorithm will focus more on examples that are hard to classify in the next iteration, resulting in improved prediction accuracy. We provide the relative performance of ZIPBoost and ZILBoost, which rely on the excess kurtosis of the data distribution. Furthermore, we show the convergence and time complexity of our proposed methods. We demonstrate the performance of our proposed methods using a Monte Carlo simulation, mergers and acquisitions (M&amp;A) data application, and imbalanced datasets from the Keel repository. The results of the experiments show that our proposed methods yield better prediction accuracy compared to other learning algorithms.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"40 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141574796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Rule learning by modularity
IF 7.5 · CAS Tier 3 · Computer Science
Machine Learning Pub Date : 2024-07-03 DOI: 10.1007/s10994-024-06556-5
Albert Nössig, Tobias Hell, Georg Moser
{"title":"Rule learning by modularity","authors":"Albert Nössig, Tobias Hell, Georg Moser","doi":"10.1007/s10994-024-06556-5","DOIUrl":"https://doi.org/10.1007/s10994-024-06556-5","url":null,"abstract":"<p>In this paper, we present a modular methodology that combines state-of-the-art methods in (stochastic) machine learning with well-established methods in inductive logic programming (ILP) and rule induction to provide efficient and scalable algorithms for the classification of vast data sets. By construction, these classifications are based on the synthesis of simple rules, thus providing direct explanations of the obtained classifications. Apart from evaluating our approach on the common large scale data sets <i>MNIST</i>, <i>Fashion-MNIST</i> and <i>IMDB</i>, we present novel results on explainable classifications of dental bills. The latter case study stems from an industrial collaboration with <i>Allianz Private Krankenversicherung</i> which is an insurance company offering diverse services in Germany.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"50 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141551766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0