Machine Learning: Latest Articles

XAI-TRIS: non-linear image benchmarks to quantify false positive post-hoc attribution of feature importance
IF 7.5 | CAS Tier 3 | Computer Science
Machine Learning Pub Date : 2024-07-16 DOI: 10.1007/s10994-024-06574-3
Benedict Clark, Rick Wilming, Stefan Haufe
Abstract: The field of 'explainable' artificial intelligence (XAI) has produced highly acclaimed methods that seek to make the decisions of complex machine learning (ML) methods 'understandable' to humans, for example by attributing 'importance' scores to input features. Yet, a lack of formal underpinning leaves it unclear what conclusions can safely be drawn from the results of a given XAI method, and has so far hindered the theoretical verification and empirical validation of XAI methods. This means that challenging non-linear problems, typically solved by deep neural networks, presently lack appropriate remedies. Here, we craft benchmark datasets for one linear and three different non-linear classification scenarios, in which the important class-conditional features are known by design, serving as ground-truth explanations. Using novel quantitative metrics, we benchmark the explanation performance of a wide set of XAI methods across three deep learning model architectures. We show that popular XAI methods are often unable to significantly outperform random performance baselines and edge-detection methods, attributing false-positive importance to features with no statistical relationship to the prediction target rather than to truly important features. Moreover, we demonstrate that explanations derived from different model architectures can be vastly different and thus prone to misinterpretation even under controlled conditions.
Citations: 0
Partitioned least squares
IF 7.5 | CAS Tier 3 | Computer Science
Machine Learning Pub Date : 2024-07-15 DOI: 10.1007/s10994-024-06582-3
Roberto Esposito, Mattia Cerrato, Marco Locatelli
Abstract: Linear least squares is one of the most widely used regression methods across many fields. The simplicity of the model allows it to be used when data is scarce, and lets practitioners gain insight into the problem by inspecting the values of the learnt parameters. In this paper we propose a variant of the linear least squares model that allows practitioners to partition the input features into groups of variables that they require to contribute similarly to the final result. We show that the new formulation is not convex and provide two alternative methods to deal with the problem: a non-exact method based on an alternating least squares approach, and an exact method based on a reformulation of the problem. We prove the correctness of the exact method and compare the two solutions, showing that the exact solution provides better results in a fraction of the time required by the alternating least squares solution (when the number of partitions is small). We also provide a branch-and-bound algorithm that can be used in place of the exact method when the number of partitions is too large, as well as a proof of NP-completeness of the optimization problem.
Citations: 0
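The alternating least squares idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' formulation: the bilinear weight model w_j = a[group(j)] * b_j, the variable names, and the two-block update order are assumptions made here purely for illustration of how each half-step reduces to an ordinary least-squares solve.

```python
import numpy as np

def alternating_partitioned_ls(X, y, groups, n_iter=50, rng=None):
    """Fit y ~ X @ w with w_j = a[groups[j]] * b_j by alternating
    least squares: solve for the group scales `a` with the per-feature
    directions `b` fixed, then for `b` with `a` fixed.
    `groups` maps each feature index to its partition index."""
    rng = np.random.default_rng(rng)
    K = int(groups.max()) + 1
    Z = np.eye(K)[groups]            # (d, K) one-hot group membership
    b = rng.standard_normal(X.shape[1])
    a = np.ones(K)
    for _ in range(n_iter):
        # a-step: model is ((X * b) @ Z) @ a, linear in a
        a, *_ = np.linalg.lstsq((X * b) @ Z, y, rcond=None)
        # b-step: model is (X * (Z @ a)) @ b, linear in b
        b, *_ = np.linalg.lstsq(X * (Z @ a), y, rcond=None)
    return a, b, b * (Z @ a)         # group scales, directions, full weights
```

Each half-step is a convex least-squares solve, so the training objective is non-increasing, but the joint problem remains non-convex, which is consistent with the paper's motivation for also developing an exact method.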
L2XGNN: learning to explain graph neural networks
IF 7.5 | CAS Tier 3 | Computer Science
Machine Learning Pub Date : 2024-07-12 DOI: 10.1007/s10994-024-06576-1
Giuseppe Serra, Mathias Niepert
Abstract: Graph Neural Networks (GNNs) are a popular class of machine learning models. Inspired by the learning-to-explain (L2X) paradigm, we propose L2xGnn, a framework for explainable GNNs which provides faithful explanations by design. L2xGnn learns a mechanism for selecting explanatory subgraphs (motifs) which are exclusively used in the GNNs' message-passing operations. L2xGnn is able to select, for each input graph, a subgraph with specific properties such as being sparse and connected. Imposing such constraints on the motifs often leads to more interpretable and effective explanations. Experiments on several datasets suggest that L2xGnn achieves the same classification accuracy as baseline methods using the entire input graph, while ensuring that only the provided explanations are used to make predictions. Moreover, we show that L2xGnn is able to identify motifs responsible for the graph properties it is intended to predict.
Citations: 0
Compressed sensing: a discrete optimization approach
IF 7.5 | CAS Tier 3 | Computer Science
Machine Learning Pub Date : 2024-07-11 DOI: 10.1007/s10994-024-06577-0
Dimitris Bertsimas, Nicholas A. G. Johnson
Abstract: We study the Compressed Sensing (CS) problem, which is the problem of finding the sparsest vector that satisfies a set of linear measurements up to some numerical tolerance. CS is a central problem in statistics, operations research and machine learning which arises in applications such as signal processing, data compression, image reconstruction, and multi-label learning. We introduce an ℓ2-regularized formulation of CS which we reformulate as a mixed-integer second-order cone program. We derive a second-order cone relaxation of this problem and show that, under mild conditions on the regularization parameter, the resulting relaxation is equivalent to the well-studied basis pursuit denoising problem. We present a semidefinite relaxation that strengthens the second-order cone relaxation, and develop a custom branch-and-bound algorithm that leverages our second-order cone relaxation to solve small-scale instances of CS to certifiable optimality. When compared against solutions produced by three state-of-the-art benchmark methods on synthetic data, our numerical results show that our approach produces solutions that are on average 6.22% more sparse. When compared only against the experiment-wise best-performing benchmark method on synthetic data, our approach produces solutions that are on average 3.10% more sparse. On real-world ECG data, for a given ℓ2 reconstruction error our approach produces solutions that are on average 9.95% more sparse than benchmark methods (3.88% more sparse if only compared against the best-performing benchmark), while for a given sparsity level our approach produces solutions that have on average 10.77% lower reconstruction error than benchmark methods (1.42% lower error if only compared against the best-performing benchmark). When used as a component of a multi-label classification algorithm, our approach achieves greater classification accuracy than benchmark compressed sensing methods, at the cost of an increase in computation time by several orders of magnitude. Thus, for applications where runtime is not of critical importance, leveraging integer optimization can yield sparser and lower-error solutions to CS than existing benchmarks.
Citations: 0
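The basis pursuit denoising problem that the authors' cone relaxation recovers has a well-known noiseless special case, min ||x||_1 subject to Ax = b, which can be solved as a linear program by splitting x into positive and negative parts. The sketch below shows that standard convex relaxation only; it is not the paper's mixed-integer or branch-and-bound approach.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, b):
    """Noiseless basis pursuit: min ||x||_1  s.t.  Ax = b,
    cast as an LP with x = xp - xm, xp >= 0, xm >= 0, so that
    ||x||_1 = sum(xp) + sum(xm) at the optimum."""
    m, n = A.shape
    c = np.ones(2 * n)                     # minimize sum(xp) + sum(xm)
    A_eq = np.hstack([A, -A])              # enforce A @ (xp - xm) = b
    res = linprog(c, A_eq=A_eq, b_eq=b, bounds=[(0, None)] * (2 * n))
    xp, xm = res.x[:n], res.x[n:]
    return xp - xm
```

With Gaussian measurements and a sufficiently sparse signal, this ℓ1 relaxation typically recovers the true vector exactly; the paper's discrete-optimization approach targets the regimes where the convex relaxation is not sparse enough.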
Explainable dating of Greek papyri images
IF 7.5 | CAS Tier 3 | Computer Science
Machine Learning Pub Date : 2024-07-11 DOI: 10.1007/s10994-024-06589-w
John Pavlopoulos, Maria Konstantinidou, Elpida Perdiki, Isabelle Marthot-Santaniello, Holger Essler, Georgios Vardakas, Aristidis Likas
Abstract: Greek literary papyri, which are unique witnesses of antique literature, do not usually bear a date. They are thus currently dated using palaeographical methods, with broad approximations which often span more than a century. We created a dataset of 242 images of papyri written in "bookhand" scripts whose date can be securely assigned, and used it to train algorithms for the task of dating, showing its challenging nature. To address data scarcity, we extended our dataset by segmenting each image into its respective text lines. Using the line-based version of our dataset, we trained a Convolutional Neural Network, equipped with a fragmentation-based augmentation strategy, and achieved a mean absolute error of 54 years. The results improve further when the task is cast as a multi-class classification problem, predicting the century. Using our network, we computed precise date estimations for papyri whose date is disputed or vaguely defined, employing explainability to understand the features that drive dating.
Citations: 0
Moreau-Yoshida variational transport: a general framework for solving regularized distributional optimization problems
IF 7.5 | CAS Tier 3 | Computer Science
Machine Learning Pub Date : 2024-07-10 DOI: 10.1007/s10994-024-06586-z
Dai Hai Nguyen, Tetsuya Sakurai
Abstract: We address a general optimization problem involving the minimization of a composite objective functional defined over a class of probability distributions. The objective function consists of two components: one assumed to have a variational representation, and the other expressed in terms of the expectation operator of a possibly nonsmooth convex regularizer function. Such regularized distributional optimization problems appear widely in machine learning and statistics, including proximal Monte-Carlo sampling, Bayesian inference, and generative modeling for regularized estimation and generation. Our proposed method, named Moreau-Yoshida Variational Transport (MYVT), introduces a novel approach to this regularized distributional optimization problem. First, as the name suggests, our method utilizes the Moreau-Yoshida envelope to provide a smooth approximation of the nonsmooth function in the objective. Second, we reformulate the approximate problem as a concave-convex saddle-point problem by leveraging the variational representation. Subsequently, we develop an efficient primal-dual algorithm to approximate the saddle point. Furthermore, we provide theoretical analyses and present experimental results to showcase the effectiveness of the proposed method.
Citations: 0
Permutation-invariant linear classifiers
IF 7.5 | CAS Tier 3 | Computer Science
Machine Learning Pub Date : 2024-07-09 DOI: 10.1007/s10994-024-06561-8
Ludwig Lausser, Robin Szekely, Hans A. Kestler
Abstract: Invariant concept classes form the backbone of classification algorithms immune to specific data transformations, ensuring consistent predictions regardless of these alterations. However, this robustness can come at the cost of limited access to the original sample information, potentially impacting generalization performance. This study introduces an addition to these classes: the permutation-invariant linear classifiers. Distinguished by their structural characteristics, permutation-invariant linear classifiers are unaffected by permutations of feature vectors, a property not guaranteed by other non-constant linear classifiers. The study characterizes this new concept class, highlighting its constant capacity, independent of input dimensionality. In practical assessments using linear support vector machines, the permutation-invariant classifiers exhibit superior performance in permutation experiments on artificial datasets and real mutation profiles. Interestingly, they outperform general linear classifiers not only in permutation experiments but also in permutation-free settings, surpassing unconstrained counterparts. Additionally, findings from real mutation profiles support the significance of tumor mutational burden as a biomarker.
Citations: 0
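For intuition: a linear classifier sign(w·x + c) is unaffected by every permutation of its input features exactly when all entries of w are equal, i.e., when it thresholds the feature sum. A minimal sketch of such a classifier (the scale a and offset c here are illustrative, not values from the paper):

```python
import numpy as np

def perm_invariant_linear(X, a=1.0, c=0.0):
    """Permutation-invariant linear classifier: every feature carries
    the same weight `a`, so the decision depends only on the feature
    sum and is unchanged by any reordering of the columns of X."""
    return np.sign(a * X.sum(axis=1) + c)
```

Because the decision function depends on x only through sum(x), its capacity does not grow with the input dimensionality, matching the constant-capacity property highlighted in the abstract.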
Regional bias in monolingual English language models
IF 7.5 | CAS Tier 3 | Computer Science
Machine Learning Pub Date : 2024-07-09 DOI: 10.1007/s10994-024-06555-6
Jiachen Lyu, Katharina Dost, Yun Sing Koh, Jörg Wicker
Abstract: In Natural Language Processing (NLP), pre-trained language models are widely employed and refined for various tasks. These models have shown considerable social and geographic biases, creating skewed or even unfair representations of certain groups. Research has focused on biases toward L2 (English as a second language) regions but neglects bias within L1 (first-language) regions. In this work, we ask whether regional bias within L1 regions is already inherent in pre-trained language models and, if so, what the consequences are in terms of downstream model performance. We contribute an investigation framework specifically tailored for low-resource regions, offering a method to identify bias without imposing strict requirements for labeled datasets. Our research reveals subtle geographic variations in the word embeddings of BERT, even between cultures traditionally perceived as similar. These nuanced features, once captured, have the potential to significantly impact downstream tasks. Generally, models exhibit comparable performance on datasets that share similarities, and conversely, performance may diverge when datasets differ in the nuanced features embedded within the language. It is crucial to note that model performance estimated solely on standard benchmark datasets may not transfer to datasets whose features differ from those benchmarks. Our proposed framework plays a pivotal role in identifying and addressing biases detected in word embeddings, particularly evident in low-resource regions such as New Zealand.
Citations: 0
Conformal predictions for probabilistically robust scalable machine learning classification
IF 7.5 | CAS Tier 3 | Computer Science
Machine Learning Pub Date : 2024-07-09 DOI: 10.1007/s10994-024-06571-6
Alberto Carlevaro, Teodoro Alamo, Fabrizio Dabbene, Maurizio Mongelli
Abstract: Conformal predictions make it possible to define reliable and robust learning algorithms, but they are essentially a method for evaluating whether an algorithm is good enough to be used in practice. To define a reliable learning framework for classification from the very beginning of its design, the concept of the scalable classifier was introduced, generalizing the classical classifier by linking it to statistical order theory and probabilistic learning theory. In this paper, we analyze the similarities between scalable classifiers and conformal predictions by introducing a new definition of a score function and defining a special set of input variables, the conformal safety set, which can identify patterns in the input space that satisfy the error-coverage guarantee, i.e., that the probability of observing the wrong (possibly unsafe) label for points belonging to this set is bounded by a predefined error level ε. We demonstrate the practical implications of this framework through an application in cybersecurity for identifying DNS tunneling attacks. Our work contributes to the development of probabilistically robust and reliable machine learning models.
Citations: 0
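The error-coverage guarantee discussed here is of the same form as the standard split-conformal one: calibrate a nonconformity score on held-out data, then keep every label whose score falls below the finite-sample (1 - ε) quantile. A generic sketch of that standard recipe (the score design and variable names are assumptions for illustration, not the authors' specific construction):

```python
import numpy as np

def conformal_threshold(cal_scores, eps):
    """Split-conformal threshold: the ceil((n+1)(1-eps))/n empirical
    quantile of the calibration nonconformity scores (method="higher"
    requires numpy >= 1.22)."""
    n = len(cal_scores)
    q = np.ceil((n + 1) * (1 - eps)) / n
    return np.quantile(cal_scores, min(q, 1.0), method="higher")

def prediction_sets(test_scores, tau):
    """For each test point, keep the labels whose nonconformity score
    does not exceed the calibrated threshold tau."""
    return [np.flatnonzero(s <= tau) for s in test_scores]
```

Under exchangeability of calibration and test scores, the probability that a test point's true-label score exceeds tau is at most ε, which is the coverage bound the abstract states for the conformal safety set.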
Neural discovery of balance-aware polarized communities
IF 7.5 | CAS Tier 3 | Computer Science
Machine Learning Pub Date : 2024-07-09 DOI: 10.1007/s10994-024-06581-4
Francesco Gullo, Domenico Mandaglio, Andrea Tagarelli
Abstract: Signed graphs are a model to depict friendly (positive) or antagonistic (negative) interactions (edges) among users (nodes). 2-Polarized-Communities (2PC) is a well-established combinatorial-optimization problem whose goal is to find two polarized communities in a signed graph, i.e., two subsets of nodes (disjoint, but not necessarily covering the entire node set) which exhibit a high number of both intra-community positive edges and inter-community negative edges. The state of the art in 2PC suffers from two limitations: (i) existing methods rely on a single (optimal) solution to a continuous relaxation of the problem in order to produce the ultimate discrete solution via rounding, and (ii) the 2PC objective function comes with no control over size balance among communities. In this paper, we advance the 2PC problem by addressing both limitations, with a twofold contribution. First, we devise a novel neural approach that soundly and elegantly explores a variety of suboptimal solutions to the relaxed 2PC problem, so as to pick the one that leads to the best discrete solution after rounding. Second, we introduce a generalization of the 2PC objective function, termed γ-polarity, which fosters size balance among communities, and we incorporate it into the proposed machine-learning framework. Extensive experiments attest to the high accuracy of our approach, its superiority over the state of the art, and the capability of γ-polarity to discover high-quality size-balanced communities.
Citations: 0