Transactions on Machine Learning Research: Latest Articles

Beyond Distribution Shift: Spurious Features Through the Lens of Training Dynamics.
Nihal Murali, Aahlad Puli, Ke Yu, Rajesh Ranganath, Kayhan Batmanghelich
Transactions on Machine Learning Research, 2023. Published 2023-10-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11029547/pdf/
Abstract: Deep Neural Networks (DNNs) are prone to learning spurious features that correlate with the label during training but are irrelevant to the learning problem. This hurts model generalization and poses problems when deploying them in safety-critical applications. This paper aims to better understand the effects of spurious features through the lens of the learning dynamics of the internal neurons during the training process. We make the following observations: (1) While previous works highlight the harmful effects of spurious features on the generalization ability of DNNs, we emphasize that not all spurious features are harmful. Spurious features can be "benign" or "harmful" depending on whether they are "harder" or "easier" to learn than the core features for a given model. This definition is model and dataset dependent. (2) We build upon this premise and use instance difficulty methods (like Prediction Depth (Baldock et al., 2021)) to quantify "easiness" for a given model and to identify this behavior during the training phase. (3) We empirically show that the harmful spurious features can be detected by observing the learning dynamics of the DNN's early layers. In other words, easy features learned by the initial layers of a DNN early during training can (potentially) hurt model generalization. We verify our claims on medical and vision datasets, both simulated and real, and justify the empirical success of our hypothesis by showing the theoretical connections between Prediction Depth and information-theoretic concepts like 𝒱-usable information (Ethayarajh et al., 2021). Lastly, our experiments show that monitoring only accuracy during training (as is common in machine learning pipelines) is insufficient to detect spurious features. We therefore highlight the need for monitoring early training dynamics using suitable instance difficulty metrics.
Citations: 0
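The instance-difficulty signal this abstract leans on can be made concrete. Below is a minimal sketch of a Prediction-Depth-style probe in the spirit of Baldock et al. (2021), using k-NN probes on per-layer features; the feature arrays, the probe value of k, and the "first agreeing layer" simplification are illustrative assumptions, not the paper's exact protocol.

```python
# Sketch: an example's depth is (roughly) the first layer whose k-NN probe
# on that layer's features already agrees with the network's final
# prediction. Lower depth = "easier" example for this model.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def prediction_depth(train_feats_per_layer, train_labels,
                     example_feats_per_layer, final_prediction, k=30):
    """train_feats_per_layer: list of (n_train, d_l) arrays, one per layer.
    example_feats_per_layer: list of (d_l,) arrays for a single example.
    final_prediction: the network's output label for that example."""
    for depth, (train_feats, ex_feat) in enumerate(
            zip(train_feats_per_layer, example_feats_per_layer)):
        probe = KNeighborsClassifier(n_neighbors=k).fit(train_feats, train_labels)
        if probe.predict(ex_feat.reshape(1, -1))[0] == final_prediction:
            return depth  # earliest layer that already "knows" the answer
    return len(train_feats_per_layer)  # not resolved before the output layer
```

Tracking this depth over training epochs is the kind of early-layer monitoring the authors argue accuracy curves alone cannot provide: examples resolved by the earliest layers early in training are candidates for harmful spurious features.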
RIFLE: Imputation and Robust Inference from Low Order Marginals.
Sina Baharlouei, Kelechi Ogudu, Sze-Chuan Suen, Meisam Razaviyayn
Transactions on Machine Learning Research, 2023. Published 2023-09-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10977932/pdf/
Abstract: The ubiquity of missing values in real-world datasets poses a challenge for statistical inference and can prevent similar datasets from being analyzed in the same study, precluding many existing datasets from being used for new analyses. While an extensive collection of packages and algorithms have been developed for data imputation, the overwhelming majority perform poorly if there are many missing values and low sample sizes, which are unfortunately common characteristics in empirical data. Such low-accuracy estimations adversely affect the performance of downstream statistical models. We develop a statistical inference framework for regression and classification in the presence of missing data without imputation. Our framework, RIFLE (Robust InFerence via Low-order moment Estimations), estimates low-order moments of the underlying data distribution with corresponding confidence intervals to learn a distributionally robust model. We specialize our framework to linear regression and normal discriminant analysis, and we provide convergence and performance guarantees. This framework can also be adapted to impute missing data. In numerical experiments, we compare RIFLE to several state-of-the-art approaches (including MICE, Amelia, MissForest, KNN-imputer, MIDA, and Mean Imputer) for imputation and inference in the presence of missing values. Our experiments demonstrate that RIFLE outperforms other benchmark algorithms when the percentage of missing values is high and/or when the number of data points is relatively small. RIFLE is publicly available at https://github.com/optimization-for-data-driven-science/RIFLE.
Citations: 0
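RIFLE's starting point, estimating low-order moments directly from incomplete data, can be sketched in a few lines. The snippet below computes means and a pairwise-available covariance matrix from a NaN-coded matrix; it covers only this moment-estimation step, not RIFLE's distributionally robust optimization or its confidence intervals.

```python
import numpy as np

def pairwise_moments(X):
    """Estimate the mean vector and covariance matrix of NaN-coded data,
    using, for each pair of features, all rows where both are observed.
    Note: the resulting matrix is not guaranteed to be positive
    semidefinite, one reason a robust layer on top (as in RIFLE) helps."""
    n, d = X.shape
    observed = ~np.isnan(X)
    mean = np.nanmean(X, axis=0)
    cov = np.zeros((d, d))
    for i in range(d):
        for j in range(i, d):
            both = observed[:, i] & observed[:, j]
            if both.sum() > 1:
                cov[i, j] = np.mean((X[both, i] - mean[i]) * (X[both, j] - mean[j]))
                cov[j, i] = cov[i, j]
    return mean, cov
```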
mL-BFGS: A Momentum-based L-BFGS for Distributed Large-Scale Neural Network Optimization.
Yue Niu, Zalan Fabian, Sunwoo Lee, Mahdi Soltanolkotabi, Salman Avestimehr
Transactions on Machine Learning Research, 2023. Published 2023-08-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12393816/pdf/
Abstract: Quasi-Newton methods still face significant challenges in training large-scale neural networks due to the additional compute cost of Hessian-related computations and instability issues in stochastic training. L-BFGS, a well-known method that efficiently approximates the Hessian using the history of parameter and gradient changes, suffers from convergence instability in stochastic training. So far, attempts to adapt L-BFGS to large-scale stochastic training incur considerable extra overhead, which offsets its convergence benefits in wall-clock time. In this paper, we propose mL-BFGS, a lightweight momentum-based L-BFGS algorithm that paves the way for quasi-Newton (QN) methods in large-scale distributed deep neural network (DNN) optimization. mL-BFGS introduces a nearly cost-free momentum scheme into the L-BFGS update, greatly reducing stochastic noise in the Hessian approximation and thereby stabilizing convergence during stochastic optimization. For model training at a large scale, mL-BFGS approximates a block-wise Hessian, thus distributing compute and memory costs across all computing nodes. We provide a supporting convergence analysis for mL-BFGS in stochastic settings. To investigate mL-BFGS's potential in large-scale DNN training, we train benchmark neural models using mL-BFGS and compare performance with baselines (SGD, Adam, and other quasi-Newton methods). Results show that mL-BFGS achieves both noticeable iteration-wise and wall-clock speedup.
Citations: 0
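The core mechanism, damping stochastic noise in the curvature estimates with momentum before they enter the L-BFGS update, can be sketched as follows. This is a schematic single-block illustration with made-up hyperparameters and my own guess at a simple smoothing scheme, not the paper's exact block-wise distributed algorithm.

```python
import numpy as np

class MomentumLBFGS:
    """Schematic mL-BFGS-style update: smooth iterates and gradients with
    momentum, then build the L-BFGS (s, y) curvature pairs from the
    smoothed sequences, damping stochastic noise in the implicit Hessian
    approximation. Single-block sketch, illustrative only."""

    def __init__(self, history=10, beta=0.9, lr=0.1):
        self.s_list, self.y_list = [], []
        self.history, self.beta, self.lr = history, beta, lr
        self.w_avg = self.g_avg = None

    def step(self, w, grad):
        if self.w_avg is None:                 # first call: initialize averages
            self.w_avg, self.g_avg = w.copy(), grad.copy()
            return w - self.lr * grad
        w_prev, g_prev = self.w_avg, self.g_avg
        self.w_avg = self.beta * w_prev + (1 - self.beta) * w
        self.g_avg = self.beta * g_prev + (1 - self.beta) * grad
        s, y = self.w_avg - w_prev, self.g_avg - g_prev
        if s @ y > 1e-10:                      # keep only pairs with s'y > 0
            self.s_list = (self.s_list + [s])[-self.history:]
            self.y_list = (self.y_list + [y])[-self.history:]
        if not self.s_list:
            return w - self.lr * grad          # SGD fallback until history exists
        return w + self.lr * self._two_loop_direction(grad)

    def _two_loop_direction(self, grad):
        """Standard L-BFGS two-loop recursion over the stored (s, y) pairs."""
        q, stack = grad.copy(), []
        for s, y in zip(reversed(self.s_list), reversed(self.y_list)):
            rho = 1.0 / (y @ s)
            alpha = rho * (s @ q)
            q = q - alpha * y
            stack.append((alpha, rho, s, y))
        s_last, y_last = self.s_list[-1], self.y_list[-1]
        r = (s_last @ y_last) / (y_last @ y_last) * q
        for alpha, rho, s, y in reversed(stack):
            r = r + (alpha - rho * (y @ r)) * s
        return -r
```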
On the Convergence and Calibration of Deep Learning with Differential Privacy.
Zhiqi Bu, Hua Wang, Zongyu Dai, Qi Long
Transactions on Machine Learning Research, 2023. Published 2023-06-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10982613/pdf/
Abstract: Differentially private (DP) training preserves data privacy, usually at the cost of slower convergence (and thus lower accuracy) and more severe miscalibration than its non-private counterpart. To analyze the convergence of DP training, we formulate a continuous-time analysis through the lens of the neural tangent kernel (NTK), which characterizes the per-sample gradient clipping and the noise addition in DP training, for arbitrary network architectures and loss functions. Interestingly, we show that the noise addition only affects the privacy risk but not the convergence or calibration, whereas the per-sample gradient clipping (under both flat and layerwise clipping styles) only affects the convergence and calibration. Furthermore, we observe that DP models trained with a small clipping norm usually achieve the best accuracy but are poorly calibrated and thus unreliable. In sharp contrast, DP models trained with a large clipping norm enjoy the same privacy guarantee and similar accuracy but are significantly more calibrated. Our code can be found at https://github.com/woodyx218/opacus_global_clipping.
Citations: 0
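The mechanism under study, per-sample clipping plus Gaussian noise, looks roughly like the following flat-clipping sketch. The naive per-sample loop is for clarity only (production code such as Opacus vectorizes it), and the function and parameter names are illustrative.

```python
import torch

def dp_sgd_step(model, loss_fn, xb, yb, clip_norm=1.0, noise_mult=1.0, lr=0.1):
    """One DP-SGD step with flat per-sample clipping: clip each per-sample
    gradient to clip_norm, sum, add Gaussian noise scaled by
    noise_mult * clip_norm, then take an SGD step. Per this paper's
    analysis, the clipping drives convergence and calibration, while the
    noise affects only the privacy accounting."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(xb, yb):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (total_norm + 1e-12), max=1.0)  # flat clipping
        for acc, g in zip(summed, grads):
            acc.add_(scale * g)
    with torch.no_grad():
        for p, acc in zip(params, summed):
            noise = noise_mult * clip_norm * torch.randn_like(p)
            p.add_(-(lr / len(xb)) * (acc + noise))
```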
Traditional Machine Learning Models for Building Energy Performance Prediction: A Comparative Research
Zeyu Wu, Hongyang He
Transactions on Machine Learning Research. Published 2023-05-29. DOI: 10.11648/j.mlr.20230801.11
Abstract: A large proportion of total energy consumption is caused by buildings. Accurately predicting the heating and cooling demand of a building is crucial in the initial design phase in order to determine the most efficient solution from various designs. In this paper, in order to explore the effectiveness of basic machine learning algorithms on this problem, different machine learning models were used to estimate the heating and cooling loads of buildings, utilizing data on the energy efficiency of buildings. Notably, this paper also discusses the performance of deep neural network prediction models and concludes that, among traditional machine learning algorithms, GradientBoostingRegressor achieves the best predictions, with its heating prediction score reaching 0.998553. Compared with it, our machine learning algorithm HB-Regressor reaches higher prediction accuracy, 0.998672 and 0.995153 respectively, but its fitting speed is not as fast as the GradientBoostingRegressor algorithm's.
Citations: 0
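For context, the traditional baseline the paper singles out takes only a few lines with scikit-learn. The data below is a synthetic stand-in shaped like the UCI Energy Efficiency dataset (768 samples, 8 features, heating and cooling targets); the hyperparameters are illustrative, not the paper's settings, and the abstract's figures are read here as R² scores.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the UCI Energy Efficiency data (768 x 8).
rng = np.random.default_rng(0)
X = rng.uniform(size=(768, 8))
y_heat = X @ rng.uniform(1, 3, size=8) + rng.normal(scale=0.1, size=768)
y_cool = X @ rng.uniform(1, 3, size=8) + rng.normal(scale=0.1, size=768)

X_tr, X_te, yh_tr, yh_te, yc_tr, yc_te = train_test_split(
    X, y_heat, y_cool, test_size=0.2, random_state=0)

# One regressor per target, as in the heating/cooling load setup.
for name, y_tr, y_te in [("heating", yh_tr, yh_te), ("cooling", yc_tr, yc_te)]:
    gbr = GradientBoostingRegressor(n_estimators=500, learning_rate=0.05,
                                    max_depth=3, random_state=0)
    gbr.fit(X_tr, y_tr)
    print(f"{name} load R^2: {gbr.score(X_te, y_te):.6f}")
```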
Automatic Indexing of Digital Objects Through Learning from User Data
C. Leung, Yuanxi Li
Transactions on Machine Learning Research. Published 2023-01-31. DOI: 10.11648/j.mlr.20220702.12
(No abstract available.)
Citations: 0
How Robust is Your Fairness? Evaluating and Sustaining Fairness under Unseen Distribution Shifts.
Haotao Wang, Junyuan Hong, Jiayu Zhou, Zhangyang Wang
Transactions on Machine Learning Research, 2023. Published 2023-01-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10097499/pdf/nihms-1888011.pdf
Abstract: Increasing concerns have been raised about deep learning fairness in recent years. Existing fairness-aware machine learning methods mainly focus on the fairness of in-distribution data. However, in real-world applications, it is common to have a distribution shift between the training and test data. In this paper, we first show that the fairness achieved by existing methods can be easily broken by slight distribution shifts. To solve this problem, we propose a novel fairness learning method termed CUrvature MAtching (CUMA), which achieves robust fairness generalizable to unseen domains with unknown distributional shifts. Specifically, CUMA enforces the model to have similar generalization ability on the majority and minority groups by matching the loss curvature distributions of the two groups. We evaluate our method on three popular fairness datasets. Compared with existing methods, CUMA achieves superior fairness under unseen distribution shifts, without sacrificing either the overall accuracy or the in-distribution fairness.
Citations: 0
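The curvature quantity CUMA matches can be estimated without ever forming a Hessian. The sketch below uses Hutchinson-style Hessian-vector probes via double backpropagation and penalizes the gap between the two groups' mean curvature; the paper matches the full curvature distributions, so treat this as a simplified stand-in with hypothetical helper names.

```python
import torch

def curvature_proxy(model, loss_fn, x, y, n_probes=4):
    """Hutchinson-style curvature estimate E_v[v' H v] of the loss on a
    group's batch, using Rademacher probes v and double backprop."""
    params = [p for p in model.parameters() if p.requires_grad]
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    est = 0.0
    for _ in range(n_probes):
        vs = [torch.randint_like(g, 2) * 2.0 - 1.0 for g in grads]  # +/-1 probes
        gv = sum((g * v).sum() for g, v in zip(grads, vs))
        hvs = torch.autograd.grad(gv, params, retain_graph=True)    # H v
        est = est + sum((h * v).sum() for h, v in zip(hvs, vs))     # v' H v
    return est / n_probes

def cuma_style_penalty(model, loss_fn, batch_majority, batch_minority):
    """Schematic CUMA-style regularizer: penalize the gap between the
    majority and minority groups' loss curvature. (The paper matches the
    curvature *distributions*; matching the means is a simplification.)"""
    c_maj = curvature_proxy(model, loss_fn, *batch_majority)
    c_min = curvature_proxy(model, loss_fn, *batch_minority)
    return (c_maj - c_min).pow(2)
```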
Estimating Potential Outcome Distributions with Collaborating Causal Networks.
Tianhui Zhou, William E Carson, David Carlson
Transactions on Machine Learning Research, 2022. Published 2022-09-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10769464/pdf/
Abstract: Traditional causal inference approaches leverage observational study data to estimate the difference in observed (factual) and unobserved (counterfactual) outcomes for a potential treatment, known as the Conditional Average Treatment Effect (CATE). However, CATE corresponds to the comparison on the first moment alone, and as such may be insufficient in reflecting the full picture of treatment effects. As an alternative, estimating the full potential outcome distributions could provide greater insights. However, existing methods for estimating treatment effect potential outcome distributions often impose restrictive or overly simplistic assumptions about these distributions. Here, we propose Collaborating Causal Networks (CCN), a novel methodology which goes beyond the estimation of CATE alone by learning the full potential outcome distributions. Estimation of outcome distributions via the CCN framework does not require restrictive assumptions about the underlying data-generating process (e.g. Gaussian errors). Additionally, our proposed method facilitates estimation of the utility of each possible treatment and permits individual-specific variation through utility functions (e.g. risk tolerance variability). CCN not only extends outcome estimation beyond the traditional risk difference, but also enables a more comprehensive decision-making process through the definition of flexible comparisons. Under assumptions commonly made in the causal inference literature, we show that CCN learns distributions that asymptotically capture the correct potential outcome distributions. Furthermore, we propose an adjustment approach that is empirically effective in alleviating sample imbalance between treatment groups in observational studies. Finally, we evaluate the performance of CCN in multiple experiments on both synthetic and semi-synthetic data. We demonstrate that CCN learns improved distribution estimates compared to existing Bayesian and deep generative methods, as well as improved decisions with respect to a variety of utility functions.
Citations: 0
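The shift from CATE to full distributions can be illustrated with a much simpler stand-in than CCN itself: fit one quantile-regression head per treatment arm and score arms by a user's utility over the predicted distribution rather than by the difference of conditional means. Everything here (architecture, quantile grid, utility) is an illustrative assumption, not the CCN design.

```python
import torch
import torch.nn as nn

QUANTILES = torch.tensor([0.1, 0.25, 0.5, 0.75, 0.9])

def make_head(d_in, n_q=len(QUANTILES)):
    """One quantile-regression head per treatment arm (hypothetical sizes)."""
    return nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(), nn.Linear(64, n_q))

def pinball_loss(pred_q, y):
    """Quantile (pinball) loss; pred_q: (batch, n_q), y: (batch,)."""
    diff = y.unsqueeze(1) - pred_q
    return torch.mean(torch.maximum(QUANTILES * diff, (QUANTILES - 1.0) * diff))

def expected_utility(pred_q, utility):
    """Crude E[u(Y)] per unit: average the utility over predicted quantiles.
    Picking the arm that maximizes this uses the whole outcome
    distribution, which a CATE point estimate cannot express."""
    return utility(pred_q).mean(dim=1)

# Example of an individual-specific, risk-averse utility over outcomes:
risk_averse = lambda y: -torch.exp(-y)
```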
How Robust is Your Fairness? Evaluating and Sustaining Fairness under Unseen Distribution Shifts
Haotao Wang, Junyuan Hong, Jiayu Zhou, Zhangyang Wang
Transactions on Machine Learning Research. Published 2022-07-04. DOI: 10.48550/arXiv.2207.01168
Abstract: identical to the 2023 entry for this paper above (this record is the arXiv preprint version).
Citations: 5
Design to Build E-learning Application in SMP N 2 Busalangga
Jimi Asmara, Gregorius Rinduh Iriane, Edwin Ariesto Umbu Malahina
Transactions on Machine Learning Research. Published 2021-01-01. DOI: 10.11648/j.mlr.20210602.11
(No abstract available.)
Citations: 0