Transactions on Machine Learning Research: Latest Articles

Beyond Distribution Shift: Spurious Features Through the Lens of Training Dynamics.
Nihal Murali, Aahlad Puli, Ke Yu, Rajesh Ranganath, Kayhan Batmanghelich
Transactions on Machine Learning Research, 2023. Published 2023-10-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11029547/pdf/
Abstract: Deep Neural Networks (DNNs) are prone to learning spurious features that correlate with the label during training but are irrelevant to the learning problem. This hurts model generalization and poses problems when deploying them in safety-critical applications. This paper aims to better understand the effects of spurious features through the lens of the learning dynamics of the internal neurons during the training process. We make the following observations: (1) While previous works highlight the harmful effects of spurious features on the generalization ability of DNNs, we emphasize that not all spurious features are harmful. Spurious features can be "benign" or "harmful" depending on whether they are "harder" or "easier" to learn than the core features for a given model. This definition is model and dataset dependent. (2) We build upon this premise and use instance difficulty methods (like Prediction Depth (Baldock et al., 2021)) to quantify "easiness" for a given model and to identify this behavior during the training phase. (3) We empirically show that the harmful spurious features can be detected by observing the learning dynamics of the DNN's early layers. In other words, easy features learned by the initial layers of a DNN early during training can (potentially) hurt model generalization. We verify our claims on medical and vision datasets, both simulated and real, and justify the empirical success of our hypothesis by showing the theoretical connections between Prediction Depth and information-theoretic concepts like 𝒱-usable information (Ethayarajh et al., 2021). Lastly, our experiments show that monitoring only accuracy during training (as is common in machine learning pipelines) is insufficient to detect spurious features. We therefore highlight the need for monitoring early training dynamics using suitable instance difficulty metrics.
Citations: 0
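The instance-difficulty signal this abstract leans on can be made concrete. Below is a minimal sketch of a Prediction-Depth-style probe in the spirit of Baldock et al. (2021), using k-NN probes on per-layer features; the feature arrays, the probe value of k, and the "first agreeing layer" simplification are illustrative assumptions, not the paper's exact protocol.

```python
# Sketch: an example's depth is (roughly) the first layer whose k-NN probe
# on that layer's features already agrees with the network's final
# prediction. Lower depth = "easier" example for this model.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def prediction_depth(train_feats_per_layer, train_labels,
                     example_feats_per_layer, final_prediction, k=30):
    """train_feats_per_layer: list of (n_train, d_l) arrays, one per layer.
    example_feats_per_layer: list of (d_l,) arrays for a single example.
    final_prediction: the network's output label for that example."""
    for depth, (train_feats, ex_feat) in enumerate(
            zip(train_feats_per_layer, example_feats_per_layer)):
        probe = KNeighborsClassifier(n_neighbors=k).fit(train_feats, train_labels)
        if probe.predict(ex_feat.reshape(1, -1))[0] == final_prediction:
            return depth  # earliest layer that already "knows" the answer
    return len(train_feats_per_layer)  # not resolved before the output layer
```

Tracking this depth over training epochs is the kind of early-layer monitoring the authors argue accuracy curves alone cannot provide: examples resolved by the earliest layers early in training are candidates for harmful spurious features.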
RIFLE: Imputation and Robust Inference from Low Order Marginals.
Sina Baharlouei, Kelechi Ogudu, Sze-Chuan Suen, Meisam Razaviyayn
Transactions on Machine Learning Research, 2023. Published 2023-09-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10977932/pdf/
Abstract: The ubiquity of missing values in real-world datasets poses a challenge for statistical inference and can prevent similar datasets from being analyzed in the same study, precluding many existing datasets from being used for new analyses. While an extensive collection of packages and algorithms have been developed for data imputation, the overwhelming majority perform poorly if there are many missing values and low sample sizes, which are unfortunately common characteristics in empirical data. Such low-accuracy estimations adversely affect the performance of downstream statistical models. We develop a statistical inference framework for regression and classification in the presence of missing data without imputation. Our framework, RIFLE (Robust InFerence via Low-order moment Estimations), estimates low-order moments of the underlying data distribution with corresponding confidence intervals to learn a distributionally robust model. We specialize our framework to linear regression and normal discriminant analysis, and we provide convergence and performance guarantees. This framework can also be adapted to impute missing data. In numerical experiments, we compare RIFLE to several state-of-the-art approaches (including MICE, Amelia, MissForest, KNN-imputer, MIDA, and Mean Imputer) for imputation and inference in the presence of missing values. Our experiments demonstrate that RIFLE outperforms other benchmark algorithms when the percentage of missing values is high and/or when the number of data points is relatively small. RIFLE is publicly available at https://github.com/optimization-for-data-driven-science/RIFLE.
Citations: 0
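RIFLE's starting point, estimating low-order moments directly from incomplete data, can be sketched in a few lines. The snippet below computes means and a pairwise-available covariance matrix from a NaN-coded matrix; it covers only this moment-estimation step, not RIFLE's distributionally robust optimization or its confidence intervals.

```python
import numpy as np

def pairwise_moments(X):
    """Estimate the mean vector and covariance matrix of NaN-coded data,
    using, for each pair of features, all rows where both are observed.
    Note: the resulting matrix is not guaranteed to be positive
    semidefinite, one reason a robust layer on top (as in RIFLE) helps."""
    n, d = X.shape
    observed = ~np.isnan(X)
    mean = np.nanmean(X, axis=0)
    cov = np.zeros((d, d))
    for i in range(d):
        for j in range(i, d):
            both = observed[:, i] & observed[:, j]
            if both.sum() > 1:
                cov[i, j] = np.mean((X[both, i] - mean[i]) * (X[both, j] - mean[j]))
                cov[j, i] = cov[i, j]
    return mean, cov
```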
mL-BFGS: A Momentum-based L-BFGS for Distributed Large-Scale Neural Network Optimization.
Yue Niu, Zalan Fabian, Sunwoo Lee, Mahdi Soltanolkotabi, Salman Avestimehr
Transactions on Machine Learning Research, 2023. Published 2023-08-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12393816/pdf/
Abstract: Quasi-Newton methods still face significant challenges in training large-scale neural networks due to the additional compute cost of Hessian-related computations and instability issues in stochastic training. L-BFGS, a well-known method that efficiently approximates the Hessian using the history of parameter and gradient changes, suffers from convergence instability in stochastic training. So far, attempts to adapt L-BFGS to large-scale stochastic training incur considerable extra overhead, which offsets its convergence benefits in wall-clock time. In this paper, we propose mL-BFGS, a lightweight momentum-based L-BFGS algorithm that paves the way for quasi-Newton (QN) methods in large-scale distributed deep neural network (DNN) optimization. mL-BFGS introduces a nearly cost-free momentum scheme into the L-BFGS update, greatly reducing stochastic noise in the Hessian approximation and thereby stabilizing convergence during stochastic optimization. For model training at a large scale, mL-BFGS approximates a block-wise Hessian, thus distributing compute and memory costs across all computing nodes. We provide a supporting convergence analysis for mL-BFGS in stochastic settings. To investigate mL-BFGS's potential in large-scale DNN training, we train benchmark neural models using mL-BFGS and compare performance with baselines (SGD, Adam, and other quasi-Newton methods). Results show that mL-BFGS achieves both noticeable iteration-wise and wall-clock speedup.
Citations: 0
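The core mechanism, damping stochastic noise in the curvature estimates with momentum before they enter the L-BFGS update, can be sketched as follows. This is a schematic single-block illustration with made-up hyperparameters and my own guess at a simple smoothing scheme, not the paper's exact block-wise distributed algorithm.

```python
import numpy as np

class MomentumLBFGS:
    """Schematic mL-BFGS-style update: smooth iterates and gradients with
    momentum, then build the L-BFGS (s, y) curvature pairs from the
    smoothed sequences, damping stochastic noise in the implicit Hessian
    approximation. Single-block sketch, illustrative only."""

    def __init__(self, history=10, beta=0.9, lr=0.1):
        self.s_list, self.y_list = [], []
        self.history, self.beta, self.lr = history, beta, lr
        self.w_avg = self.g_avg = None

    def step(self, w, grad):
        if self.w_avg is None:                 # first call: initialize averages
            self.w_avg, self.g_avg = w.copy(), grad.copy()
            return w - self.lr * grad
        w_prev, g_prev = self.w_avg, self.g_avg
        self.w_avg = self.beta * w_prev + (1 - self.beta) * w
        self.g_avg = self.beta * g_prev + (1 - self.beta) * grad
        s, y = self.w_avg - w_prev, self.g_avg - g_prev
        if s @ y > 1e-10:                      # keep only pairs with s'y > 0
            self.s_list = (self.s_list + [s])[-self.history:]
            self.y_list = (self.y_list + [y])[-self.history:]
        if not self.s_list:
            return w - self.lr * grad          # SGD fallback until history exists
        return w + self.lr * self._two_loop_direction(grad)

    def _two_loop_direction(self, grad):
        """Standard L-BFGS two-loop recursion over the stored (s, y) pairs."""
        q, stack = grad.copy(), []
        for s, y in zip(reversed(self.s_list), reversed(self.y_list)):
            rho = 1.0 / (y @ s)
            alpha = rho * (s @ q)
            q = q - alpha * y
            stack.append((alpha, rho, s, y))
        s_last, y_last = self.s_list[-1], self.y_list[-1]
        r = (s_last @ y_last) / (y_last @ y_last) * q
        for alpha, rho, s, y in reversed(stack):
            r = r + (alpha - rho * (y @ r)) * s
        return -r
```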
On the Convergence and Calibration of Deep Learning with Differential Privacy.
Zhiqi Bu, Hua Wang, Zongyu Dai, Qi Long
Transactions on Machine Learning Research, 2023. Published 2023-06-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10982613/pdf/
Abstract: Differentially private (DP) training preserves data privacy, usually at the cost of slower convergence (and thus lower accuracy) and more severe miscalibration than its non-private counterpart. To analyze the convergence of DP training, we formulate a continuous-time analysis through the lens of the neural tangent kernel (NTK), which characterizes the per-sample gradient clipping and the noise addition in DP training, for arbitrary network architectures and loss functions. Interestingly, we show that the noise addition only affects the privacy risk but not the convergence or calibration, whereas the per-sample gradient clipping (under both flat and layerwise clipping styles) only affects the convergence and calibration. Furthermore, we observe that DP models trained with a small clipping norm usually achieve the best accuracy but are poorly calibrated and thus unreliable. In sharp contrast, DP models trained with a large clipping norm enjoy the same privacy guarantee and similar accuracy but are significantly more calibrated. Our code can be found at https://github.com/woodyx218/opacus_global_clipping.
Citations: 0
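The mechanism under study, per-sample clipping plus Gaussian noise, looks roughly like the following flat-clipping sketch. The naive per-sample loop is for clarity only (production code such as Opacus vectorizes it), and the function and parameter names are illustrative.

```python
import torch

def dp_sgd_step(model, loss_fn, xb, yb, clip_norm=1.0, noise_mult=1.0, lr=0.1):
    """One DP-SGD step with flat per-sample clipping: clip each per-sample
    gradient to clip_norm, sum, add Gaussian noise scaled by
    noise_mult * clip_norm, then take an SGD step. Per this paper's
    analysis, the clipping drives convergence and calibration, while the
    noise affects only the privacy accounting."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(xb, yb):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (total_norm + 1e-12), max=1.0)  # flat clipping
        for acc, g in zip(summed, grads):
            acc.add_(scale * g)
    with torch.no_grad():
        for p, acc in zip(params, summed):
            noise = noise_mult * clip_norm * torch.randn_like(p)
            p.add_(-(lr / len(xb)) * (acc + noise))
```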
Traditional Machine Learning Models for Building Energy Performance Prediction: A Comparative Research
Zeyu Wu, Hongyang He
Transactions on Machine Learning Research. Published 2023-05-29. DOI: 10.11648/j.mlr.20230801.11
Abstract: A large proportion of total energy consumption is caused by buildings. Accurately predicting the heating and cooling demand of a building is crucial in the initial design phase in order to determine the most efficient solution from various designs. In this paper, in order to explore the effectiveness of basic machine learning algorithms on this problem, different machine learning models were used to estimate the heating and cooling loads of buildings, utilizing data on the energy efficiency of buildings. Notably, this paper also discusses the performance of deep neural network prediction models and concludes that, among traditional machine learning algorithms, GradientBoostingRegressor achieves the best predictions, with its heating prediction score reaching 0.998553. Compared with it, our machine learning algorithm HB-Regressor reaches higher prediction accuracy, 0.998672 and 0.995153 respectively, but its fitting speed is not as fast as the GradientBoostingRegressor algorithm's.
Citations: 0
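For context, the traditional baseline the paper singles out takes only a few lines with scikit-learn. The data below is a synthetic stand-in shaped like the UCI Energy Efficiency dataset (768 samples, 8 features, heating and cooling targets); the hyperparameters are illustrative, not the paper's settings, and the abstract's figures are read here as R² scores.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the UCI Energy Efficiency data (768 x 8).
rng = np.random.default_rng(0)
X = rng.uniform(size=(768, 8))
y_heat = X @ rng.uniform(1, 3, size=8) + rng.normal(scale=0.1, size=768)
y_cool = X @ rng.uniform(1, 3, size=8) + rng.normal(scale=0.1, size=768)

X_tr, X_te, yh_tr, yh_te, yc_tr, yc_te = train_test_split(
    X, y_heat, y_cool, test_size=0.2, random_state=0)

# One regressor per target, as in the heating/cooling load setup.
for name, y_tr, y_te in [("heating", yh_tr, yh_te), ("cooling", yc_tr, yc_te)]:
    gbr = GradientBoostingRegressor(n_estimators=500, learning_rate=0.05,
                                    max_depth=3, random_state=0)
    gbr.fit(X_tr, y_tr)
    print(f"{name} load R^2: {gbr.score(X_te, y_te):.6f}")
```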
Automatic Indexing of Digital Objects Through Learning from User Data
C. Leung, Yuanxi Li
Transactions on Machine Learning Research. Published 2023-01-31. DOI: 10.11648/j.mlr.20220702.12
(No abstract available.)
Citations: 0
How Robust is Your Fairness? Evaluating and Sustaining Fairness under Unseen Distribution Shifts.
Haotao Wang, Junyuan Hong, Jiayu Zhou, Zhangyang Wang
Transactions on Machine Learning Research, 2023. Published 2023-01-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10097499/pdf/nihms-1888011.pdf
Abstract: Increasing concerns have been raised about deep learning fairness in recent years. Existing fairness-aware machine learning methods mainly focus on the fairness of in-distribution data. However, in real-world applications, it is common to have a distribution shift between the training and test data. In this paper, we first show that the fairness achieved by existing methods can be easily broken by slight distribution shifts. To solve this problem, we propose a novel fairness learning method termed CUrvature MAtching (CUMA), which achieves robust fairness generalizable to unseen domains with unknown distributional shifts. Specifically, CUMA enforces the model to have similar generalization ability on the majority and minority groups by matching the loss curvature distributions of the two groups. We evaluate our method on three popular fairness datasets. Compared with existing methods, CUMA achieves superior fairness under unseen distribution shifts, without sacrificing either the overall accuracy or the in-distribution fairness.
Citations: 0
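The curvature quantity CUMA matches can be estimated without ever forming a Hessian. The sketch below uses Hutchinson-style Hessian-vector probes via double backpropagation and penalizes the gap between the two groups' mean curvature; the paper matches the full curvature distributions, so treat this as a simplified stand-in with hypothetical helper names.

```python
import torch

def curvature_proxy(model, loss_fn, x, y, n_probes=4):
    """Hutchinson-style curvature estimate E_v[v' H v] of the loss on a
    group's batch, using Rademacher probes v and double backprop."""
    params = [p for p in model.parameters() if p.requires_grad]
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    est = 0.0
    for _ in range(n_probes):
        vs = [torch.randint_like(g, 2) * 2.0 - 1.0 for g in grads]  # +/-1 probes
        gv = sum((g * v).sum() for g, v in zip(grads, vs))
        hvs = torch.autograd.grad(gv, params, retain_graph=True)    # H v
        est = est + sum((h * v).sum() for h, v in zip(hvs, vs))     # v' H v
    return est / n_probes

def cuma_style_penalty(model, loss_fn, batch_majority, batch_minority):
    """Schematic CUMA-style regularizer: penalize the gap between the
    majority and minority groups' loss curvature. (The paper matches the
    curvature *distributions*; matching the means is a simplification.)"""
    c_maj = curvature_proxy(model, loss_fn, *batch_majority)
    c_min = curvature_proxy(model, loss_fn, *batch_minority)
    return (c_maj - c_min).pow(2)
```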
Estimating Potential Outcome Distributions with Collaborating Causal Networks.
Tianhui Zhou, William E Carson, David Carlson
Transactions on Machine Learning Research, 2022. Published 2022-09-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10769464/pdf/
Abstract: Traditional causal inference approaches leverage observational study data to estimate the difference in observed (factual) and unobserved (counterfactual) outcomes for a potential treatment, known as the Conditional Average Treatment Effect (CATE). However, CATE corresponds to the comparison on the first moment alone, and as such may be insufficient in reflecting the full picture of treatment effects. As an alternative, estimating the full potential outcome distributions could provide greater insights. However, existing methods for estimating treatment effect potential outcome distributions often impose restrictive or overly simplistic assumptions about these distributions. Here, we propose Collaborating Causal Networks (CCN), a novel methodology which goes beyond the estimation of CATE alone by learning the full potential outcome distributions. Estimation of outcome distributions via the CCN framework does not require restrictive assumptions about the underlying data-generating process (e.g. Gaussian errors). Additionally, our proposed method facilitates estimation of the utility of each possible treatment and permits individual-specific variation through utility functions (e.g. risk tolerance variability). CCN not only extends outcome estimation beyond the traditional risk difference, but also enables a more comprehensive decision-making process through the definition of flexible comparisons. Under assumptions commonly made in the causal inference literature, we show that CCN learns distributions that asymptotically capture the correct potential outcome distributions. Furthermore, we propose an adjustment approach that is empirically effective in alleviating sample imbalance between treatment groups in observational studies. Finally, we evaluate the performance of CCN in multiple experiments on both synthetic and semi-synthetic data. We demonstrate that CCN learns improved distribution estimates compared to existing Bayesian and deep generative methods, as well as improved decisions with respect to a variety of utility functions.
Citations: 0
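The shift from CATE to full distributions can be illustrated with a much simpler stand-in than CCN itself: fit one quantile-regression head per treatment arm and score arms by a user's utility over the predicted distribution rather than by the difference of conditional means. Everything here (architecture, quantile grid, utility) is an illustrative assumption, not the CCN design.

```python
import torch
import torch.nn as nn

QUANTILES = torch.tensor([0.1, 0.25, 0.5, 0.75, 0.9])

def make_head(d_in, n_q=len(QUANTILES)):
    """One quantile-regression head per treatment arm (hypothetical sizes)."""
    return nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(), nn.Linear(64, n_q))

def pinball_loss(pred_q, y):
    """Quantile (pinball) loss; pred_q: (batch, n_q), y: (batch,)."""
    diff = y.unsqueeze(1) - pred_q
    return torch.mean(torch.maximum(QUANTILES * diff, (QUANTILES - 1.0) * diff))

def expected_utility(pred_q, utility):
    """Crude E[u(Y)] per unit: average the utility over predicted quantiles.
    Picking the arm that maximizes this uses the whole outcome
    distribution, which a CATE point estimate cannot express."""
    return utility(pred_q).mean(dim=1)

# Example of an individual-specific, risk-averse utility over outcomes:
risk_averse = lambda y: -torch.exp(-y)
```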
How Robust is Your Fairness? Evaluating and Sustaining Fairness under Unseen Distribution Shifts
Haotao Wang, Junyuan Hong, Jiayu Zhou, Zhangyang Wang
Transactions on Machine Learning Research. Published 2022-07-04. DOI: 10.48550/arXiv.2207.01168
Abstract: identical to the 2023 entry for this paper above (this record is the arXiv preprint version).
Citations: 5
Design to Build E-learning Application in SMP N 2 Busalangga
Jimi Asmara, Gregorius Rinduh Iriane, Edwin Ariesto Umbu Malahina
Transactions on Machine Learning Research. Published 2021-01-01. DOI: 10.11648/j.mlr.20210602.11
(No abstract available.)
Citations: 0