Trans. Mach. Learn. Res.: Latest Publications

Graph-based Multi-ODE Neural Networks for Spatio-Temporal Traffic Forecasting
Trans. Mach. Learn. Res. Pub Date : 2023-05-30 DOI: 10.48550/arXiv.2305.18687
Zibo Liu, Parshin Shojaee, C. Reddy
{"title":"Graph-based Multi-ODE Neural Networks for Spatio-Temporal Traffic Forecasting","authors":"Zibo Liu, Parshin Shojaee, C. Reddy","doi":"10.48550/arXiv.2305.18687","DOIUrl":"https://doi.org/10.48550/arXiv.2305.18687","url":null,"abstract":"There is a recent surge in the development of spatio-temporal forecasting models in the transportation domain. Long-range traffic forecasting, however, remains a challenging task due to the intricate and extensive spatio-temporal correlations observed in traffic networks. Current works primarily rely on road networks with graph structures and learn representations using graph neural networks (GNNs), but this approach suffers from over-smoothing problem in deep architectures. To tackle this problem, recent methods introduced the combination of GNNs with residual connections or neural ordinary differential equations (ODE). However, current graph ODE models face two key limitations in feature extraction: (1) they lean towards global temporal patterns, overlooking local patterns that are important for unexpected events; and (2) they lack dynamic semantic edges in their architectural design. In this paper, we propose a novel architecture called Graph-based Multi-ODE Neural Networks (GRAM-ODE) which is designed with multiple connective ODE-GNN modules to learn better representations by capturing different views of complex local and global dynamic spatio-temporal dependencies. We also add some techniques like shared weights and divergence constraints into the intermediate layers of distinct ODE-GNN modules to further improve their communication towards the forecasting task. Our extensive set of experiments conducted on six real-world datasets demonstrate the superior performance of GRAM-ODE compared with state-of-the-art baselines as well as the contribution of different components to the overall performance. The code is available at https://github.com/zbliu98/GRAM-ODE","PeriodicalId":432739,"journal":{"name":"Trans. Mach. Learn. Res.","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131656616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Understanding Noise-Augmented Training for Randomized Smoothing
Trans. Mach. Learn. Res. Pub Date : 2023-05-08 DOI: 10.48550/arXiv.2305.04746
Ambar Pal, Jeremias Sulam
{"title":"Understanding Noise-Augmented Training for Randomized Smoothing","authors":"Ambar Pal, Jeremias Sulam","doi":"10.48550/arXiv.2305.04746","DOIUrl":"https://doi.org/10.48550/arXiv.2305.04746","url":null,"abstract":"Randomized smoothing is a technique for providing provable robustness guarantees against adversarial attacks while making minimal assumptions about a classifier. This method relies on taking a majority vote of any base classifier over multiple noise-perturbed inputs to obtain a smoothed classifier, and it remains the tool of choice to certify deep and complex neural network models. Nonetheless, non-trivial performance of such smoothed classifier crucially depends on the base model being trained on noise-augmented data, i.e., on a smoothed input distribution. While widely adopted in practice, it is still unclear how this noisy training of the base classifier precisely affects the risk of the robust smoothed classifier, leading to heuristics and tricks that are poorly understood. In this work we analyze these trade-offs theoretically in a binary classification setting, proving that these common observations are not universal. We show that, without making stronger distributional assumptions, no benefit can be expected from predictors trained with noise-augmentation, and we further characterize distributions where such benefit is obtained. Our analysis has direct implications to the practical deployment of randomized smoothing, and we illustrate some of these via experiments on CIFAR-10 and MNIST, as well as on synthetic datasets.","PeriodicalId":432739,"journal":{"name":"Trans. Mach. Learn. Res.","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132422564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Estimating the Density Ratio between Distributions with High Discrepancy using Multinomial Logistic Regression
Trans. Mach. Learn. Res. Pub Date : 2023-05-01 DOI: 10.48550/arXiv.2305.00869
Akash Srivastava, Seung-Jun Han, Kai Xu, Benjamin Rhodes, Michael U Gutmann
{"title":"Estimating the Density Ratio between Distributions with High Discrepancy using Multinomial Logistic Regression","authors":"Akash Srivastava, Seung-Jun Han, Kai Xu, Benjamin Rhodes, Michael U Gutmann","doi":"10.48550/arXiv.2305.00869","DOIUrl":"https://doi.org/10.48550/arXiv.2305.00869","url":null,"abstract":"Functions of the ratio of the densities $p/q$ are widely used in machine learning to quantify the discrepancy between the two distributions $p$ and $q$. For high-dimensional distributions, binary classification-based density ratio estimators have shown great promise. However, when densities are well separated, estimating the density ratio with a binary classifier is challenging. In this work, we show that the state-of-the-art density ratio estimators perform poorly on well-separated cases and demonstrate that this is due to distribution shifts between training and evaluation time. We present an alternative method that leverages multi-class classification for density ratio estimation and does not suffer from distribution shift issues. The method uses a set of auxiliary densities ${m_k}_{k=1}^K$ and trains a multi-class logistic regression to classify the samples from $p, q$, and ${m_k}_{k=1}^K$ into $K+2$ classes. We show that if these auxiliary densities are constructed such that they overlap with $p$ and $q$, then a multi-class logistic regression allows for estimating $log p/q$ on the domain of any of the $K+2$ distributions and resolves the distribution shift problems of the current state-of-the-art methods. We compare our method to state-of-the-art density ratio estimators on both synthetic and real datasets and demonstrate its superior performance on the tasks of density ratio estimation, mutual information estimation, and representation learning. Code: https://www.blackswhan.com/mdre/","PeriodicalId":432739,"journal":{"name":"Trans. Mach. Learn. Res.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128972325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Proximal Curriculum for Reinforcement Learning Agents
Trans. Mach. Learn. Res. Pub Date : 2023-04-25 DOI: 10.48550/arXiv.2304.12877
Georgios Tzannetos, Bárbara Gomes Ribeiro, Parameswaran Kamalaruban, A. Singla
{"title":"Proximal Curriculum for Reinforcement Learning Agents","authors":"Georgios Tzannetos, Bárbara Gomes Ribeiro, Parameswaran Kamalaruban, A. Singla","doi":"10.48550/arXiv.2304.12877","DOIUrl":"https://doi.org/10.48550/arXiv.2304.12877","url":null,"abstract":"We consider the problem of curriculum design for reinforcement learning (RL) agents in contextual multi-task settings. Existing techniques on automatic curriculum design typically require domain-specific hyperparameter tuning or have limited theoretical underpinnings. To tackle these limitations, we design our curriculum strategy, ProCuRL, inspired by the pedagogical concept of Zone of Proximal Development (ZPD). ProCuRL captures the intuition that learning progress is maximized when picking tasks that are neither too hard nor too easy for the learner. We mathematically derive ProCuRL by analyzing two simple learning settings. We also present a practical variant of ProCuRL that can be directly integrated with deep RL frameworks with minimal hyperparameter tuning. Experimental results on a variety of domains demonstrate the effectiveness of our curriculum strategy over state-of-the-art baselines in accelerating the training process of deep RL agents.","PeriodicalId":432739,"journal":{"name":"Trans. Mach. Learn. Res.","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127644013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A Study of Biologically Plausible Neural Network: The Role and Interactions of Brain-Inspired Mechanisms in Continual Learning
Trans. Mach. Learn. Res. Pub Date : 2023-04-13 DOI: 10.48550/arXiv.2304.06738
F. Sarfraz, E. Arani, Bahram Zonooz
{"title":"A Study of Biologically Plausible Neural Network: The Role and Interactions of Brain-Inspired Mechanisms in Continual Learning","authors":"F. Sarfraz, E. Arani, Bahram Zonooz","doi":"10.48550/arXiv.2304.06738","DOIUrl":"https://doi.org/10.48550/arXiv.2304.06738","url":null,"abstract":"Humans excel at continually acquiring, consolidating, and retaining information from an ever-changing environment, whereas artificial neural networks (ANNs) exhibit catastrophic forgetting. There are considerable differences in the complexity of synapses, the processing of information, and the learning mechanisms in biological neural networks and their artificial counterparts, which may explain the mismatch in performance. We consider a biologically plausible framework that constitutes separate populations of exclusively excitatory and inhibitory neurons that adhere to Dale's principle, and the excitatory pyramidal neurons are augmented with dendritic-like structures for context-dependent processing of stimuli. We then conduct a comprehensive study on the role and interactions of different mechanisms inspired by the brain, including sparse non-overlapping representations, Hebbian learning, synaptic consolidation, and replay of past activations that accompanied the learning event. Our study suggests that the employing of multiple complementary mechanisms in a biologically plausible architecture, similar to the brain, may be effective in enabling continual learning in ANNs.","PeriodicalId":432739,"journal":{"name":"Trans. Mach. Learn. Res.","volume":"2023 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129526345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Infinitely wide limits for deep Stable neural networks: sub-linear, linear and super-linear activation functions
Trans. Mach. Learn. Res. Pub Date : 2023-04-08 DOI: 10.48550/arXiv.2304.04008
Alberto Bordino, S. Favaro, S. Fortini
{"title":"Infinitely wide limits for deep Stable neural networks: sub-linear, linear and super-linear activation functions","authors":"Alberto Bordino, S. Favaro, S. Fortini","doi":"10.48550/arXiv.2304.04008","DOIUrl":"https://doi.org/10.48550/arXiv.2304.04008","url":null,"abstract":"There is a growing literature on the study of large-width properties of deep Gaussian neural networks (NNs), i.e. deep NNs with Gaussian-distributed parameters or weights, and Gaussian stochastic processes. Motivated by some empirical and theoretical studies showing the potential of replacing Gaussian distributions with Stable distributions, namely distributions with heavy tails, in this paper we investigate large-width properties of deep Stable NNs, i.e. deep NNs with Stable-distributed parameters. For sub-linear activation functions, a recent work has characterized the infinitely wide limit of a suitable rescaled deep Stable NN in terms of a Stable stochastic process, both under the assumption of a ``joint growth\"and under the assumption of a ``sequential growth\"of the width over the NN's layers. Here, assuming a ``sequential growth\"of the width, we extend such a characterization to a general class of activation functions, which includes sub-linear, asymptotically linear and super-linear functions. As a novelty with respect to previous works, our results rely on the use of a generalized central limit theorem for heavy tails distributions, which allows for an interesting unified treatment of infinitely wide limits for deep Stable NNs. Our study shows that the scaling of Stable NNs and the stability of their infinitely wide limits may depend on the choice of the activation function, bringing out a critical difference with respect to the Gaussian setting.","PeriodicalId":432739,"journal":{"name":"Trans. Mach. Learn. Res.","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114574017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Practicality of generalization guarantees for unsupervised domain adaptation with neural networks
Trans. Mach. Learn. Res. Pub Date : 2023-03-15 DOI: 10.48550/arXiv.2303.08720
Adam Breitholtz, Fredrik D. Johansson
{"title":"Practicality of generalization guarantees for unsupervised domain adaptation with neural networks","authors":"Adam Breitholtz, Fredrik D. Johansson","doi":"10.48550/arXiv.2303.08720","DOIUrl":"https://doi.org/10.48550/arXiv.2303.08720","url":null,"abstract":"Understanding generalization is crucial to confidently engineer and deploy machine learning models, especially when deployment implies a shift in the data domain. For such domain adaptation problems, we seek generalization bounds which are tractably computable and tight. If these desiderata can be reached, the bounds can serve as guarantees for adequate performance in deployment. However, in applications where deep neural networks are the models of choice, deriving results which fulfill these remains an unresolved challenge; most existing bounds are either vacuous or has non-estimable terms, even in favorable conditions. In this work, we evaluate existing bounds from the literature with potential to satisfy our desiderata on domain adaptation image classification tasks, where deep neural networks are preferred. We find that all bounds are vacuous and that sample generalization terms account for much of the observed looseness, especially when these terms interact with measures of domain shift. To overcome this and arrive at the tightest possible results, we combine each bound with recent data-dependent PAC-Bayes analysis, greatly improving the guarantees. We find that, when domain overlap can be assumed, a simple importance weighting extension of previous work provides the tightest estimable bound. Finally, we study which terms dominate the bounds and identify possible directions for further improvement.","PeriodicalId":432739,"journal":{"name":"Trans. Mach. Learn. Res.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122280316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Supervised Feature Selection with Neuron Evolution in Sparse Neural Networks
Trans. Mach. Learn. Res. Pub Date : 2023-03-10 DOI: 10.48550/arXiv.2303.07200
Zahra Atashgahi, Xuhao Zhang, Neil Kichler, Shiwei Liu, Lu Yin, Mykola Pechenizkiy, Raymond N. J. Veldhuis, D. Mocanu
{"title":"Supervised Feature Selection with Neuron Evolution in Sparse Neural Networks","authors":"Zahra Atashgahi, Xuhao Zhang, Neil Kichler, Shiwei Liu, Lu Yin, Mykola Pechenizkiy, Raymond N. J. Veldhuis, D. Mocanu","doi":"10.48550/arXiv.2303.07200","DOIUrl":"https://doi.org/10.48550/arXiv.2303.07200","url":null,"abstract":"Feature selection that selects an informative subset of variables from data not only enhances the model interpretability and performance but also alleviates the resource demands. Recently, there has been growing attention on feature selection using neural networks. However, existing methods usually suffer from high computational costs when applied to high-dimensional datasets. In this paper, inspired by evolution processes, we propose a novel resource-efficient supervised feature selection method using sparse neural networks, named enquote{NeuroFS}. By gradually pruning the uninformative features from the input layer of a sparse neural network trained from scratch, NeuroFS derives an informative subset of features efficiently. By performing several experiments on $11$ low and high-dimensional real-world benchmarks of different types, we demonstrate that NeuroFS achieves the highest ranking-based score among the considered state-of-the-art supervised feature selection models. The code is available on GitHub.","PeriodicalId":432739,"journal":{"name":"Trans. Mach. Learn. Res.","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126590466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Containing a spread through sequential learning: to exploit or to explore?
Trans. Mach. Learn. Res. Pub Date : 2023-03-01 DOI: 10.48550/arXiv.2303.00141
Xingran Chen, Hesam Nikpey, Jungyeol Kim, S. Sarkar, S. S. Bidokhti
{"title":"Containing a spread through sequential learning: to exploit or to explore?","authors":"Xingran Chen, Hesam Nikpey, Jungyeol Kim, S. Sarkar, S. S. Bidokhti","doi":"10.48550/arXiv.2303.00141","DOIUrl":"https://doi.org/10.48550/arXiv.2303.00141","url":null,"abstract":"The spread of an undesirable contact process, such as an infectious disease (e.g. COVID-19), is contained through testing and isolation of infected nodes. The temporal and spatial evolution of the process (along with containment through isolation) render such detection as fundamentally different from active search detection strategies. In this work, through an active learning approach, we design testing and isolation strategies to contain the spread and minimize the cumulative infections under a given test budget. We prove that the objective can be optimized, with performance guarantees, by greedily selecting the nodes to test. We further design reward-based methodologies that effectively minimize an upper bound on the cumulative infections and are computationally more tractable in large networks. These policies, however, need knowledge about the nodes' infection probabilities which are dynamically changing and have to be learned by sequential testing. We develop a message-passing framework for this purpose and, building on that, show novel tradeoffs between exploitation of knowledge through reward-based heuristics and exploration of the unknown through a carefully designed probabilistic testing. The tradeoffs are fundamentally distinct from the classical counterparts under active search or multi-armed bandit problems (MABs). We provably show the necessity of exploration in a stylized network and show through simulations that exploration can outperform exploitation in various synthetic and real-data networks depending on the parameters of the network and the spread.","PeriodicalId":432739,"journal":{"name":"Trans. Mach. Learn. Res.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126398529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
U-Statistics for Importance-Weighted Variational Inference
Trans. Mach. Learn. Res. Pub Date : 2023-02-27 DOI: 10.48550/arXiv.2302.13918
Javier Burroni
{"title":"U-Statistics for Importance-Weighted Variational Inference","authors":"Javier Burroni","doi":"10.48550/arXiv.2302.13918","DOIUrl":"https://doi.org/10.48550/arXiv.2302.13918","url":null,"abstract":"We propose the use of U-statistics to reduce variance for gradient estimation in importance-weighted variational inference. The key observation is that, given a base gradient estimator that requires $m>1$ samples and a total of $n>m$ samples to be used for estimation, lower variance is achieved by averaging the base estimator on overlapping batches of size $m$ than disjoint batches, as currently done. We use classical U-statistic theory to analyze the variance reduction, and propose novel approximations with theoretical guarantees to ensure computational efficiency. We find empirically that U-statistic variance reduction can lead to modest to significant improvements in inference performance on a range of models, with little computational cost.","PeriodicalId":432739,"journal":{"name":"Trans. Mach. Learn. Res.","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126062299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1