Machine Learning. Pub Date: 2024-07-02. DOI: 10.1007/s10994-024-06575-2
Yinghua Yao, Yuangang Pan, Jing Li, Ivor Tsang, Xin Yao
{"title":"PROUD: PaRetO-gUided diffusion model for multi-objective generation","authors":"Yinghua Yao, Yuangang Pan, Jing Li, Ivor Tsang, Xin Yao","doi":"10.1007/s10994-024-06575-2","DOIUrl":"https://doi.org/10.1007/s10994-024-06575-2","url":null,"abstract":"<p>Recent advancements in the realm of deep generative models focus on generating samples that satisfy multiple desired properties. However, prevalent approaches optimize these property functions independently, thus omitting the trade-offs among them. In addition, the property optimization is often improperly integrated into the generative models, resulting in an unnecessary compromise on generation quality (i.e., the quality of generated samples). To address these issues, we formulate a constrained optimization problem. It seeks to optimize generation quality while ensuring that generated samples reside at the Pareto front of multiple property objectives. Such a formulation enables the generation of samples that cannot be further improved simultaneously on the conflicting property functions and preserves good quality of generated samples.Building upon this formulation, we introduce the ParetO-gUided Diffusion model (PROUD), wherein the gradients in the denoising process are dynamically adjusted to enhance generation quality while the generated samples adhere to Pareto optimality. 
Experimental evaluations on image generation and protein generation tasks demonstrate that our PROUD consistently maintains superior generation quality while approaching Pareto optimality across multiple property functions compared to various baselines</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"13 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141525355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine Learning. Pub Date: 2024-06-27. DOI: 10.1007/s10994-024-06541-y
Ganyu Wang, Qingsong Zhang, Xiang Li, Boyu Wang, Bin Gu, Charles X. Ling
{"title":"Secure and fast asynchronous Vertical Federated Learning via cascaded hybrid optimization","authors":"Ganyu Wang, Qingsong Zhang, Xiang Li, Boyu Wang, Bin Gu, Charles X. Ling","doi":"10.1007/s10994-024-06541-y","DOIUrl":"https://doi.org/10.1007/s10994-024-06541-y","url":null,"abstract":"<p>Vertical Federated Learning (VFL) is gaining increasing attention due to its ability to enable multiple parties to collaboratively train a privacy-preserving model using vertically partitioned data. Recent research has highlighted the advantages of using zeroth-order optimization (ZOO) in developing practical VFL algorithms. However, a significant drawback of ZOO-based VFL is its slow convergence rate, which limits its applicability in handling large modern models. To address this issue, we propose a cascaded hybrid optimization method for VFL. In this method, the downstream models (clients) are trained using ZOO to ensure privacy and prevent the sharing of internal information. Simultaneously, the upstream model (server) is updated locally using first-order optimization, which significantly improves the convergence rate. This approach allows for the training of large models without compromising privacy and security. We theoretically prove that our VFL method achieves faster convergence compared to ZOO-based VFL because the convergence rate of our framework is not limited by the size of the server model, making it effective for training large models. Extensive experiments demonstrate that our method achieves faster convergence than ZOO-based VFL while maintaining an equivalent level of privacy protection. 
Additionally, we demonstrate the feasibility of training large models using our method.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"44 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141525356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
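The client-side building block the abstract refers to, zeroth-order optimization, estimates gradients from function values alone, so no internal gradients need to be shared. Below is a standard two-point Gaussian-smoothing estimator; it is a generic ZOO sketch, not the paper's exact estimator:

```python
import numpy as np

def zoo_grad(f, x, mu=1e-4, n_dirs=500, rng=None):
    """Two-point zeroth-order gradient estimate of f at x, averaged over
    random Gaussian directions u: ((f(x+mu*u) - f(x-mu*u)) / (2*mu)) * u.
    Only function evaluations are needed, which is why ZOO-based VFL
    clients can update without exposing gradient information.
    Generic illustrative sketch, not the paper's estimator."""
    rng = rng or np.random.default_rng(0)
    g = np.zeros_like(x)
    for _ in range(n_dirs):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / n_dirs

f = lambda x: float(x @ x)      # true gradient is 2x
x = np.array([1.0, -2.0])
g = zoo_grad(f, x)              # approximates [2.0, -4.0]
```

The estimator's variance grows with dimension, which is the intuition behind the paper's point that a ZOO-trained server model would bottleneck convergence, and why the server is instead updated with first-order optimization.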
Machine Learning. Pub Date: 2024-06-27. DOI: 10.1007/s10994-024-06567-2
Arthur Hoarau, Vincent Lemaire, Yolande Le Gall, Jean-Christophe Dubois, Arnaud Martin
{"title":"Evidential uncertainty sampling strategies for active learning","authors":"Arthur Hoarau, Vincent Lemaire, Yolande Le Gall, Jean-Christophe Dubois, Arnaud Martin","doi":"10.1007/s10994-024-06567-2","DOIUrl":"https://doi.org/10.1007/s10994-024-06567-2","url":null,"abstract":"<p>Recent studies in active learning, particularly in uncertainty sampling, have focused on the decomposition of model uncertainty into reducible and irreducible uncertainties. In this paper, the aim is to simplify the computational process while eliminating the dependence on observations. Crucially, the inherent uncertainty in the labels is considered, i.e. the uncertainty of the oracles. Two strategies are proposed, sampling by Klir uncertainty, which tackles the exploration–exploitation dilemma, and sampling by evidential epistemic uncertainty, which extends the concept of reducible uncertainty within the evidential framework, both using the theory of belief functions. Experimental results in active learning demonstrate that our proposed method can outperform uncertainty sampling.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"26 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine Learning. Pub Date: 2024-06-27. DOI: 10.1007/s10994-024-06573-4
Gabor Paczolay, Matteo Papini, Alberto Maria Metelli, Istvan Harmati, Marcello Restelli
{"title":"Sample complexity of variance-reduced policy gradient: weaker assumptions and lower bounds","authors":"Gabor Paczolay, Matteo Papini, Alberto Maria Metelli, Istvan Harmati, Marcello Restelli","doi":"10.1007/s10994-024-06573-4","DOIUrl":"https://doi.org/10.1007/s10994-024-06573-4","url":null,"abstract":"<p>Several variance-reduced versions of REINFORCE based on importance sampling achieve an improved <span>(O(epsilon ^{-3}))</span> sample complexity to find an <span>(epsilon)</span>-stationary point, under an unrealistic assumption on the variance of the importance weights. In this paper, we propose the Defensive Policy Gradient (DEF-PG) algorithm, based on defensive importance sampling, achieving the same result without any assumption on the variance of the importance weights. We also show that this is not improvable by establishing a matching <span>(Omega (epsilon ^{-3}))</span> lower bound, and that REINFORCE with its <span>(O(epsilon ^{-4}))</span> sample complexity is actually optimal under weaker assumptions on the policy class. Numerical simulations show promising results for the proposed technique compared to similar algorithms based on vanilla importance sampling.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"24 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine Learning. Pub Date: 2024-06-25. DOI: 10.1007/s10994-024-06578-z
Andrea Basteri, Dario Trevisan
{"title":"Quantitative Gaussian approximation of randomly initialized deep neural networks","authors":"Andrea Basteri, Dario Trevisan","doi":"10.1007/s10994-024-06578-z","DOIUrl":"https://doi.org/10.1007/s10994-024-06578-z","url":null,"abstract":"<p>Given any deep fully connected neural network, initialized with random Gaussian parameters, we bound from above the quadratic Wasserstein distance between its output distribution and a suitable Gaussian process. Our explicit inequalities indicate how the hidden and output layers sizes affect the Gaussian behaviour of the network and quantitatively recover the distributional convergence results in the wide limit, i.e., if all the hidden layers sizes become large.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"33 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141532684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine Learning. Pub Date: 2024-06-25. DOI: 10.1007/s10994-024-06579-y
Manuel Dileo, Matteo Zignani
{"title":"Discrete-time graph neural networks for transaction prediction in Web3 social platforms","authors":"Manuel Dileo, Matteo Zignani","doi":"10.1007/s10994-024-06579-y","DOIUrl":"https://doi.org/10.1007/s10994-024-06579-y","url":null,"abstract":"<p>In Web3 social platforms, i.e. social web applications that rely on blockchain technology to support their functionalities, interactions among users are usually multimodal, from common social interactions such as following, liking, or posting, to specific relations given by crypto-token transfers facilitated by the blockchain. In this dynamic and intertwined networked context, modeled as a financial network, our main goals are (i) to predict whether a pair of users will be involved in a financial transaction, i.e. the <i>transaction prediction task</i>, even using textual information produced by users, and (ii) to verify whether performances may be enhanced by textual content. To address the above issues, we compared current snapshot-based temporal graph learning methods and developed T3GNN, a solution based on state-of-the-art temporal graph neural networks’ design, which integrates fine-tuned sentence embeddings and a simple yet effective graph-augmentation strategy for representing content, and historical negative sampling. We evaluated models in a Web3 context by leveraging a novel high-resolution temporal dataset, collected from one of the most used Web3 social platforms, which spans more than one year of financial interactions as well as published textual content. The experimental evaluation has shown that T3GNN consistently achieved the best performance over time and for most of the snapshots. 
Furthermore, through an extensive analysis of the performance of our model, we show that, despite the graph structure being crucial for making predictions, textual content contains useful information for forecasting transactions, highlighting an interplay between users’ interests and economic relationships in Web3 platforms. Finally, the evaluation has also highlighted the importance of adopting sampling methods alternative to random negative sampling when dealing with prediction tasks on temporal networks.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"345 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
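One alternative to random negative sampling on temporal networks, the historical negative sampling the abstract mentions, draws negatives from pairs that interacted in earlier snapshots but not the current one. The sketch below shows the set logic only and is an assumption about the general technique, not the paper's exact sampler:

```python
def historical_negatives(past_edges, current_edges, k):
    """Historical negative sampling: candidate negatives are pairs that
    interacted in earlier snapshots but not in the current one. Such
    negatives are harder than uniformly random pairs, since the model
    cannot dismiss them on structure alone. Deterministic top-k pick
    here for reproducibility; a real sampler would draw at random."""
    candidates = sorted(past_edges - current_edges)
    return candidates[:k]

past = {(1, 2), (2, 3), (3, 4)}   # edges seen in earlier snapshots
cur = {(2, 3)}                    # edges in the current snapshot
negs = historical_negatives(past, cur, k=2)  # [(1, 2), (3, 4)]
```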
Machine Learning. Pub Date: 2024-06-21. DOI: 10.1007/s10994-024-06572-5
Yunting Zhang, Shang Li, Lin Ye, Hongli Zhang, Zhe Chen, Binxing Fang
{"title":"Kalt: generating adversarial explainable chinese legal texts","authors":"Yunting Zhang, Shang Li, Lin Ye, Hongli Zhang, Zhe Chen, Binxing Fang","doi":"10.1007/s10994-024-06572-5","DOIUrl":"https://doi.org/10.1007/s10994-024-06572-5","url":null,"abstract":"<p>Deep neural networks (DNNs) are vulnerable to adversarial examples (AEs), which are well-designed input samples with imperceptible perturbations. Existing methods generate AEs to evaluate the robustness of DNN-based natural language processing models. However, the AE attack performance significantly degrades in some verticals, such as law, due to overlooking essential domain knowledge. To generate explainable Chinese legal adversarial texts, we introduce legal knowledge and propose a novel black-box approach, knowledge-aware law tricker (KALT), in the framework of adversarial text generation based on word importance. Firstly, we invent a legal knowledge extraction method based on KeyBERT. The knowledge contains unique features from each category and shared features among different categories. Additionally, we design two perturbation strategies, Strengthen Similar Label and Weaken Original Label, to selectively perturb the two types of features, which can significantly reduce the classification accuracy of the target model. These two perturbation strategies can be regarded as components, which can be conveniently integrated into any perturbation method to enhance attack performance. Furthermore, we propose a strong hybrid perturbation method to introduce perturbation into the original texts. The perturbation method combines seven representative perturbation methods for Chinese. Finally, we design a formula to calculate interpretability scores, quantifying the interpretability of adversarial text generation methods. 
Experimental results demonstrate that KALT can effectively generate explainable Chinese legal adversarial texts that can be misclassified with high confidence and achieve excellent attack performance against the powerful Chinese BERT.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"53 32 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
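The "word importance" framework the abstract builds on typically ranks words by how much deleting each one lowers the target model's confidence in the original label. A minimal deletion-based sketch follows; `score_fn` and the toy scorer are hypothetical stand-ins, not KALT's target model:

```python
def word_importance(words, score_fn):
    """Rank words by the confidence drop caused by deleting each one --
    the word-importance step underlying many black-box adversarial text
    attacks, including the framework KALT extends with legal knowledge.
    `score_fn` is any callable returning the model's confidence for a
    token list (a hypothetical interface for this sketch)."""
    base = score_fn(words)
    drops = []
    for i in range(len(words)):
        reduced = words[:i] + words[i + 1:]
        drops.append((base - score_fn(reduced), words[i]))
    return sorted(drops, reverse=True)

# toy scorer: confidence collapses only when "contract" is removed
score = lambda ws: 0.9 if "contract" in ws else 0.4
ranked = word_importance(["the", "contract", "is", "void"], score)
top_word = ranked[0][1]  # "contract" is ranked most important
```

The attack then perturbs the highest-ranked words first, which is where KALT's two knowledge-driven strategies decide how each feature type is perturbed.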
Machine Learning. Pub Date: 2024-06-19. DOI: 10.1007/s10994-024-06549-4
Ofir Moshe, Gil Fidel, Ron Bitton, Asaf Shabtai
{"title":"Improving interpretability via regularization of neural activation sensitivity","authors":"Ofir Moshe, Gil Fidel, Ron Bitton, Asaf Shabtai","doi":"10.1007/s10994-024-06549-4","DOIUrl":"https://doi.org/10.1007/s10994-024-06549-4","url":null,"abstract":"<p>State-of-the-art deep neural networks (DNNs) are highly effective at tackling many real-world tasks. However, their widespread adoption in mission-critical contexts is limited due to two major weaknesses - their susceptibility to adversarial attacks and their opaqueness. The former raises concerns about DNNs’ security and generalization in real-world conditions, while the latter, opaqueness, directly impacts interpretability. The lack of interpretability diminishes user trust as it is challenging to have confidence in a model’s decision when its reasoning is not aligned with human perspectives. In this research, we (1) examine the effect of adversarial robustness on interpretability, and (2) present a novel approach for improving DNNs’ interpretability that is based on the regularization of neural activation sensitivity. We evaluate the interpretability of models trained using our method to that of standard models and models trained using state-of-the-art adversarial robustness techniques. 
Our results show that adversarially robust models are superior to standard models, and that models trained using our proposed method are even better than adversarially robust models in terms of interpretability.(Code provided in supplementary material.)</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"32 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141525360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
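A penalty on activation sensitivity can be pictured as the norm of the Jacobian of the activations with respect to the input, added to the training loss. The finite-difference sketch below illustrates the quantity being regularized; it is an assumption about the general form of such a penalty (the authors' implementation would use autograd, and their exact regularizer may differ):

```python
import numpy as np

def sensitivity_penalty(act_fn, x, eps=1e-4):
    """Finite-difference estimate of the squared Frobenius norm of the
    Jacobian of a network's activations w.r.t. its input. Adding a
    term of this shape to the loss penalizes activations that react
    sharply to tiny input changes -- the notion of "neural activation
    sensitivity" the paper regularizes. Illustrative sketch only."""
    a0 = act_fn(x)
    total = 0.0
    for i in range(x.size):
        xp = x.copy()
        xp[i] += eps
        total += np.sum(((act_fn(xp) - a0) / eps) ** 2)
    return total

# for a linear map the penalty equals ||W||_F^2 exactly
W = np.array([[1.0, 2.0], [0.0, 1.0]])
act = lambda x: W @ x
pen = sensitivity_penalty(act, np.zeros(2))  # 1 + 4 + 0 + 1 = 6.0
```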
Machine Learning. Pub Date: 2024-06-19. DOI: 10.1007/s10994-024-06569-0
Marco Markwald, Elena Demidova
{"title":"REFUEL: rule extraction for imbalanced neural node classification","authors":"Marco Markwald, Elena Demidova","doi":"10.1007/s10994-024-06569-0","DOIUrl":"https://doi.org/10.1007/s10994-024-06569-0","url":null,"abstract":"<p>Imbalanced graph node classification is a highly relevant and challenging problem in many real-world applications. The inherent data scarcity, a central characteristic of this task, substantially limits the performance of neural classification models driven solely by data. Given the limited instances of relevant nodes and complex graph structures, current methods fail to capture the distinct characteristics of node attributes and graph patterns within the underrepresented classes. In this article, we propose REFUEL—a novel approach for highly imbalanced node classification problems in graphs. Whereas symbolic and neural methods have complementary strengths and weaknesses when applied to such problems, REFUEL combines the power of symbolic and neural learning in a novel neural rule-extraction architecture. REFUEL captures the class semantics in the automatically extracted rule vectors. Then, REFUEL augments the graph nodes with the extracted rules vectors and adopts a Graph Attention Network-based neural node embedding, enhancing the downstream neural node representation. Our evaluation confirms the effectiveness of the proposed REFUEL approach for three real-world datasets with different minority class sizes. 
REFUEL achieves at least a 4% point improvement in precision on the minority classes of 1.5–2% compared to the baselines.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"85 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141525357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
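The augmentation step, attaching rule information to node features before the GAT embedding, can be sketched as concatenating a binary rule-satisfaction vector onto each feature vector. The predicates below are hypothetical examples standing in for REFUEL's automatically extracted rules:

```python
import numpy as np

def augment_with_rules(features, rules):
    """Append to each node's feature vector a binary vector recording
    which rules the node satisfies -- a simplified view of the
    augmentation REFUEL performs before its GAT-based embedding.
    `rules` are hypothetical predicates over the raw feature vector."""
    rule_vec = np.array([[1.0 if r(x) else 0.0 for r in rules]
                         for x in features])
    return np.hstack([features, rule_vec])

X = np.array([[0.2, 5.0],
              [0.9, 1.0]])
rules = [lambda x: x[0] > 0.5,   # hypothetical extracted rule 1
         lambda x: x[1] > 2.0]   # hypothetical extracted rule 2
Xa = augment_with_rules(X, rules)  # shape (2, 4); rule bits appended
```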
Machine Learning. Pub Date: 2024-06-19. DOI: 10.1007/s10994-024-06566-3
Hanrui Wu, Yanxin Wu, Nuosi Li, Min Yang, Jia Zhang, Michael K. Ng, Jinyi Long
{"title":"High-order proximity and relation analysis for cross-network heterogeneous node classification","authors":"Hanrui Wu, Yanxin Wu, Nuosi Li, Min Yang, Jia Zhang, Michael K. Ng, Jinyi Long","doi":"10.1007/s10994-024-06566-3","DOIUrl":"https://doi.org/10.1007/s10994-024-06566-3","url":null,"abstract":"<p>Cross-network node classification aims to leverage the labeled nodes from a source network to assist the learning in a target network. Existing approaches work mainly in homogeneous settings, i.e., the nodes of the source and target networks are characterized by the same features. However, in many practical applications, nodes from different networks usually have heterogeneous features. To handle this issue, in this paper, we study the cross-network node classification under heterogeneous settings, i.e., cross-network heterogeneous node classification. Specifically, we propose a new model called High-order Proximity and Relation Analysis, which studies the high-order proximity in each network and the high-order relation between nodes across the networks to obtain two kinds of features. Subsequently, these features are exploited to learn the final effective representations by introducing a feature matching mechanism and an adversarial domain adaptation. We perform extensive experiments on several real-world datasets and make comparisons with existing baseline methods. 
Experimental results demonstrate the effectiveness of the proposed model.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"7 1","pages":""},"PeriodicalIF":7.5,"publicationDate":"2024-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141525358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
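"High-order proximity" commonly refers to similarity carried by multi-hop paths, captured by powers of the (normalized) adjacency matrix. The sketch below shows this general notion, under the assumption that a simple sum of transition-matrix powers suffices for illustration; it is not the paper's exact proximity measure:

```python
import numpy as np

def high_order_proximity(A, k=2):
    """Sum of the first k powers of the row-normalized adjacency
    matrix: entry (i, j) aggregates walks of length 1..k from i to j,
    so nodes with no direct edge can still be highly proximate.
    A generic sketch of high-order proximity, not the paper's measure."""
    P = A / np.maximum(A.sum(axis=1, keepdims=True), 1e-12)
    S, Pk = np.zeros_like(P), np.eye(len(A))
    for _ in range(k):
        Pk = Pk @ P
        S += Pk
    return S

# path graph 0 -- 1 -- 2: nodes 0 and 2 share no edge...
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
S = high_order_proximity(A, k=2)
# ...yet they acquire positive 2-hop proximity through node 1
```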