{"title":"A Parallel Model for Jointly Extracting Entities and Relations","authors":"Zuqin Chen, Yujie Zheng, Jike Ge, Wencheng Yu, Zining Wang","doi":"10.1007/s11063-024-11616-x","DOIUrl":"https://doi.org/10.1007/s11063-024-11616-x","url":null,"abstract":"<p>Extracting relational triples from a piece of text is an essential task in knowledge graph construction. However, most existing methods either identify entities before predicting their relations, or detect relations before recognizing associated entities. This order may lead to error accumulation because once there is an error in the initial step, it will accumulate to subsequent steps. To solve this problem, we propose a parallel model for jointly extracting entities and relations, called PRE-Span, which consists of two mutually independent submodules. Specifically, candidate entities and relations are first generated by enumerating token sequences in sentences. Then, two independent submodules (Entity Extraction Module and Relation Detection Module) are designed to predict entities and relations. Finally, the predicted results of the two submodules are analyzed to select entities and relations, which are jointly decoded to obtain relational triples. The advantage of this method is that all triples can be extracted in just one step. Extensive experiments on the WebNLG*, NYT*, NYT and WebNLG datasets show that our model outperforms other baselines at 94.4%, 88.3%, 86.5% and 83.0%, respectively.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"17 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140887757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-linear Time Series Prediction using Improved CEEMDAN, SVD and LSTM","authors":"Sameer Poongadan, M. C. Lineesh","doi":"10.1007/s11063-024-11622-z","DOIUrl":"https://doi.org/10.1007/s11063-024-11622-z","url":null,"abstract":"<p>This study recommends a new time series forecasting model, namely ICEEMDAN - SVD - LSTM model, which coalesces Improved Complete Ensemble EMD with Adaptive Noise, Singular Value Decomposition and Long Short Term Memory network. It can be applied to analyse Non-linear and non-stationary data. The framework of this model is comprised of three levels, namely ICEEMDAN level, SVD level and LSTM level. The first level utilized ICEEMDAN to break up the series into some IMF components along with a residue. The SVD in the second level accounts for de-noising of every IMF component and residue. LSTM forecasts all the resultant IMF components and residue in third level. To obtain the forecasted values of the original data, the predictions of all IMF components and residue are added. The proposed model is contrasted with other extant ones, namely LSTM model, EMD - LSTM model, EEMD - LSTM model, CEEMDAN - LSTM model, EEMD - SVD - LSTM model, ICEEMDAN - LSTM model and CEEMDAN - SVD - LSTM model. The comparison bears witness to the potential of the recommended model over the traditional models.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"18 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140887562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dual Classifier Adaptation: Source-Free UDA via Adaptive Pseudo-Labels Learning","authors":"Yunyun Wang, Qinghao Li, Ziyi Hua","doi":"10.1007/s11063-024-11627-8","DOIUrl":"https://doi.org/10.1007/s11063-024-11627-8","url":null,"abstract":"<p>Different from Unsupervised Domain Adaptation (UDA), Source-Free Unsupervised Domain Adaptation (SFUDA) transfers source knowledge to target domain without accessing the source data, using only the source model, has attracted much attention recently. One mainstream SFUDA method fine-tunes the source model by self-training to generate pseudo-labels of the target data. However, due to the significant differences between different domains, these target pseudo-labels often contain some noise, and it will inevitably degenerates the target performance. For this purpose, we propose an innovative SFUDA method with adaptive pseudo-labels learning named Dual Classifier Adaptation (DCA). In DCA, a dual classifier structure is introduced to adaptively learn target pseudo-labels by cooperation between source and target classifiers. Simultaneously, the minimax entropy is introduced for target learning, in order to adapt target data to source model, while capture the intrinsic cluster structure in target domain as well. After compared our proposed method DCA with a range of UDA and SFUDA methods, DCA achieves far ahead performance on several benchmark datasets.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"13 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140887575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Dialogues Summarization Algorithm Based on Multi-task Learning","authors":"Haowei Chen, Chen Li, Jiajing Liang, Lihua Tian","doi":"10.1007/s11063-024-11619-8","DOIUrl":"https://doi.org/10.1007/s11063-024-11619-8","url":null,"abstract":"<p>With the continuous advancement of social information, the number of texts in the form of dialogue between individuals has exponentially increased. However, it is very challenging to review the previous dialogue content before initiating a new conversation. In view of the above background, a new dialogue summarization algorithm based on multi-task learning is first proposed in the paper. Specifically, Minimum Risk Training is used as the loss function to alleviate the problem of inconsistent goals between the training phase and the testing phase. Then, in order to deal with the problem that the model cannot effectively distinguish gender pronouns, a gender pronoun discrimination auxiliary task based on contrast learning is designed to help the model learn to distinguish different gender pronouns. Finally, an auxiliary task of reducing exposure bias is introduced, which involves incorporating the summary generated during inference into another round of training to reduce the difference between the decoder inputs during the training and testing stages. Experimental results show that our model outperforms strong baselines on three public dialogue summarization datasets: SAMSUM, DialogSum, and CSDS.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"57 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140833468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Pilot Study of Observation Poisoning on Selective Reincarnation in Multi-Agent Reinforcement Learning","authors":"Harsha Putla, Chanakya Patibandla, Krishna Pratap Singh, P Nagabhushan","doi":"10.1007/s11063-024-11625-w","DOIUrl":"https://doi.org/10.1007/s11063-024-11625-w","url":null,"abstract":"<p>This research explores the vulnerability of selective reincarnation, a concept in Multi-Agent Reinforcement Learning (MARL), in response to observation poisoning attacks. Observation poisoning is an adversarial strategy that subtly manipulates an agent’s observation space, potentially leading to a misdirection in its learning process. The primary aim of this paper is to systematically evaluate the robustness of selective reincarnation in MARL systems against the subtle yet potentially debilitating effects of observation poisoning attacks. Through assessing how manipulated observation data influences MARL agents, we seek to highlight potential vulnerabilities and inform the development of more resilient MARL systems. Our experimental testbed was the widely used HalfCheetah environment, utilizing the Independent Deep Deterministic Policy Gradient algorithm within a cooperative MARL setting. We introduced a series of triggers, namely Gaussian noise addition, observation reversal, random shuffling, and scaling, into the teacher dataset of the MARL system provided to the reincarnating agents of HalfCheetah. Here, the “teacher dataset” refers to the stored experiences from previous training sessions used to accelerate the learning of reincarnating agents in MARL. This approach enabled the observation of these triggers’ significant impact on reincarnation decisions. Specifically, the reversal technique showed the most pronounced negative effect for maximum returns, with an average decrease of 38.08% in Kendall’s tau values across all the agent combinations. With random shuffling, Kendall’s tau values decreased by 17.66%. On the other hand, noise addition and scaling aligned with the original ranking by only 21.42% and 32.66%, respectively. The results, quantified by Kendall’s tau metric, indicate the fragility of the selective reincarnation process under adversarial observation poisoning. Our findings also reveal that vulnerability to observation poisoning varies significantly among different agent combinations, with some exhibiting markedly higher susceptibility than others. This investigation elucidates our understanding of selective reincarnation’s robustness against observation poisoning attacks, which is crucial for developing more secure MARL systems and also for making informed decisions about agent reincarnation.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"308 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140887758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Disentangled Variational Autoencoder for Social Recommendation","authors":"Yongshuai Zhang, Jiajin Huang, Jian Yang","doi":"10.1007/s11063-024-11607-y","DOIUrl":"https://doi.org/10.1007/s11063-024-11607-y","url":null,"abstract":"<p>Social recommendation aims to improve the recommendation performance by learning user interest and social representations from users’ interaction records and social relations. Intuitively, these learned representations entangle user interest factors with social factors because users’ interaction behaviors and social relations affect each other. A high-quality recommender system should provide items to a user according to his/her interest factors. However, most existing social recommendation models aggregate the two kinds of representations indiscriminately, and this kind of aggregation limits their recommendation performance. In this paper, we develop a model called <b>D</b>isentangled <b>V</b>ariational autoencoder for <b>S</b>ocial <b>R</b>ecommendation (DVSR) to disentangle interest and social factors from the two kinds of user representations. Firstly, we perform a preliminary analysis of the entangled information on three popular social recommendation datasets. Then, we present the model architecture of DVSR, which is based on the Variational AutoEncoder (VAE) framework. Besides the traditional method of training VAE, we also use contrastive estimation to penalize the mutual information between interest and social factors. Extensive experiments are conducted on three benchmark datasets to evaluate the effectiveness of our model.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"18 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140833239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Class-Balanced Regularization for Long-Tailed Recognition","authors":"Yuge Xu, Chuanlong Lyu","doi":"10.1007/s11063-024-11624-x","DOIUrl":"https://doi.org/10.1007/s11063-024-11624-x","url":null,"abstract":"<p>Long-tailed recognition performs poorly on minority classes. The extremely imbalanced distribution of classifier weight norms leads to a decision boundary biased toward majority classes. To address this issue, we propose Class-Balanced Regularization to balance the distribution of classifier weight norms so that the model can make more balanced and reasonable classification decisions. In detail, CBR separately adjusts the regularization factors based on L2 regularization to be correlated with the class sample frequency positively, rather than using a fixed regularization factor. CBR trains balanced classifiers by increasing the L2 norm penalty for majority classes and reducing the penalty for minority classes. Since CBR is mainly used for classification adjustment instead of feature extraction, we adopt a two-stage training algorithm. In the first stage, the network with the traditional empirical risk minimization is trained, and in the second stage, CBR for classifier adjustment is applied. To validate the effectiveness of CBR, we perform extensive experiments on CIFAR10-LT, CIFAR100-LT, and ImageNet-LT datasets. The results demonstrate that CBR significantly improves performance by effectively balancing the distribution of classifier weight norms.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"94 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140833440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Target Oriented Dynamic Adaption for Cross-Domain Few-Shot Learning","authors":"Xinyi Chang, Chunyu Du, Xinjing Song, Weifeng Liu, Yanjiang Wang","doi":"10.1007/s11063-024-11508-0","DOIUrl":"https://doi.org/10.1007/s11063-024-11508-0","url":null,"abstract":"<p>Few-shot learning has achieved satisfactory progress over the years, but these methods implicitly hypothesize that the data in the base (source) classes and novel (target) classes are sampled from the same data distribution (domain), which is often invalid in reality. The purpose of cross-domain few-shot learning (CD-FSL) is to successfully identify novel target classes with a small quantity of labeled instances on the target domain under the circumstance of domain shift between the source domain and the target domain. However, in CD-FSL, the knowledge learned by the network on the source domain often suffers from the situation of inadaptation when it is transferred to the target domain, since the instances on the source and target domains do not obey the same data distribution. To surmount this problem, we propose a Target Oriented Dynamic Adaption (TODA) model, which uses a tiny amount of target data to orient the network to dynamically adjust and adapt during training. Specifically, this work proposes a domain-specific adapter to ameliorate the network inadaptability issues in transfer to the target domain. The domain-specific adapter can make the extracted features more specific to the tasks in the target domain and reduce the impact of tasks in the source domain by combining them with the mainstream backbone network. In addition, we propose an adaptive optimization method in the network optimization process, which assigns different weights according to the importance of different optimization tasks. Extensive experiments on several benchmark datasets demonstrate the effectiveness of our TODA method.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"20 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140809329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rethinking Regularization with Random Label Smoothing","authors":"Claudio Filipi Gonçalves dos Santos, João Paulo Papa","doi":"10.1007/s11063-024-11579-z","DOIUrl":"https://doi.org/10.1007/s11063-024-11579-z","url":null,"abstract":"<p>Regularization helps to improve machine learning techniques by penalizing the models during training. Such approaches act in either the input, internal, or output layers. Regarding the latter, label smoothing is widely used to introduce noise in the label vector, making learning more challenging. This work proposes a new label regularization method, Random Label Smoothing, that attributes random values to the labels while preserving their semantics during training. The idea is to change the entire label into fixed arbitrary values. Results show improvements in image classification and super-resolution tasks, outperforming state-of-the-art techniques for such purposes.\u0000</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"18 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140809350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Geometric Theory for Binary Classification of Finite Datasets by DNNs with Relu Activations","authors":"Xiao-Song Yang","doi":"10.1007/s11063-024-11612-1","DOIUrl":"https://doi.org/10.1007/s11063-024-11612-1","url":null,"abstract":"<p>In this paper we investigate deep neural networks for binary classification of datasets from geometric perspective in order to understand the working mechanism of deep neural networks. First, we establish a geometrical result on injectivity of finite set under a projection from Euclidean space to the real line. Then by introducing notions of alternative points and alternative number, we propose an approach to design DNNs for binary classification of finite labeled points on the real line, thus proving existence of binary classification neural net with its hidden layers of width two and the number of hidden layers not larger than the cardinality of the finite labelled set. We also demonstrate geometrically how the dataset is transformed across every hidden layers in a narrow DNN setting for binary classification task.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"12 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140809324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}