{"title":"Sub-One Quasi-Norm-Based k-Means Clustering Algorithm and Analyses","authors":"Qi An, Shan Jiang","doi":"10.1007/s11063-024-11615-y","DOIUrl":"https://doi.org/10.1007/s11063-024-11615-y","url":null,"abstract":"<p>Recognizing the pivotal role of choosing an appropriate distance metric in designing the clustering algorithm, our focus is on innovating the <i>k</i>-means method by redefining the distance metric in its distortion. In this study, we introduce a novel <i>k</i>-means clustering algorithm utilizing a distance metric derived from the <span>(ell _p)</span> quasi-norm with <span>(pin (0,1))</span>. Through an illustrative example, we showcase the advantageous properties of the proposed distance metric compared to commonly used alternatives for revealing natural groupings in data. Subsequently, we present a novel <i>k</i>-means type heuristic by integrating this sub-one quasi-norm-based distance, offer a step-by-step iterative relocation scheme, and prove the convergence to the Kuhn-Tucker point. Finally, we empirically validate the effectiveness of our clustering method through experiments on synthetic and real-life datasets, both in their original form and with additional noise introduced. We also investigate the performance of the proposed method as a subroutine in a deep learning clustering algorithm. Our results demonstrate the efficacy of the proposed <i>k</i>-means algorithm in capturing distinctive patterns exhibited by certain data types.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"46 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140938931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Time Series Classification Based on Forward Echo State Convolution Network","authors":"Lei Xia, Jianfeng Tang, Guangli Li, Jun Fu, Shukai Duan, Lidan Wang","doi":"10.1007/s11063-024-11449-8","DOIUrl":"https://doi.org/10.1007/s11063-024-11449-8","url":null,"abstract":"<p>The Echo state network (ESN) is an efficient recurrent neural network that has achieved good results in time series prediction tasks. Still, its application in time series classification tasks has yet to develop fully. In this study, we work on the time series classification problem based on echo state networks. We propose a new framework called forward echo state convolutional network (FESCN). It consists of two parts, the encoder and the decoder, where the encoder part is composed of a forward topology echo state network (FT-ESN), and the decoder part mainly consists of a convolutional layer and a max-pooling layer. We apply the proposed network framework to the univariate time series dataset UCR and compare it with six traditional methods and four neural network models. The experimental findings demonstrate that FESCN outperforms other methods in terms of overall classification accuracy. Additionally, we investigated the impact of reservoir size on network performance and observed that the optimal classification results were obtained when the reservoir size was set to 32. Finally, we investigated the performance of the network under noise interference, and the results show that FESCN has a more stable network performance compared to EMN (echo memory network).</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"49 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140938985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Unified Asymmetric Knowledge Distillation Framework for Image Classification","authors":"Xin Ye, Xiang Tian, Bolun Zheng, Fan Zhou, Yaowu Chen","doi":"10.1007/s11063-024-11606-z","DOIUrl":"https://doi.org/10.1007/s11063-024-11606-z","url":null,"abstract":"<p>Knowledge distillation is a model compression technique that transfers knowledge learned by teacher networks to student networks. Existing knowledge distillation methods greatly expand the forms of knowledge, but also make the distillation models complex and symmetric. However, few studies have explored the commonalities among these methods. In this study, we propose a concise distillation framework to unify these methods and a method to construct asymmetric knowledge distillation under the framework. Asymmetric distillation aims to enable differentiated knowledge transfers for different distillation objects. We designed a multi-stage shallow-wide branch bifurcation method to distill different knowledge representations and a grouping ensemble strategy to supervise the network to teach and learn selectively. Consequently, we conducted experiments using image classification benchmarks to verify the proposed method. Experimental results show that our implementation can achieve considerable improvements over existing methods, demonstrating the effectiveness of the method and the potential of the framework.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"21 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140942261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pinning Group Consensus of Multi-agent Systems Under DoS Attacks","authors":"Qian Lang, Jing Xu, Huiwen Zhang, Zhengxin Wang","doi":"10.1007/s11063-024-11630-z","DOIUrl":"https://doi.org/10.1007/s11063-024-11630-z","url":null,"abstract":"<p>In this paper, group consensus is investigated for a class of nonlinear multi-agent systems suffered from the DoS attacks. Firstly, a first-order nonlinear multi-agent system is constructed, which is divided into <i>M</i> subsystems and each subsystem has an unique leader. Then a protocol is proposed and a Lyapunov function candidate is chosen. By means of the stability theory, a sufficient criterion, which involves the duration of DoS attacks, coupling strength and control gain, is obtained for achieving group consensus in first-order system. That is, the nodes in each subsystem can track the leader of that group. Furthermore, the result is extended to nonlinear second-order multi-agent systems and the controller is also improved to obtain sufficient conditions for group consensus. Additionally, the lower bounds of the coupling strength and average interval of DoS attacks can be determined from the obtained sufficient conditions. Finally, several numerical simulations are presented to explain the effectiveness of the proposed controllers and the derived theoretical results.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"27 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140938927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Use of a Modified Threshold Function in Fuzzy Cognitive Maps for Improved Failure Mode Identification","authors":"Manu Augustine, Om Prakash Yadav, Ashish Nayyar, Dheeraj Joshi","doi":"10.1007/s11063-024-11623-y","DOIUrl":"https://doi.org/10.1007/s11063-024-11623-y","url":null,"abstract":"<p>Fuzzy cognitive maps (FCMs) provide a rapid and efficient approach for system modeling and simulation. The literature demonstrates numerous successful applications of FCMs in identifying failure modes. The standard process of failure mode identification using FCMs involves monitoring crucial concept/node values for excesses. Threshold functions are used to limit the value of nodes within a pre-specified range, which is usually [0, 1] or [-1, + 1]. However, traditional FCMs using the <i>tanh</i> threshold function possess two crucial drawbacks for this particular.Purpose(i) a tendency to reduce the values of state vector components, and (ii) the potential inability to reach a limit state with clearly identifiable failure states. The reason for this is the inherent mathematical nature of the <i>tanh</i> function in being asymptotic to the horizontal line demarcating the edge of the specified range. To overcome these limitations, this paper introduces a novel modified <i>tanh</i> threshold function that effectively addresses both issues.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"25 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140938983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised Domain Adaptation Depth Estimation Based on Self-attention Mechanism and Edge Consistency Constraints","authors":"Peng Guo, Shuguo Pan, Peng Hu, Ling Pei, Baoguo Yu","doi":"10.1007/s11063-024-11621-0","DOIUrl":"https://doi.org/10.1007/s11063-024-11621-0","url":null,"abstract":"<p>In the unsupervised domain adaptation (UDA) (Akada et al. Self-supervised learning of domain invariant features for depth estimation, in: 2022 IEEE/CVF winter conference on applications of computer vision (WACV), pp 3377–3387 (2022). 10.1109/WACV51458.2022.00107) depth estimation task, a new adaptive approach is to use the bidirectional transformation network to transfer the style between the target and source domain inputs, and then train the depth estimation network in their respective domains. However, the domain adaptation process and the style transfer may result in defects and biases, often leading to depth holes and instance edge depth missing in the target domain’s depth output. To address these issues, We propose a training network that has been improved in terms of model structure and supervision constraints. First, we introduce a edge-guided self-attention mechanism in the task network of each domain to enhance the network’s attention to high-frequency edge features, maintain clear boundaries and fill in missing areas of depth. Furthermore, we utilize an edge detection algorithm to extract edge features from the input of the target domain. Then we establish edge consistency constraints between inter-domain entities in order to narrow the gap between domains and make domain-to-domain transfers easier. Our experimental demonstrate that our proposed method effectively solve the aforementioned problem, resulting in a higher quality depth map and outperforming existing state-of-the-art methods.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"2 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140938928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Prototype-Based Neural Network for Image Anomaly Detection and Localization","authors":"Chao Huang, Zhao Kang, Hong Wu","doi":"10.1007/s11063-024-11466-7","DOIUrl":"https://doi.org/10.1007/s11063-024-11466-7","url":null,"abstract":"<p>Image anomaly detection and localization perform not only image-level anomaly classification but also locate pixel-level anomaly regions. Recently, it has received much research attention due to its wide application in various fields. This paper proposes ProtoAD, a prototype-based neural network for image anomaly detection and localization. First, the patch features of normal images are extracted by a deep network pre-trained on nature images. Then, the prototypes of the normal patch features are learned by non-parametric clustering. Finally, we construct an image anomaly localization network (ProtoAD) by appending the feature extraction network with <i>L</i>2 feature normalization, a <span>(1times 1)</span> convolutional layer, a channel max-pooling, and a subtraction operation. We use the prototypes as the kernels of the <span>(1times 1)</span> convolutional layer; therefore, our neural network does not need a training phase and can conduct anomaly detection and localization in an end-to-end manner. Extensive experiments on two challenging industrial anomaly detection datasets, MVTec AD and BTAD, demonstrate that ProtoAD achieves competitive performance compared to the state-of-the-art methods with a higher inference speed. The code and pre-trained models are publicly available at https://github.com/98chao/ProtoAD.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"45 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140938881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WaveVC: Speech and Fundamental Frequency Consistent Raw Audio Voice Conversion","authors":"Kyungdeuk Ko, Donghyeon Kim, Kyungseok Oh, Hanseok Ko","doi":"10.1007/s11063-024-11613-0","DOIUrl":"https://doi.org/10.1007/s11063-024-11613-0","url":null,"abstract":"<p>Voice conversion (VC) is a task for changing the speech of a source speaker to the target voice while preserving linguistic information of the source speech. The existing VC methods typically use mel-spectrogram as both input and output, so a separate vocoder is required to transform mel-spectrogram into waveform. Therefore, the VC performance varies depending on the vocoder performance, and noisy speech can be generated due to problems such as train-test mismatch. In this paper, we propose a speech and fundamental frequency consistent raw audio voice conversion method called WaveVC. Unlike other methods, WaveVC does not require a separate vocoder and can perform VC directly on raw audio waveform using 1D convolution. This eliminates the issue of performance degradation caused by the train-test mismatch of the vocoder. In the training phase, WaveVC employs speech loss and F0 loss to preserve the content of the source speech and generate F0 consistent speech using the pre-trained networks. WaveVC is capable of converting voices while maintaining consistency in speech and fundamental frequency. In the test phase, the F0 feature of the source speech is concatenated with a content embedding vector to ensure the converted speech follows the fundamental frequency flow of the source speech. WaveVC achieves higher performances than baseline methods in both many-to-many VC and any-to-any VC. The converted samples are available online.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"37 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140887887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-view Self-supervised Learning and Multi-scale Feature Fusion for Automatic Speech Recognition","authors":"Jingyu Zhao, Ruwei Li, Maocun Tian, Weidong An","doi":"10.1007/s11063-024-11614-z","DOIUrl":"https://doi.org/10.1007/s11063-024-11614-z","url":null,"abstract":"<p>To address the challenges of the poor representation capability and low data utilization rate of end-to-end speech recognition models in deep learning, this study proposes an end-to-end speech recognition model based on multi-scale feature fusion and multi-view self-supervised learning (MM-ASR). It adopts a multi-task learning paradigm for training. The proposed method emphasizes the importance of inter-layer information within shared encoders, aiming to enhance the model’s characterization capability via the multi-scale feature fusion module. Moreover, we apply multi-view self-supervised learning to effectively exploit data information. Our approach is rigorously evaluated on the Aishell-1 dataset and further validated its effectiveness on the English corpus WSJ. The experimental results demonstrate a noteworthy 4.6<span>(%)</span> reduction in character error rate, indicating significantly improved speech recognition performance . These findings showcase the effectiveness and potential of our proposed MM-ASR model for end-to-end speech recognition tasks.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"29 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140942262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TLCE: Transfer-Learning Based Classifier Ensembles for Few-Shot Class-Incremental Learning","authors":"Shuangmei Wang, Yang Cao, Tieru Wu","doi":"10.1007/s11063-024-11605-0","DOIUrl":"https://doi.org/10.1007/s11063-024-11605-0","url":null,"abstract":"<p>Few-shot class-incremental learning (FSCIL) struggles to incrementally recognize novel classes from few examples without catastrophic forgetting of old classes or overfitting to new classes. We propose TLCE, which ensembles multiple pre-trained models to improve separation of novel and old classes. Specifically, we use episodic training to map images from old classes to quasi-orthogonal prototypes, which minimizes interference between old and new classes. Then, we incorporate the use of ensembling diverse pre-trained models to further tackle the challenge of data imbalance and enhance adaptation to novel classes. Extensive experiments on various datasets demonstrate that our transfer learning ensemble approach outperforms state-of-the-art FSCIL methods.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"12 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140887658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}