{"title":"Accurate Lung Nodule Segmentation With Detailed Representation Transfer and Soft Mask Supervision.","authors":"Changwei Wang, Rongtao Xu, Shibiao Xu, Weiliang Meng, Jun Xiao, Xiaopeng Zhang","doi":"10.1109/TNNLS.2023.3315271","DOIUrl":"10.1109/TNNLS.2023.3315271","url":null,"abstract":"<p><p>Accurate lung lesion segmentation from computed tomography (CT) images is crucial to the analysis and diagnosis of lung diseases, such as COVID-19 and lung cancer. However, the smallness and variety of lung nodules and the lack of high-quality labeling make the accurate lung nodule segmentation difficult. To address these issues, we first introduce a novel segmentation mask named \"soft mask\", which has richer and more accurate edge details description and better visualization, and develop a universal automatic soft mask annotation pipeline to deal with different datasets correspondingly. Then, a novel network with detailed representation transfer and soft mask supervision (DSNet) is proposed to process the input low-resolution images of lung nodules into high-quality segmentation results. Our DSNet contains a special detailed representation transfer module (DRTM) for reconstructing the detailed representation to alleviate the small size of lung nodules images and an adversarial training framework with soft mask for further improving the accuracy of segmentation. Extensive experiments validate that our DSNet outperforms other state-of-the-art methods for accurate lung nodule segmentation, and has strong generalization ability in other accurate medical segmentation tasks with competitive results. Besides, we provide a new challenging lung nodules segmentation dataset for further studies (https://drive.google.com/file/d/15NNkvDTb_0Ku0IoPsNMHezJR TH1Oi1wm/view?usp=sharing).</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41199549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AMS-Net: Modeling Adaptive Multi-Granularity Spatio-Temporal Cues for Video Action Recognition.","authors":"Qilong Wang, Qiyao Hu, Zilin Gao, Peihua Li, Qinghua Hu","doi":"10.1109/TNNLS.2023.3321141","DOIUrl":"10.1109/TNNLS.2023.3321141","url":null,"abstract":"<p><p>Effective spatio-temporal modeling as a core of video representation learning is challenged by complex scale variations in spatio-temporal cues in videos, especially different visual tempos of actions and varying spatial sizes of moving objects. Most of the existing works handle complex spatio-temporal scale variations based on input-level or feature-level pyramid mechanisms, which, however, rely on expensive multistream architectures or explore multiscale spatio-temporal features in a fixed manner. To effectively capture complex scale dynamics of spatio-temporal cues in an efficient way, this article proposes a single-stream architecture (SS-Arch.) with single-input namely, adaptive multi-granularity spatio-temporal network (AMS-Net) to model adaptive multi-granularity (Multi-Gran.) Spatio-temporal cues for video action recognition. To this end, our AMS-Net proposes two core components, namely, competitive progressive temporal modeling (CPTM) block and collaborative spatio-temporal pyramid (CSTP) module. They, respectively, capture fine-grained temporal cues and fuse coarse-level spatio-temporal features in an adaptive manner. It admits that AMS-Net can handle subtle variations in visual tempos and fair-sized spatio-temporal dynamics in a unified architecture. Note that our AMS-Net can be flexibly instantiated based on existing deep convolutional neural networks (CNNs) with the proposed CPTM block and CSTP module. The experiments are conducted on eight video benchmarks, and the results show our AMS-Net establishes state-of-the-art (SOTA) performance on fine-grained action recognition (i.e., Diving48 and FineGym), while performing very competitively on widely used Something-Something and Kinetics.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41199551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-Intrusive Speech Quality Assessment Based on Deep Neural Networks for Speech Communication.","authors":"Miao Liu, Jing Wang, Fei Wang, Fei Xiang, Jingdong Chen","doi":"10.1109/TNNLS.2023.3321076","DOIUrl":"10.1109/TNNLS.2023.3321076","url":null,"abstract":"<p><p>Traditionally, speech quality evaluation relies on subjective assessments or intrusive methods that require reference signals or additional equipment. However, over recent years, non-intrusive speech quality assessment has emerged as a promising alternative, capturing much attention from researchers and industry professionals. This article presents a deep learning-based method that exploits large-scale intrusive simulated data to improve the accuracy and generalization of non-intrusive methods. The major contributions of this article are as follows. First, it presents a data simulation method, which generates degraded speech signals and labels their speech quality with the perceptual objective listening quality assessment (POLQA). The generated data is proven to be useful for pretraining the deep learning models. Second, it proposes to apply an adversarial speaker classifier to reduce the impact of speaker-dependent information on speech quality evaluation. Third, an autoencoder-based deep learning scheme is proposed following the principle of representation learning and adversarial training (AT) methods, which is able to transfer the knowledge learned from a large amount of simulated speech data labeled by POLQA. With the help of discriminative representations extracted from the autoencoder, the prediction model can be trained well on a relatively small amount of speech data labeled through subjective listening tests. Fourth, an end-to-end speech quality evaluation neural network is developed, which takes magnitude and phase spectral features as its inputs. This phase-aware model is more accurate than the model using only the magnitude spectral features. A large number of experiments are carried out with three datasets: one simulated with labels obtained using POLQA and two recorded with labels obtained using subjective listening tests. The results show that the presented phase-aware method improves the performance of the baseline model and the proposed model with latent representations extracted from the adversarial autoencoder (AAE) outperforms the state-of-the-art objective quality assessment methods, reducing the root mean square error (RMSE) by 10.5% and 12.2% on the Beijing Institute of Technology (BIT) dataset and Tencent Corpus, respectively. The code and supplementary materials are available at https://github.com/liushenme/AAE-SQA.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41199559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synchronization of Time-Delay Coupled Neural Networks With Stabilizing Delayed Impulsive Control.","authors":"Lingzhong Zhang, Jianquan Lu, Fengyi Liu, Jungang Lou","doi":"10.1109/TNNLS.2023.3320651","DOIUrl":"10.1109/TNNLS.2023.3320651","url":null,"abstract":"<p><p>This brief studies the distributed synchronization of time-delay coupled neural networks (NNs) with impulsive pinning control involving stabilizing delays. A novel differential inequality is proposed, where the state's past information at impulsive time is effectively extracted and used to handle the synchronization of coupled NNs. Based on this inequality, the restriction that the size of impulsive delay is always limited by the system delay is removed, and the upper bound on the impulsive delay is relaxed, which is improved the existing related results. By using the methods of average impulsive interval (AII) and impulsive delay, some relaxed criteria for distributed synchronization of time-delay coupled NNs are obtained. The proposed synchronization conditions do not impose on the upper bound of two consecutive impulsive signals, and the lower bound is more flexible. Moreover, our results reveal that the impulsive delays may contribute to the synchronization of time-delay systems. Finally, typical networks are presented to illustrate the advantage of our delayed impulsive control method.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41199563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Lightweight Pixel-Level Unified Image Fusion Network.","authors":"Jinyang Liu, Shutao Li, Haibo Liu, Renwei Dian, Xiaohui Wei","doi":"10.1109/TNNLS.2023.3311820","DOIUrl":"10.1109/TNNLS.2023.3311820","url":null,"abstract":"<p><p>In recent years, deep-learning-based pixel-level unified image fusion methods have received more and more attention due to their practicality and robustness. However, they usually require a complex network to achieve more effective fusion, leading to high computational cost. To achieve more efficient and accurate image fusion, a lightweight pixel-level unified image fusion (L-PUIF) network is proposed. Specifically, the information refinement and measurement process are used to extract the gradient and intensity information and enhance the feature extraction capability of the network. In addition, these information are converted into weights to guide the loss function adaptively. Thus, more effective image fusion can be achieved while ensuring the lightweight of the network. Extensive experiments have been conducted on four public image fusion datasets across multimodal fusion, multifocus fusion, and multiexposure fusion. Experimental results show that L-PUIF can achieve better fusion efficiency and has a greater visual effect compared with state-of-the-art methods. In addition, the practicability of L-PUIF in high-level computer vision tasks, i.e., object detection and image segmentation, has been verified.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41199548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Merge Loss Calculation Method for Highly Imbalanced Data Multiclass Classification.","authors":"Zehua Du, Hao Zhang, Zhiqiang Wei, Yuanyuan Zhu, Jiali Xu, Xianqing Huang, Bo Yin","doi":"10.1109/TNNLS.2023.3321753","DOIUrl":"10.1109/TNNLS.2023.3321753","url":null,"abstract":"<p><p>In real classification scenarios, the number distribution of modeling samples is usually out of proportion. Most of the existing classification methods still face challenges in comprehensive model performance for imbalanced data. In this article, a novel theoretical framework is proposed that establishes a proportion coefficient independent of the number distribution of modeling samples and a general merge loss calculation method independent of class distribution. The loss calculation method of the imbalanced problem focuses on both the global and batch sample levels. Specifically, the loss function calculation introduces the true-positive rate (TPR) and the false-positive rate (FPR) to ensure the independence and balance of loss calculation for each class. Based on this, global and local loss weight coefficients are generated from the entire dataset and batch dataset for the multiclass classification problem, and a merge weight loss function is calculated after unifying the weight coefficient scale. Furthermore, the designed loss function is applied to different neural network models and datasets. The method shows better performance on imbalanced datasets than state-of-the-art methods.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41199556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiview Spectral Clustering Based on Consensus Neighbor Strategy.","authors":"Jiayi Tang, Yuping Lai, Xinwang Liu","doi":"10.1109/TNNLS.2023.3319823","DOIUrl":"10.1109/TNNLS.2023.3319823","url":null,"abstract":"<p><p>Multiview spectral clustering, renowned for its spatial learning capability, has garnered significant attention in the data mining field. However, existing methods assume that the optimal consensus adjacency matrix is confined within the space spanned by each view's adjacency matrix. This constraint restricts the feasible domain of the algorithm and hinders the exploration of the optimal consensus adjacency matrix. To address this limitation, we propose a novel and convex strategy, termed the consensus neighbor strategy, for learning the optimal consensus adjacency matrix. This approach constructs the optimal consensus adjacency matrix by capturing the consensus local structure of each sample across all views, thereby expanding the search space and facilitating the discovery of the optimal consensus adjacency matrix. Furthermore, we introduce the concept of a correlation measuring matrix to prevent trivial solution. We develop an efficient iterative algorithm to solve the resulting optimization problem, benefitting from the convex nature of our model, which ensures convergence to a global optimum. Experimental results on 16 multiview datasets demonstrate that our proposed algorithm surpasses state-of-the-art methods in terms of its robust consensus representation learning capability. The code of this article is uploaded to https://github.com/PhdJiayiTang/Consensus-Neighbor-Strategy.git.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41199558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gradient Correction for White-Box Adversarial Attacks.","authors":"Hongying Liu, Zhijin Ge, Zhenyu Zhou, Fanhua Shang, Yuanyuan Liu, Licheng Jiao","doi":"10.1109/TNNLS.2023.3315414","DOIUrl":"10.1109/TNNLS.2023.3315414","url":null,"abstract":"<p><p>Deep neural networks (DNNs) play key roles in various artificial intelligence applications such as image classification and object recognition. However, a growing number of studies have shown that there exist adversarial examples in DNNs, which are almost imperceptibly different from the original samples but can greatly change the output of DNNs. Recently, many white-box attack algorithms have been proposed, and most of the algorithms concentrate on how to make the best use of gradients per iteration to improve adversarial performance. In this article, we focus on the properties of the widely used activation function, rectified linear unit (ReLU), and find that there exist two phenomena (i.e., wrong blocking and over transmission) misguiding the calculation of gradients for ReLU during backpropagation. Both issues enlarge the difference between the predicted changes of the loss function from gradients and corresponding actual changes and misguide the optimized direction, which results in larger perturbations. Therefore, we propose a universal gradient correction adversarial example generation method, called ADV-ReLU, to enhance the performance of gradient-based white-box attack algorithms such as fast gradient signed method (FGSM), iterative FGSM (I-FGSM), momentum I-FGSM (MI-FGSM), and variance tuning MI-FGSM (VMI-FGSM). Through backpropagation, our approach calculates the gradient of the loss function with respect to the network input, maps the values to scores, and selects a part of them to update the misguided gradients. Comprehensive experimental results on ImageNet and CIFAR10 demonstrate that our ADV-ReLU can be easily integrated into many state-of-the-art gradient-based white-box attack algorithms, as well as transferred to black-box attacks, to further decrease perturbations measured in the l<sub>2</sub> -norm.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41199553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multistability and Robustness of Competitive Neural Networks With Time-Varying Delays.","authors":"Song Zhu, Jiahui Zhang, Xiaoyang Liu, Mouquan Shen, Shiping Wen, Chaoxu Mu","doi":"10.1109/TNNLS.2023.3321434","DOIUrl":"10.1109/TNNLS.2023.3321434","url":null,"abstract":"<p><p>This article is devoted to analyzing the multistability and robustness of competitive neural networks (NNs) with time-varying delays. Based on the geometrical structure of activation functions, some sufficient conditions are proposed to ascertain the coexistence of ∏<sub>i=1</sub><sup>n</sup>(2R<sub>i</sub>+1) equilibrium points, ∏<sub>i=1</sub><sup>n</sup>(R<sub>i</sub>+1) of them are locally exponentially stable, where n represents a dimension of system and R<sub>i</sub> is the parameter related to activation functions. The derived stability results not only involve exponential stability but also include power stability and logarithmical stability. In addition, the robustness of ∏<sub>i=1</sub><sup>n</sup>(R<sub>i</sub>+1) stable equilibrium points is discussed in the presence of perturbations. Compared with previous papers, the conclusions proposed in this article are easy to verify and enrich the existing stability theories of competitive NNs. Finally, numerical examples are provided to support theoretical results.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41199557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pixel-Centric Context Perception Network for Camouflaged Object Detection.","authors":"Ze Song, Xudong Kang, Xiaohui Wei, Shutao Li","doi":"10.1109/TNNLS.2023.3319323","DOIUrl":"10.1109/TNNLS.2023.3319323","url":null,"abstract":"<p><p>Camouflaged object detection (COD) aims to identify object pixels visually embedded in the background environment. Existing deep learning methods fail to utilize the context information around different pixels adequately and efficiently. In order to solve this problem, a novel pixel-centric context perception network (PCPNet) is proposed, the core of which is to customize the personalized context of each pixel based on the automatic estimation of its surroundings. Specifically, PCPNet first employs an elegant encoder equipped with the designed vital component generation (VCG) module to obtain a set of compact features rich in low-level spatial and high-level semantic information across multiple subspaces. Then, we present a parameter-free pixel importance estimation (PIE) function based on multiwindow information fusion. Object pixels with complex backgrounds will be assigned with higher PIE values. Subsequently, PIE is utilized to regularize the optimization loss. In this way, the network can pay more attention to those pixels with higher PIE values in the decoding stage. Finally, a local continuity refinement module (LCRM) is used to refine the detection results. Extensive experiments on four COD benchmarks, five salient object detection (SOD) benchmarks, and five polyp segmentation benchmarks demonstrate the superiority of PCPNet with respect to other state-of-the-art methods.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41199561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}