{"title":"A novel neural network architecture utilizing parametric-logarithmic-modulus-based activation function: Theory, algorithm, and applications","authors":"","doi":"10.1016/j.knosys.2024.112425","DOIUrl":"10.1016/j.knosys.2024.112425","url":null,"abstract":"<div><p>This paper introduces a novel parametric-logarithmic-modulus-based activation function (PLM-AF) designed to significantly enhance the nonlinear expression capabilities of high-dimensional spectroscopy data. A one-dimensional CNN-LSTM (1D-CNN-BiLSTM) model is subsequently developed to capture long-term dependencies within glucose Raman spectroscopy. To the best of our knowledge, this is the first work to simultaneously optimize the predictive performance of the model from the perspectives of both network architecture and activation functions. The effectiveness of the model is comprehensively evaluated against state-of-the-art methods using a public Raman spectroscopy dataset. Compared to the sub-optimal glucose prediction models, the proposed model improves the training root mean square error (RMSE) by 41.89%. The improved prediction accuracy demonstrates that the proposed regression model with the novel PLM-AF can significantly facilitate non-invasive glucose concentration prediction, thereby advancing the auxiliary diagnosis and healthcare industry.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142094757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Event-triggered fixed-time distributed observers for general linear systems","authors":"","doi":"10.1016/j.knosys.2024.112416","DOIUrl":"10.1016/j.knosys.2024.112416","url":null,"abstract":"<div><p>This paper studies a distributed fixed-time state estimation approach for linear systems based on an event-triggered strategy. In particular, the output data of the linear system are distributed across multiple observers. A novel fixed-time distributed observer is proposed to cooperatively reconstruct the full state of the linear system, with the observer error converging to zero within a fixed time. Under the event-triggered strategy, observers communicate only intermittently, which reduces the number of communications among them. Zeno behavior is also ruled out completely. Finally, the proposed results are verified by a simulation example.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142076509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing performance of transformer-based models in natural language understanding through word importance embedding","authors":"","doi":"10.1016/j.knosys.2024.112404","DOIUrl":"10.1016/j.knosys.2024.112404","url":null,"abstract":"<div><p>Transformer-based models have achieved state-of-the-art performance on natural language understanding (NLU) tasks by learning important token relationships through the attention mechanism. However, we observe that attention can become overly distributed during fine-tuning, failing to preserve the dependencies between meaningful tokens adequately. This phenomenon negatively affects the learning of token relationships in sentences. To overcome this issue, we propose a methodology that embeds the feature of word importance (WI) in the transformer-based models as a new layer, weighting the words according to their importance. Our simple yet powerful approach offers a general technique to boost transformer model capabilities on NLU tasks by mitigating the risk of attention dispersion during fine-tuning. Through extensive experiments on GLUE, SuperGLUE, and SQuAD benchmarks for pre-trained models (BERT, RoBERTa, ELECTRA, and DeBERTa), and MMLU, Big Bench Hard, and DROP benchmarks for the large language model, Llama2, we validate the effectiveness of our method in consistently enhancing performance across models with negligible overhead. Furthermore, we validate that our WI layer better preserves the dependencies between important tokens than standard fine-tuning by introducing a model classifying dependent tokens from the learned attention weights. 
The code is available at <span>https://github.com/bigbases/WordImportance</span>.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142137422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing wind power generation prediction using relevance assessment-based transfer learning","authors":"","doi":"10.1016/j.knosys.2024.112417","DOIUrl":"10.1016/j.knosys.2024.112417","url":null,"abstract":"<div><p>Accurate wind power generation forecasting can help build a reliable grid; however, limited data make accurate forecasting challenging. This study introduces a relevance assessment-based transfer learning architecture to solve this problem. Linear fuzzy neighborhood mutual information is adopted to assess relevance when selecting the source domain. A convolutional structure with long-term memory architecture is designed as the deep learning model. The pre-trained model is transferred to the other wind turbines by calculating the linear fuzzy neighborhood mutual information. The proposed model mitigates the lack of data from analogous turbines. The simulation results indicate that the proposed model surpasses popular models in forecasting accuracy and exhibits superior time efficiency compared to popular deep-learning approaches.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142087898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-source domain adaptation using diffusion denoising for bearing fault diagnosis under variable working conditions","authors":"","doi":"10.1016/j.knosys.2024.112396","DOIUrl":"10.1016/j.knosys.2024.112396","url":null,"abstract":"<div><p>Multi-source domain adaptation is a promising transfer learning approach for fault diagnosis of rolling element bearings under variable working conditions. Data imbalance greatly affects the performance of multi-source domain adaptation, and GANs have been expected to solve it. However, GAN-based transfer learning diagnosis models suffer from mode collapse and training instability, leading to unsatisfactory diagnosis results in practical engineering. This paper proposes a denoising diffusion multi-source domain adaptation model (DDMDA). The proposed model uses diffusion denoising, which performs better and is simpler to train than GANs, to generate shifted source domains that address the data imbalance problem. A new noise prediction structure for diffusion denoising, named Utrans-net, is constructed to restore the data distribution in the shifted source domain. In addition, a multiple-domain discriminator structure is designed to extract features from multiple source domains to handle variable working conditions. The proposed model is validated by comparison with advanced models. 
Experiments show that the proposed model outperforms the comparison models.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142048863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Causal deconfounding deep reinforcement learning for mobile robot motion planning","authors":"","doi":"10.1016/j.knosys.2024.112406","DOIUrl":"10.1016/j.knosys.2024.112406","url":null,"abstract":"<div><p>Deep reinforcement learning (DRL) has emerged as an efficient approach for motion planning in mobile robot systems. It leverages the offline training process to enhance real-time computation efficiency. In DRL-based methods, the DRL models are trained to compute an action based on the current state of the robot and the surrounding obstacles. However, the trained models may capture spurious correlations through potential confounders, resulting in non-robust state representations, which limits the models’ robustness and generalizability. In this paper, we propose a Causal Deconfounding DRL method for Motion Planning, <span>CD-DRL-MP</span>, to address spurious correlations and learn robust and generalizable policies. Specifically, we formalize the temporal causal relationships between states and actions using a structural causal model. We then extract the minimal sufficient state representation set by blocking the backdoor paths in the causal model. Finally, using the representation set, <span>CD-DRL-MP</span> learns the causal effect between states and actions while mitigating the detrimental influence of potential confounders and computes motion commands for mobile robots. 
Comprehensive experiments show that the proposed method significantly outperforms non-causal DRL methods and existing causal methods, while guaranteeing good robustness and generalizability.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142087897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"UNIFY: A unified policy designing framework for solving integrated Constrained Optimization and Machine Learning problems","authors":"","doi":"10.1016/j.knosys.2024.112383","DOIUrl":"10.1016/j.knosys.2024.112383","url":null,"abstract":"<div><p>The integration of Machine Learning (ML) and Constrained Optimization (CO) techniques has recently gained significant interest. While pure CO methods struggle with scalability and robustness, and ML methods like constrained Reinforcement Learning (RL) face difficulties with combinatorial decision spaces and hard constraints, a hybrid approach shows promise. However, multi-stage decision-making under uncertainty remains challenging for current methods, which often rely on restrictive assumptions or specialized algorithms. This paper introduces <span>unify</span>, a versatile framework for tackling a wide range of problems, including multi-stage decision-making under uncertainty, using standard ML and CO components. <span>unify</span> integrates a CO problem with an unconstrained ML model through parameters controlled by the ML model, guiding the decision process. This ensures feasible decisions, minimal costs over time, and robustness to uncertainty. In the empirical evaluation, <span>unify</span> demonstrates its capability to address problems typically handled by Decision Focused Learning, Constrained RL, and Stochastic Optimization. While not always outperforming specialized methods, <span>unify</span>’s flexibility offers broader applicability and maintainability. 
The paper includes the method’s formalization and empirical evaluation through case studies in energy management and production scheduling, concluding with future research directions.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0950705124010177/pdfft?md5=381af6ebea374745dc3208dd45863745&pid=1-s2.0-S0950705124010177-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142083999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Disentanglement-inspired single-source domain-generalization network for cross-scene hyperspectral image classification","authors":"","doi":"10.1016/j.knosys.2024.112413","DOIUrl":"10.1016/j.knosys.2024.112413","url":null,"abstract":"<div><p>Cross-scene classification stands as a pivotal frontier in hyperspectral image (HSI) processing, aiming to enhance the generalization capabilities of classification models. However, the diversity of sensor types, shooting environments, and shooting times leads to the spectral heterogeneity problem in HSI. As a result, the same land cover may exhibit varying spectral traits in different domains, posing challenges for cross-scene HSI classification. Drawing inspiration from image disentanglement, we find that extracting the latent domain-invariant representation (DIR) of HSI could potentially mitigate the spectral heterogeneity issue. Therefore, we propose a Disentanglement-Inspired Single-Source Domain Generalization Network (DSDGnet) for cross-scene HSI classification. First, a style transfer module based on a Transformer encoder-transfer-decoder is designed to expand the single source domain to an extended domain. Then, a progressive disentanglement module is proposed to separate the domain-invariant features and domain-specific features of HSI. Furthermore, a domain combination module is designed to guarantee the accuracy of the progressive disentanglement module and ensure the effectiveness of the domain-invariant features of HSI. Finally, the domain-invariant features are applied to the classification task, and the domain-specific features are separated out to reduce their impact on the generalization ability of classification models. 
Extensive experiments on three HSI datasets have demonstrated the advanced classification performance of DSDGnet compared to existing domain-generalization methods.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142076515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MTLSC-Diff: Multitask learning with diffusion models for hyperspectral image super-resolution and classification","authors":"","doi":"10.1016/j.knosys.2024.112415","DOIUrl":"10.1016/j.knosys.2024.112415","url":null,"abstract":"<div><p>Hyperspectral image (HSI) super-resolution (SR) can improve the spatial resolution of images for subsequent application tasks. In recent years, SR methods based on deep learning have gained widespread attention. However, most existing SR methods do not take into account the needs of specific application tasks when designing the network structure. These methods may not be able to efficiently generate high-quality images that satisfy the specific application tasks, which degrades the performance of subsequent application tasks. To solve this problem, we propose a multi-task learning architecture based on the diffusion model, namely MTLSC-Diff. MTLSC-Diff combines the SR network and the classification network in a multi-task learning manner on the basis of the diffusion model. MTLSC-Diff achieves mutual guidance of the two tasks by iterating the image super-resolution and classification tasks, thus gradually reconstructing high-quality images and improving classification accuracy. The guided operations for each time step are performed by the specially designed Mutual-Guidance SR-Classification Synergy Module (M-GSCS). M-GSCS refines the multi-scale image obtained at the previous time step and uses the predicted high spatial resolution image for classification. Meanwhile, a class-guided SR dynamic refinement strategy (C-GSR) is proposed in M-GSCS, which uses multi-scale classification results to guide target scale images to learn new knowledge to further reconstruct high-quality images. 
Experimental results on relevant datasets show that our method significantly improves the super-resolution performance as well as the classification performance.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142076514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ensemble of classifiers based on score function defined by clusters and decision boundary of linear base learners","authors":"","doi":"10.1016/j.knosys.2024.112411","DOIUrl":"10.1016/j.knosys.2024.112411","url":null,"abstract":"<div><p>One possible type of base classifier output is a scoring function, which can be regarded as the probability that the class label is the true one. The measurement expressed by the score function can be of very different natures: distances, probabilities, or confidence. In this paper, we propose determining score function value based on two factors: the distance of the object from the decision boundary and the clustering of the object in the feature space. In the proposed framework, the above-mentioned scores are combined by a weighted average, and the classifier ensemble is created using the bagging or boosting technique. The proposed method was compared with five reference methods for determining the score function based on the distance between the object and the classifier decision boundary. The experimental results on 63 publicly available datasets clearly show that the proposed method outperforms the reference methods in the context of four classification performance metrics.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142083969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}