{"title":"Joint multimodal entity-relation extraction based on temporal enhancement and similarity-gated attention","authors":"","doi":"10.1016/j.knosys.2024.112504","DOIUrl":"10.1016/j.knosys.2024.112504","url":null,"abstract":"<div><p>Joint Multimodal Entity and Relation Extraction (JMERE), which needs to combine complex image information to extract entity-relation quintuples from text sequences, poses higher requirements on a model’s multimodal feature fusion and selection capabilities. With the advancement of large pre-trained language models, existing studies focus on improving the feature alignment between textual and visual modalities. However, there remains a noticeable gap in capturing the temporal information present in textual sequences. In addition, these methods exhibit a certain deficiency in distinguishing irrelevant images when integrating image and text features, making them susceptible to interference from image information unrelated to the text. To address these challenges, we propose a temporally enhanced and similarity-gated attention network (TESGA) for joint multimodal entity-relation extraction. Specifically, we first incorporate an LSTM-based Text Temporal Enhancement module to enhance the model’s ability to capture temporal information from the text. Next, we introduce a Text-Image Similarity-Gated Attention mechanism, which controls the degree to which image information is incorporated based on the consistency between image and text features. Subsequently, we design the entity and relation prediction module using a form-filling approach based on entity and relation types, and predict entity-relation quintuples. Notably, apart from the JMERE task, our approach can also be applied to other tasks involving text-visual enhancement, such as Multimodal Named Entity Recognition (MNER) and Multimodal Relation Extraction (MRE). 
To demonstrate the effectiveness of our approach, we extensively evaluate our model on three benchmark datasets, where it achieves state-of-the-art performance. Our code will be available upon paper acceptance.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142171653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"StAlK: Structural Alignment based Self Knowledge distillation for Medical Image Classification","authors":"","doi":"10.1016/j.knosys.2024.112503","DOIUrl":"10.1016/j.knosys.2024.112503","url":null,"abstract":"<div><p>In the realm of medical image analysis, where challenges like high class imbalance, inter-class similarity, and intra-class variance are prevalent, knowledge distillation has emerged as a powerful mechanism for model compression and regularization. Existing methodologies, including label smoothing, contrastive learning, and relational knowledge transfer, aim to address these challenges but exhibit limitations in effectively managing either class imbalance or intricate inter- and intra-class relations within input samples. In response, this paper introduces StAlK (<strong>St</strong>ructural <strong>Al</strong>ignment based Self <strong>K</strong>nowledge distillation) for Medical Image Classification, a novel approach that leverages the alignment of complex high-order discriminative features from a mean teacher model. This alignment enhances the student model’s ability to distinguish examples across different classes. StAlK demonstrates superior performance in scenarios involving both inter- and intra-class relationships and proves significantly more robust in handling class imbalance compared to baseline methods. Extensive investigations across multiple benchmark datasets reveal that StAlK achieves a substantial improvement of 6%–7% in top-1 accuracy compared to various state-of-the-art baselines. 
The code is available at: <span><span>https://github.com/philsaurabh/StAlK_KBS</span></span>.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142171748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Physically-guided temporal diffusion transformer for long-term time series forecasting","authors":"","doi":"10.1016/j.knosys.2024.112508","DOIUrl":"10.1016/j.knosys.2024.112508","url":null,"abstract":"<div><p>Transformer has shown excellent performance in long-term time series forecasting because of its capability to capture long-term dependencies. However, existing Transformer-based approaches often overlook the unique characteristics inherent to time series, particularly multi-scale periodicity, which leads to a gap in inductive biases. To address this oversight, the temporal diffusion Transformer (TDT) was developed in this study to reveal the intrinsic evolution processes of time series. First, to uncover the connections among the periods of multi-periodic time series, the series are transformed into various types of patches using a multi-scale Patch method. Inspired by the principles of heat conduction, TDT conceptualizes the evolution of a time series as a diffusion process. TDT aims to achieve global consistency by minimizing energy constraints, which is accomplished through the iterative updating of patches. Finally, the results of these iterations across multiple periods are aggregated to form the TDT output. Compared to previous advanced models, TDT achieved state-of-the-art predictive performance in our experiments. In most datasets, TDT outperformed the baseline model by approximately 2% in terms of mean square error (MSE) and mean absolute error (MAE). Its effectiveness was further validated through ablation, efficiency, and hyperparameter analyses. 
TDT offers intuitive explanations by elucidating the diffusion process of time series patches throughout the iterative procedure.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142171650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multidimensional time series motif group discovery based on matrix profile","authors":"","doi":"10.1016/j.knosys.2024.112509","DOIUrl":"10.1016/j.knosys.2024.112509","url":null,"abstract":"<div><p>With the continuous advancements in sensor technology and the increasing capabilities for data collection and storage, the acquisition of time series data across diverse domains has become significantly easier. Consequently, there is a growing demand for identifying potential motifs within multidimensional time series. The introduction of the Matrix Profile (MP) structure and the mSTOMP algorithm enables the detection of multidimensional motifs in large-scale time series datasets. However, the MP does not provide information regarding the frequency of occurrence of these motifs. As a result, it is challenging to determine whether a motif appears frequently or to identify the specific time periods during which it typically occurs, thereby limiting further analysis of the discovered motifs. To address this limitation, we propose the Index Link Motif Group Discovery (ILMGD) algorithm, which uses index linking to rapidly merge and group multidimensional motifs. Based on the results of the ILMGD algorithm, we can determine the frequency and temporal positions of motifs, facilitating deeper analysis. Our proposed method requires minimal additional parameters and reduces the need for extensive manual intervention. 
We validate the effectiveness of our algorithm on synthetic datasets and demonstrate its applicability on three real-world datasets, highlighting how it enables a comprehensive understanding of the discovered motifs.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142168771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CIRA: Class imbalance resilient adaptive Gaussian process classifier","authors":"","doi":"10.1016/j.knosys.2024.112500","DOIUrl":"10.1016/j.knosys.2024.112500","url":null,"abstract":"<div><p>The problem of class imbalance is pervasive across various real-world applications, resulting in machine learning classifiers exhibiting bias towards majority classes. Algorithm-level balancing approaches adapt the machine learning algorithms to learn from imbalanced datasets while preserving the data’s original distribution. The Gaussian process classifier is a powerful machine learning classification algorithm; however, as with other standard classifiers, its classification performance can be degraded by class imbalance. In this work, we propose the Class Imbalance Resilient Adaptive Gaussian process classifier (CIRA), an algorithm-level adaptation of the binary Gaussian process classifier to alleviate class imbalance. To the best of our knowledge, the proposed algorithm (CIRA) is the first adaptive method for the Gaussian process classifier to handle imbalanced data. The proposed CIRA algorithm consists of two balancing modifications to the original classifier. The first modification balances the posterior mean approximation to learn a more balanced decision boundary between the majority and minority classes. The second modification adopts an asymmetric conditional prediction model to give more emphasis to the minority points during the training process. We conduct extensive experiments and statistical significance tests on forty-two real-world imbalanced datasets. 
Through the experiments, our proposed CIRA algorithm surpasses six popular data sampling methods by an average of 2.29%, 3.25%, 3.67%, and 1.81% in terms of the Geometric mean, F1-measure, Matthews correlation coefficient, and Area under the receiver operating characteristic curve, respectively.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142162270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A survey on temporal knowledge graph embedding: Models and applications","authors":"","doi":"10.1016/j.knosys.2024.112454","DOIUrl":"10.1016/j.knosys.2024.112454","url":null,"abstract":"<div><p>Knowledge graph embedding (KGE), as a pivotal technology in artificial intelligence, plays a significant role in enhancing the logical reasoning and management efficiency of downstream tasks in knowledge graphs (KGs). It maps the intricate structure of a KG to a continuous vector space. Conventional KGE techniques primarily focus on representing static data within a KG. However, in the real world, facts frequently change over time, as exemplified by evolving social relationships and news events. The effective utilization of embedding technologies to represent KGs that integrate temporal data has gained significant scholarly interest. This paper comprehensively reviews the existing methods for learning KG representations that incorporate temporal data. It offers a highly intuitive perspective by categorizing temporal KGE (TKGE) methods into seven main classes based on dynamic evolution models and extensions of static KGE. The review covers various aspects of TKGE, including the background, problem definition, symbolic representation, training process, commonly used datasets, evaluation schemes, and relevant research. Furthermore, detailed descriptions of related embedding models are provided, followed by an introduction to typical downstream tasks in temporal KG scenarios. 
Finally, the paper concludes by summarizing the challenges faced in TKGE and outlining future research directions.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0950705124010888/pdfft?md5=6825155cbc22973e3b9d0b91ab9c11af&pid=1-s2.0-S0950705124010888-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142162521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing multi-time series forecasting for enhanced cloud resource utilization based on machine learning","authors":"","doi":"10.1016/j.knosys.2024.112489","DOIUrl":"10.1016/j.knosys.2024.112489","url":null,"abstract":"<div><p>Due to its flexibility, cloud computing has become essential in modern operational schemes. However, the effective management of cloud resources to ensure cost-effectiveness and maintain high performance presents significant challenges. The pay-as-you-go pricing model, while convenient, can lead to escalated expenses and hinder long-term planning. Consequently, FinOps advocates proactive management strategies, with resource usage prediction emerging as a crucial optimization category. In this research, we introduce the multi-time series forecasting system (MSFS), a novel approach for data-driven resource optimization alongside the hybrid ensemble anomaly detection algorithm (HEADA). Our method prioritizes the concept-centric approach, focusing on factors such as prediction uncertainty, interpretability and domain-specific measures. Furthermore, we introduce the similarity-based time-series grouping (STG) method as a core component of MSFS for optimizing multi-time series forecasting, ensuring its scalability with the rapid growth of the cloud environment. 
Our experiments demonstrate that the group-specific forecasting model (GSFM) approach enabled MSFS to achieve a significant cost reduction of up to 44%.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0950705124011237/pdfft?md5=f19c1aa29695016ff8f758ff70605e16&pid=1-s2.0-S0950705124011237-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142162268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CENN: Capsule-enhanced neural network with innovative metrics for robust speech emotion recognition","authors":"","doi":"10.1016/j.knosys.2024.112499","DOIUrl":"10.1016/j.knosys.2024.112499","url":null,"abstract":"<div><p>Speech emotion recognition (SER) plays a pivotal role in enhancing Human-computer interaction (HCI) systems. This paper introduces a groundbreaking Capsule-enhanced neural network (CENN) that significantly advances the state of SER through a robust and reproducible deep learning framework. The CENN architecture seamlessly integrates advanced components, including Multi-head attention (MHA), residual module, and capsule module, which collectively enhance the model's capacity to capture both global and local features essential for precise emotion classification. A key contribution of this work is the development of a comprehensive reproducibility framework, featuring novel metrics: General learning reproducibility (GLR) and Correct learning reproducibility (CLR). These metrics, alongside their fractional and perfect variants, offer a multi-dimensional evaluation of the model's consistency and correctness across multiple executions, thereby ensuring the reliability and credibility of the results. To tackle the persistent challenge of overfitting in deep learning models, we propose an innovative overfitting metric that considers the intricate relationship between training and testing errors, model complexity, and data complexity. This metric, in conjunction with the newly introduced generalization and robustness metrics, provides a holistic assessment of the model's performance, guiding the application of regularization techniques to maintain generalizability and resilience. Extensive experiments conducted on benchmark SER datasets demonstrate that the CENN model not only surpasses existing approaches in terms of accuracy but also sets a new benchmark in reproducibility. 
This work establishes a new paradigm for deep learning model development in SER, underscoring the vital importance of reproducibility and offering a rigorous framework for future research.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142168961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Elliptic geometry-based kernel matrix for improved biological sequence classification","authors":"","doi":"10.1016/j.knosys.2024.112479","DOIUrl":"10.1016/j.knosys.2024.112479","url":null,"abstract":"<div><p>Protein sequence classification plays a pivotal role in bioinformatics as it enables the comprehension of protein functions and their involvement in diverse biological processes. While numerous machine learning models have been proposed to tackle this challenge, traditional approaches face limitations in capturing the intricate relationships and hierarchical structures inherent in genomic sequences. These limitations stem from operating within high-dimensional non-Euclidean spaces. To address this issue, we introduce an elliptic geometry-based approach for protein sequence classification. First, we recast the problem in elliptic geometry and integrate it with the Gaussian kernel to obtain a Mercer kernel. The Gaussian-Elliptic approach allows for the implicit mapping of data into a higher-dimensional feature space, enabling the capture of complex nonlinear relationships. This feature becomes particularly advantageous when dealing with hierarchical or tree-like structures commonly encountered in biological sequences. Experimental results highlight the effectiveness of the proposed model in protein sequence classification, showcasing the advantages of utilizing elliptic geometry in bioinformatics analyses. It outperforms state-of-the-art methods, achieving accuracies of 76% and 84% on the DNA and Protein datasets, respectively. Furthermore, we provide theoretical justifications for the proposed model. 
This study contributes to the burgeoning field of geometric deep learning, offering insights into the potential applications of elliptic representations in the analysis of biological data.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142162520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-timescale attention residual shrinkage network with adaptive global-local denoising for rolling-bearing fault diagnosis","authors":"","doi":"10.1016/j.knosys.2024.112478","DOIUrl":"10.1016/j.knosys.2024.112478","url":null,"abstract":"<div><p>In actual engineering scenarios, bearing fault signals are inevitably overwhelmed by strong background noise from various sources. However, most deep-learning-based diagnostic models tend to broaden the feature extraction scale to extract rich fault features for bearing-fault identification under noise interference, with little attention paid to multi-timescale discriminative feature mining with adaptive noise rejection, which affects the diagnostic performance. Thus, a multi-timescale attention residual shrinkage network with adaptive global-local denoising (AMARSN) was proposed for rolling-bearing fault diagnosis by learning discriminative multi-timescale fault features from signals and fully eliminating noise components in the multi-timescale fault features. First, a multi-timescale attention learning module (MALMod) was developed to capture multi-timescale fault features and enhance their discriminability under noise interference. Subsequently, an adaptive global-local denoising module (AGDMod) was constructed to fully eliminate noise in multiscale fault features by constructing specific global-local denoising thresholds and designing an adaptive smooth soft thresholding function. Finally, end-to-end bearing fault diagnosis tasks were realized using a softmax classifier located at the end of the AMARSN. The AMARSN was validated using two bearing datasets. 
The extensive results demonstrated that the AMARSN can mine more effective fault features from signals and achieve average diagnostic accuracies of 85.24% and 80.09% under different noise levels.</p></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142171649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}