{"title":"EIGP: document-level event argument extraction with information enhancement generated based on prompts","authors":"Kai Liu, Hui Zhao, Zicong Wang, Qianxi Hou","doi":"10.1007/s10115-024-02213-4","DOIUrl":"https://doi.org/10.1007/s10115-024-02213-4","url":null,"abstract":"<p>The event argument extraction (EAE) task primarily aims to identify event arguments and their specific roles within a given event. Existing generation-based event argument extraction models, including the recent ones focused on document-level event argument extraction, emphasize the construction of prompt templates and entity representations. However, they overlook the inadequate comprehension of model in document context structure information and the impact of arguments spanning a wide range on event argument extraction. Consequently, this results in reduced model detection accuracy. In this paper, we propose a prompt-based generation event argument extraction model with the ability of document structure information enhancement for document-level event argument extraction task based on prompt generation. Specifically, we use sentence abstract meaning representation (AMR) to represent the contextual structural information of the document, and then remove the redundant parts of the structural information through constraints to obtain the constraint graph with the document information. Finally, we use the encoder to convert the graph into the corresponding dense vector. We inject these vectors with contextual structural information into the prompt-based generation EAE model in a prefixed manner. When contextual information and prompt templates interact at the attention layer of the model, the generated structural information improves the generation by affecting attention. We conducted experiments on RAMS and WIKIEVENTS datasets, and the results show that our model achieves excellent results compared with the current advanced generative EAE model.\u0000</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"8 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computing marginal and conditional divergences between decomposable models with applications in quantum computing and earth observation","authors":"Loong Kuan Lee, Geoffrey I. Webb, Daniel F. Schmidt, Nico Piatkowski","doi":"10.1007/s10115-024-02191-7","DOIUrl":"https://doi.org/10.1007/s10115-024-02191-7","url":null,"abstract":"<p>The ability to compute the exact divergence between two high-dimensional distributions is useful in many applications, but doing so naively is intractable. Computing the <span>(alpha beta )</span>-divergence—a family of divergences that includes the Kullback–Leibler divergence and Hellinger distance—between the joint distribution of two decomposable models, i.e., chordal Markov networks, can be done in time exponential in the treewidth of these models. Extending this result, we propose an approach to compute the exact <span>(alpha beta )</span>-divergence between any marginal or conditional distribution of two decomposable models. In order to do so tractably, we provide a decomposition over the marginal and conditional distributions of decomposable models. We then show how our method can be used to analyze distributional changes by first applying it to the benchmark image dataset QMNIST and a dataset containing observations from various areas at the Roosevelt Nation Forest and their cover type. Finally, based on our framework, we propose a novel way to quantify the error in contemporary superconducting quantum computers.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"97 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GoSum: extractive summarization of long documents by reinforcement learning and graph-organized discourse state","authors":"Junyi Bian, Xiaodi Huang, Hong Zhou, Tianyang Huang, Shanfeng Zhu","doi":"10.1007/s10115-024-02195-3","DOIUrl":"https://doi.org/10.1007/s10115-024-02195-3","url":null,"abstract":"<p>Summarizing extensive documents involves selecting sentences, with the organizational structure of document sections playing a pivotal role. However, effectively utilizing discourse information for summary generation poses a significant challenge, especially given the inconsistency between training and evaluation in extractive summarization. In this paper, we introduce GoSum, a novel extractive summarizer that integrates a graph-based model with reinforcement learning techniques to summarize long documents. Specifically, GoSum utilizes a graph neural network to encode sentence states, constructing a heterogeneous graph that represents each document at various discourse levels. The edges of this graph capture hierarchical relationships between different document sections. Furthermore, GoSum incorporates offline reinforcement learning, enabling the model to receive ROUGE score feedback on diverse training samples, thereby enhancing the quality of summary generation. On the two scientific article datasets PubMed and arXiv, GoSum achieved the highest performance among extractive models. Particularly on the PubMed dataset, GoSum outperformed other models with ROUGE-1 and ROUGE-L scores surpassing by 0.45 and 0.26, respectively.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"10 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Probabilistic temporal semantic graph: a holistic framework for event detection in twitter","authors":"Hadis Bashiri, Hassan Naderi","doi":"10.1007/s10115-024-02208-1","DOIUrl":"https://doi.org/10.1007/s10115-024-02208-1","url":null,"abstract":"<p>Event detection on social media platforms, especially Twitter, poses significant challenges due to the dynamic nature and high volume of data. The rapid flow of tweets and the varied ways users express thoughts complicate the identification of relevant events. Accurately identifying and interpreting events from this noisy and fast-paced environment is crucial for various applications, including crisis management and market analysis. This paper presents a novel unsupervised framework for event detection on social media, designed to enhance the accuracy and efficiency of identifying significant events from Twitter data. The framework incorporates several innovative techniques, including dynamic bandwidth adjustment based on local data density, Mahalanobis distance integration, adaptive kernel density estimation, and an improved Louvain-MOMR method for community detection. Additionally, a new scoring system is implemented to accurately extract trending words that evoke strong emotions, improving the identification of event-related keywords. The proposed framework demonstrates robust performance across three diverse datasets: FACup, Super Tuesday, and US Election, showcasing its effectiveness in capturing temporal and semantic patterns within tweets.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"93 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"REDAffectiveLM: leveraging affect enriched embedding and transformer-based neural language model for readers’ emotion detection","authors":"Anoop Kadan, P. Deepak, Manjary P. Gangan, Sam Savitha Abraham, V. L. Lajish","doi":"10.1007/s10115-024-02194-4","DOIUrl":"https://doi.org/10.1007/s10115-024-02194-4","url":null,"abstract":"<p>Technological advancements in web platforms allow people to express and share emotions toward textual write-ups written and shared by others. This brings about different interesting domains for analysis, emotion expressed by the writer and emotion elicited from the readers. In this paper, we propose a novel approach for readers’ emotion detection from short-text documents using a deep learning model called <i>REDAffectiveLM</i>. Within state-of-the-art NLP tasks, it is well understood that utilizing context-specific representations from transformer-based pre-trained language models helps achieve improved performance. Within this affective computing task, we explore how incorporating affective information can further enhance performance. Toward this, we leverage context-specific and affect enriched representations by using a transformer-based pre-trained language model in tandem with affect enriched Bi-LSTM+Attention. For empirical evaluation, we procure a new dataset REN-20k, besides using RENh-4k and SemEval-2007. We evaluate the performance of our <i>REDAffectiveLM</i> rigorously across these datasets, against a vast set of state-of-the-art baselines, where our model consistently outperforms baselines and obtains statistically significant results. Our results establish that utilizing affect enriched representation along with context-specific representation within a neural architecture can considerably enhance readers’ emotion detection. Since the impact of affect enrichment specifically in readers’ emotion detection isn’t well explored, we conduct a detailed analysis over affect enriched Bi-LSTM+Attention using qualitative and quantitative model behavior evaluation techniques. We observe that compared to conventional semantic embedding, affect enriched embedding increases the ability of the network to effectively identify and assign weightage to the key terms responsible for readers’ emotion detection to improve prediction.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"29 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Aspect-based sentiment analysis: approaches, applications, challenges and trends","authors":"Deena Nath, Sanjay K. Dwivedi","doi":"10.1007/s10115-024-02200-9","DOIUrl":"https://doi.org/10.1007/s10115-024-02200-9","url":null,"abstract":"<p>Sentiment analysis (SA) is a technique that employs natural language processing to determine the function of mining methodically, extract, analyse and comprehend people’s thoughts, feelings, personal opinions and perceptions as well as their reactions and attitude regarding various subjects such as topics, commodities and various other products and services. However, it only reveals the overall sentiment. Unlike SA, the aspect-based sentiment analysis (ABSA) study categorizes a text into distinct components and determines the appropriate sentiment, which is more reliable in its predictions. Hence, ABSA is essential to study and break down texts into various service elements. It then assigns the appropriate sentiment polarity (positive, negative or neutral) for every aspect. In this paper, the main task is to critically review the research outcomes to look at the various techniques, methods and features used for ABSA. After giving brief introduction of SA in order to establish a clear relationship between SA and ABSA, we focussed on approaches, applications, challenges and trends in ABSA research.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"50 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Complementary incomplete weighted concept factorization methods for multi-view clustering","authors":"Ghufran Ahmad Khan, Jalaluddin Khan, Taushif Anwar, Zaid Al-Huda, Bassoma Diallo, Naved Ahmad","doi":"10.1007/s10115-024-02197-1","DOIUrl":"https://doi.org/10.1007/s10115-024-02197-1","url":null,"abstract":"<p>The main aim of traditional multi-view clustering is to categorize data into separate clusters under the assumption that all views are fully available. However, practical scenarios often arise where not all aspects of the data are accessible, which hampers the efficacy of conventional multi-view clustering techniques. Recent advancements have made significant progress in addressing the incompleteness in multi-view data clustering. Still, current incomplete multi-view clustering methods overlooked a number of important factors, such as providing a consensus representation across the kernel space, dealing with over-fitting issue from different views, and looking at how these multiple views relate to each other at the same time. To deal these challenges, we introduced an innovative multi-view clustering algorithm to manage incomplete data from multiple perspectives. Additionally, we have introduced a novel objective function incorporating a weighted concept factorization technique to tackle the absence of data instances within each incomplete viewpoint. We used a co-regularization constraint to learn a common shared structure from different points of view and a smooth regularization term to prevent view over-fitting. It is noteworthy that the proposed objective function is inherently non-convex, presenting optimization challenges. To obtain the optimal solution, we have implemented an iterative optimization approach to converge the local minima for our method. To underscore the effectiveness and validation of our approach, we conducted experiments using real-world datasets against state-of-the-art methods for comparative evaluation.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"57 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hyperparameter elegance: fine-tuning text analysis with enhanced genetic algorithm hyperparameter landscape","authors":"Gyananjaya Tripathy, Aakanksha Sharaff","doi":"10.1007/s10115-024-02202-7","DOIUrl":"https://doi.org/10.1007/s10115-024-02202-7","url":null,"abstract":"<p>Due to the significant participation of the users, it is highly challenging to handle enormous datasets using machine learning algorithms. Deep learning methods are therefore designed with efficient hyperparameter sets to enhance the processing of the vast corpus. Different hyperparameter tuning models have been used previously in various studies. Still, tuning the deep learning models with the greatest possible number of hyperparameters has not yet been possible. This study developed a modified optimization methodology for effective hyperparameter identification, addressing the shortcomings of the previous studies. To get the optimum outcome, an enhanced genetic algorithm is used with modified crossover and mutation. The method has the ability to tune several hyperparameters simultaneously. The benchmark datasets for online reviews show outstanding results from the proposed methodology. The outcome demonstrates that the presented enhanced genetic algorithm-based hyperparameter tuning model performs better than other standard approaches with 88.73% classification accuracy, 87.31% sensitivity, 90.15% specificity, and 88.58% F-score value for the IMDB dataset and 92.17% classification accuracy, 91.89% sensitivity, 92.47% specificity, and 92.50% F-score value for the Yelp dataset while requiring less processing effort. To further enhance the performance, attention mechanism is applied to the designed model, achieving 89.62% accuracy, 88.59% sensitivity, 91.89% specificity, and 89.35% F-score with the IMDB dataset and 93.29% accuracy, 92.04% sensitivity, 93.22% specificity, and 92.98% F-score with the Yelp dataset.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"18 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive moving average Q-learning","authors":"Tao Tan, Hong Xie, Yunni Xia, Xiaoyu Shi, Mingsheng Shang","doi":"10.1007/s10115-024-02190-8","DOIUrl":"https://doi.org/10.1007/s10115-024-02190-8","url":null,"abstract":"<p>A variety of algorithms have been proposed to address the long-standing overestimation bias problem of Q-learning. Reducing this overestimation bias may lead to an underestimation bias, such as double Q-learning. However, it is still unclear how to make a good balance between overestimation and underestimation. We present a simple yet effective algorithm to fill in this gap and call Moving Average Q-learning. Specifically, we maintain two dependent Q-estimators. The first one is used to estimate the maximum expected Q-value. The second one is used to select the optimal action. In particular, the second estimator is the moving average of historical Q-values generated by the first estimator. The second estimator has only one hyperparameter, namely the moving average parameter. This parameter controls the dependence between the second estimator and the first estimator, ranging from independent to identical. Based on Moving Average Q-learning, we design an adaptive strategy to select the moving average parameter, resulting in AdaMA (<u>Ada</u>ptive <u>M</u>oving <u>A</u>verage) Q-learning. This adaptive strategy is a simple function, where the moving average parameter increases monotonically with the number of state–action pairs visited. Moreover, we extend AdaMA Q-learning to AdaMA DQN in high-dimensional environments. Extensive experiment results reveal why Moving Average Q-learning and AdaMA Q-learning can mitigate the overestimation bias, and also show that AdaMA Q-learning and AdaMA DQN outperform SOTA baselines drastically. In particular, when compared with the overestimated value of 1.66 in Q-learning, AdaMA Q-learning underestimates by 0.196, resulting in an improvement of 88.19%.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"372 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141939781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Community detection in social networks using machine learning: a systematic mapping study","authors":"Mahsa Nooribakhsh, Marta Fernández-Diego, Fernando González-Ladrón-De-Guevara, Mahdi Mollamotalebi","doi":"10.1007/s10115-024-02201-8","DOIUrl":"https://doi.org/10.1007/s10115-024-02201-8","url":null,"abstract":"<p>One of the important issues in social networks is the social communities which are formed by interactions between its members. Three types of community including overlapping, non-overlapping, and hidden are detected by different approaches. Regarding the importance of community detection in social networks, this paper provides a systematic mapping of machine learning-based community detection approaches. The study aimed to show the type of communities in social networks along with the algorithms of machine learning that have been used for community detection. After carrying out the steps of mapping and removing useless references, 246 papers were selected to answer the questions of this research. The results of the research indicated that unsupervised machine learning-based algorithms with 41.46% (such as <i>k</i> means) are the most used categories to detect communities in social networks due to their low processing overheads. On the other hand, there has been a significant increase in the use of deep learning since 2020 which has sufficient performance for community detection in large-volume data. With regard to the ability of NMI to measure the correlation or similarity between communities, with 53.25%, it is the most frequently used metric to evaluate the performance of community identifications. Furthermore, considering availability, low in size, and lack of multiple edge and loops, dataset Zachary’s Karate Club with 26.42% is the most used dataset for community detection research in social networks.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"53 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141939780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}