{"title":"EIGP: document-level event argument extraction with information enhancement generated based on prompts","authors":"Kai Liu, Hui Zhao, Zicong Wang, Qianxi Hou","doi":"10.1007/s10115-024-02213-4","DOIUrl":"https://doi.org/10.1007/s10115-024-02213-4","url":null,"abstract":"<p>The event argument extraction (EAE) task primarily aims to identify event arguments and their specific roles within a given event. Existing generation-based event argument extraction models, including the recent ones focused on document-level event argument extraction, emphasize the construction of prompt templates and entity representations. However, they overlook the inadequate comprehension of model in document context structure information and the impact of arguments spanning a wide range on event argument extraction. Consequently, this results in reduced model detection accuracy. In this paper, we propose a prompt-based generation event argument extraction model with the ability of document structure information enhancement for document-level event argument extraction task based on prompt generation. Specifically, we use sentence abstract meaning representation (AMR) to represent the contextual structural information of the document, and then remove the redundant parts of the structural information through constraints to obtain the constraint graph with the document information. Finally, we use the encoder to convert the graph into the corresponding dense vector. We inject these vectors with contextual structural information into the prompt-based generation EAE model in a prefixed manner. When contextual information and prompt templates interact at the attention layer of the model, the generated structural information improves the generation by affecting attention. We conducted experiments on RAMS and WIKIEVENTS datasets, and the results show that our model achieves excellent results compared with the current advanced generative EAE model.\u0000</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"8 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computing marginal and conditional divergences between decomposable models with applications in quantum computing and earth observation","authors":"Loong Kuan Lee, Geoffrey I. Webb, Daniel F. Schmidt, Nico Piatkowski","doi":"10.1007/s10115-024-02191-7","DOIUrl":"https://doi.org/10.1007/s10115-024-02191-7","url":null,"abstract":"<p>The ability to compute the exact divergence between two high-dimensional distributions is useful in many applications, but doing so naively is intractable. Computing the <span>(alpha beta )</span>-divergence—a family of divergences that includes the Kullback–Leibler divergence and Hellinger distance—between the joint distribution of two decomposable models, i.e., chordal Markov networks, can be done in time exponential in the treewidth of these models. Extending this result, we propose an approach to compute the exact <span>(alpha beta )</span>-divergence between any marginal or conditional distribution of two decomposable models. In order to do so tractably, we provide a decomposition over the marginal and conditional distributions of decomposable models. We then show how our method can be used to analyze distributional changes by first applying it to the benchmark image dataset QMNIST and a dataset containing observations from various areas at the Roosevelt Nation Forest and their cover type. Finally, based on our framework, we propose a novel way to quantify the error in contemporary superconducting quantum computers.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"97 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GoSum: extractive summarization of long documents by reinforcement learning and graph-organized discourse state","authors":"Junyi Bian, Xiaodi Huang, Hong Zhou, Tianyang Huang, Shanfeng Zhu","doi":"10.1007/s10115-024-02195-3","DOIUrl":"https://doi.org/10.1007/s10115-024-02195-3","url":null,"abstract":"<p>Summarizing extensive documents involves selecting sentences, with the organizational structure of document sections playing a pivotal role. However, effectively utilizing discourse information for summary generation poses a significant challenge, especially given the inconsistency between training and evaluation in extractive summarization. In this paper, we introduce GoSum, a novel extractive summarizer that integrates a graph-based model with reinforcement learning techniques to summarize long documents. Specifically, GoSum utilizes a graph neural network to encode sentence states, constructing a heterogeneous graph that represents each document at various discourse levels. The edges of this graph capture hierarchical relationships between different document sections. Furthermore, GoSum incorporates offline reinforcement learning, enabling the model to receive ROUGE score feedback on diverse training samples, thereby enhancing the quality of summary generation. On the two scientific article datasets PubMed and arXiv, GoSum achieved the highest performance among extractive models. Particularly on the PubMed dataset, GoSum outperformed other models with ROUGE-1 and ROUGE-L scores surpassing by 0.45 and 0.26, respectively.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"10 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Probabilistic temporal semantic graph: a holistic framework for event detection in twitter","authors":"Hadis Bashiri, Hassan Naderi","doi":"10.1007/s10115-024-02208-1","DOIUrl":"https://doi.org/10.1007/s10115-024-02208-1","url":null,"abstract":"<p>Event detection on social media platforms, especially Twitter, poses significant challenges due to the dynamic nature and high volume of data. The rapid flow of tweets and the varied ways users express thoughts complicate the identification of relevant events. Accurately identifying and interpreting events from this noisy and fast-paced environment is crucial for various applications, including crisis management and market analysis. This paper presents a novel unsupervised framework for event detection on social media, designed to enhance the accuracy and efficiency of identifying significant events from Twitter data. The framework incorporates several innovative techniques, including dynamic bandwidth adjustment based on local data density, Mahalanobis distance integration, adaptive kernel density estimation, and an improved Louvain-MOMR method for community detection. Additionally, a new scoring system is implemented to accurately extract trending words that evoke strong emotions, improving the identification of event-related keywords. The proposed framework demonstrates robust performance across three diverse datasets: FACup, Super Tuesday, and US Election, showcasing its effectiveness in capturing temporal and semantic patterns within tweets.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"93 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"REDAffectiveLM: leveraging affect enriched embedding and transformer-based neural language model for readers’ emotion detection","authors":"Anoop Kadan, P. Deepak, Manjary P. Gangan, Sam Savitha Abraham, V. L. Lajish","doi":"10.1007/s10115-024-02194-4","DOIUrl":"https://doi.org/10.1007/s10115-024-02194-4","url":null,"abstract":"<p>Technological advancements in web platforms allow people to express and share emotions toward textual write-ups written and shared by others. This brings about different interesting domains for analysis, emotion expressed by the writer and emotion elicited from the readers. In this paper, we propose a novel approach for readers’ emotion detection from short-text documents using a deep learning model called <i>REDAffectiveLM</i>. Within state-of-the-art NLP tasks, it is well understood that utilizing context-specific representations from transformer-based pre-trained language models helps achieve improved performance. Within this affective computing task, we explore how incorporating affective information can further enhance performance. Toward this, we leverage context-specific and affect enriched representations by using a transformer-based pre-trained language model in tandem with affect enriched Bi-LSTM+Attention. For empirical evaluation, we procure a new dataset REN-20k, besides using RENh-4k and SemEval-2007. We evaluate the performance of our <i>REDAffectiveLM</i> rigorously across these datasets, against a vast set of state-of-the-art baselines, where our model consistently outperforms baselines and obtains statistically significant results. Our results establish that utilizing affect enriched representation along with context-specific representation within a neural architecture can considerably enhance readers’ emotion detection. Since the impact of affect enrichment specifically in readers’ emotion detection isn’t well explored, we conduct a detailed analysis over affect enriched Bi-LSTM+Attention using qualitative and quantitative model behavior evaluation techniques. We observe that compared to conventional semantic embedding, affect enriched embedding increases the ability of the network to effectively identify and assign weightage to the key terms responsible for readers’ emotion detection to improve prediction.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"29 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Aspect-based sentiment analysis: approaches, applications, challenges and trends","authors":"Deena Nath, Sanjay K. Dwivedi","doi":"10.1007/s10115-024-02200-9","DOIUrl":"https://doi.org/10.1007/s10115-024-02200-9","url":null,"abstract":"<p>Sentiment analysis (SA) is a technique that employs natural language processing to determine the function of mining methodically, extract, analyse and comprehend people’s thoughts, feelings, personal opinions and perceptions as well as their reactions and attitude regarding various subjects such as topics, commodities and various other products and services. However, it only reveals the overall sentiment. Unlike SA, the aspect-based sentiment analysis (ABSA) study categorizes a text into distinct components and determines the appropriate sentiment, which is more reliable in its predictions. Hence, ABSA is essential to study and break down texts into various service elements. It then assigns the appropriate sentiment polarity (positive, negative or neutral) for every aspect. In this paper, the main task is to critically review the research outcomes to look at the various techniques, methods and features used for ABSA. After giving brief introduction of SA in order to establish a clear relationship between SA and ABSA, we focussed on approaches, applications, challenges and trends in ABSA research.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"50 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Complementary incomplete weighted concept factorization methods for multi-view clustering","authors":"Ghufran Ahmad Khan, Jalaluddin Khan, Taushif Anwar, Zaid Al-Huda, Bassoma Diallo, Naved Ahmad","doi":"10.1007/s10115-024-02197-1","DOIUrl":"https://doi.org/10.1007/s10115-024-02197-1","url":null,"abstract":"<p>The main aim of traditional multi-view clustering is to categorize data into separate clusters under the assumption that all views are fully available. However, practical scenarios often arise where not all aspects of the data are accessible, which hampers the efficacy of conventional multi-view clustering techniques. Recent advancements have made significant progress in addressing the incompleteness in multi-view data clustering. Still, current incomplete multi-view clustering methods overlooked a number of important factors, such as providing a consensus representation across the kernel space, dealing with over-fitting issue from different views, and looking at how these multiple views relate to each other at the same time. To deal these challenges, we introduced an innovative multi-view clustering algorithm to manage incomplete data from multiple perspectives. Additionally, we have introduced a novel objective function incorporating a weighted concept factorization technique to tackle the absence of data instances within each incomplete viewpoint. We used a co-regularization constraint to learn a common shared structure from different points of view and a smooth regularization term to prevent view over-fitting. It is noteworthy that the proposed objective function is inherently non-convex, presenting optimization challenges. To obtain the optimal solution, we have implemented an iterative optimization approach to converge the local minima for our method. To underscore the effectiveness and validation of our approach, we conducted experiments using real-world datasets against state-of-the-art methods for comparative evaluation.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"57 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hyperparameter elegance: fine-tuning text analysis with enhanced genetic algorithm hyperparameter landscape","authors":"Gyananjaya Tripathy, Aakanksha Sharaff","doi":"10.1007/s10115-024-02202-7","DOIUrl":"https://doi.org/10.1007/s10115-024-02202-7","url":null,"abstract":"<p>Due to the significant participation of the users, it is highly challenging to handle enormous datasets using machine learning algorithms. Deep learning methods are therefore designed with efficient hyperparameter sets to enhance the processing of the vast corpus. Different hyperparameter tuning models have been used previously in various studies. Still, tuning the deep learning models with the greatest possible number of hyperparameters has not yet been possible. This study developed a modified optimization methodology for effective hyperparameter identification, addressing the shortcomings of the previous studies. To get the optimum outcome, an enhanced genetic algorithm is used with modified crossover and mutation. The method has the ability to tune several hyperparameters simultaneously. The benchmark datasets for online reviews show outstanding results from the proposed methodology. The outcome demonstrates that the presented enhanced genetic algorithm-based hyperparameter tuning model performs better than other standard approaches with 88.73% classification accuracy, 87.31% sensitivity, 90.15% specificity, and 88.58% F-score value for the IMDB dataset and 92.17% classification accuracy, 91.89% sensitivity, 92.47% specificity, and 92.50% F-score value for the Yelp dataset while requiring less processing effort. To further enhance the performance, attention mechanism is applied to the designed model, achieving 89.62% accuracy, 88.59% sensitivity, 91.89% specificity, and 89.35% F-score with the IMDB dataset and 93.29% accuracy, 92.04% sensitivity, 93.22% specificity, and 92.98% F-score with the Yelp dataset.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"18 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive moving average Q-learning","authors":"Tao Tan, Hong Xie, Yunni Xia, Xiaoyu Shi, Mingsheng Shang","doi":"10.1007/s10115-024-02190-8","DOIUrl":"https://doi.org/10.1007/s10115-024-02190-8","url":null,"abstract":"<p>A variety of algorithms have been proposed to address the long-standing overestimation bias problem of Q-learning. Reducing this overestimation bias may lead to an underestimation bias, such as double Q-learning. However, it is still unclear how to make a good balance between overestimation and underestimation. We present a simple yet effective algorithm to fill in this gap and call Moving Average Q-learning. Specifically, we maintain two dependent Q-estimators. The first one is used to estimate the maximum expected Q-value. The second one is used to select the optimal action. In particular, the second estimator is the moving average of historical Q-values generated by the first estimator. The second estimator has only one hyperparameter, namely the moving average parameter. This parameter controls the dependence between the second estimator and the first estimator, ranging from independent to identical. Based on Moving Average Q-learning, we design an adaptive strategy to select the moving average parameter, resulting in AdaMA (<u>Ada</u>ptive <u>M</u>oving <u>A</u>verage) Q-learning. This adaptive strategy is a simple function, where the moving average parameter increases monotonically with the number of state–action pairs visited. Moreover, we extend AdaMA Q-learning to AdaMA DQN in high-dimensional environments. Extensive experiment results reveal why Moving Average Q-learning and AdaMA Q-learning can mitigate the overestimation bias, and also show that AdaMA Q-learning and AdaMA DQN outperform SOTA baselines drastically. In particular, when compared with the overestimated value of 1.66 in Q-learning, AdaMA Q-learning underestimates by 0.196, resulting in an improvement of 88.19%.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"372 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141939781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Community detection in social networks using machine learning: a systematic mapping study","authors":"Mahsa Nooribakhsh, Marta Fernández-Diego, Fernando González-Ladrón-De-Guevara, Mahdi Mollamotalebi","doi":"10.1007/s10115-024-02201-8","DOIUrl":"https://doi.org/10.1007/s10115-024-02201-8","url":null,"abstract":"<p>One of the important issues in social networks is the social communities which are formed by interactions between its members. Three types of community including overlapping, non-overlapping, and hidden are detected by different approaches. Regarding the importance of community detection in social networks, this paper provides a systematic mapping of machine learning-based community detection approaches. The study aimed to show the type of communities in social networks along with the algorithms of machine learning that have been used for community detection. After carrying out the steps of mapping and removing useless references, 246 papers were selected to answer the questions of this research. The results of the research indicated that unsupervised machine learning-based algorithms with 41.46% (such as <i>k</i> means) are the most used categories to detect communities in social networks due to their low processing overheads. On the other hand, there has been a significant increase in the use of deep learning since 2020 which has sufficient performance for community detection in large-volume data. With regard to the ability of NMI to measure the correlation or similarity between communities, with 53.25%, it is the most frequently used metric to evaluate the performance of community identifications. Furthermore, considering availability, low in size, and lack of multiple edge and loops, dataset Zachary’s Karate Club with 26.42% is the most used dataset for community detection research in social networks.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"53 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141939780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}