Journal of Intelligent Information Systems: latest articles, page 2

Machine learning approaches to predict the execution time of the meteorological simulation software COSMO

IF 3.4, Zone 3 (Computer Science)

Journal of Intelligent Information Systems Pub Date : 2024-08-31 DOI: 10.1007/s10844-024-00880-x

Allegra De Filippo, Emanuele Di Giacomo, Andrea Borghesi

Abstract: Predicting the execution time of weather forecast models is a complex task, since these models are usually performed on High Performance Computing systems that require large computing capabilities. Indeed, a reliable prediction can imply several benefits, by allowing for an improved planning of the model execution, a better allocation of available resources, and the identification of possible anomalies. However, to make such predictions is usually hard, since there is a scarcity of datasets that benchmark the existing meteorological simulation models. In this work, we focus on the runtime predictions of the execution of the COSMO (COnsortium for SMall-scale MOdeling) weather forecasting model used at the Hydro-Meteo-Climate Structure of the Regional Agency for the Environment and Energy Prevention Emilia-Romagna. We show how a plethora of Machine Learning approaches can obtain accurate runtime predictions of this complex model, by designing a new well-defined benchmark for this application task. Indeed, our contribution is twofold: 1) the creation of a large public dataset reporting the runtime of COSMO run under a variety of different configurations; 2) a comparative study of ML models, which greatly outperform the current state-of-practice used by the domain experts. This data collection represents an essential initial benchmark for this application field, and a useful resource for analyzing the model performance: better accuracy in runtime predictions could help facility owners to improve job scheduling and resource allocation of the entire system; while for a final user, a posteriori analysis could help to identify anomalous runs.

Citations: 0

Span-based semantic syntactic dual enhancement for aspect sentiment triplet extraction

IF 3.4, Zone 3 (Computer Science)

Journal of Intelligent Information Systems Pub Date : 2024-08-22 DOI: 10.1007/s10844-024-00881-w

Shuxia Ren, Zewei Guo, Xiaohan Li, Ruikun Zhong

Abstract: Aspect-Based Sentiment Triple Extraction (ASTE), a critical sub-task of Aspect-Based Sentiment Analysis (ABSA), has received extensive attention in recent years. ASTE aims to extract structured sentiment triples from texts, with most existing studies focusing on designing new strategic frameworks. Nonetheless, these methods often overlook the complex characteristics of linguistic expression and the deeper semantic nuances, leading to deficiencies in extracting the semantic representations of triples and effectively utilizing syntactic relationships in texts. To address these challenges, this paper introduces a span-based semantic and syntactic Dual-Enhanced model that deeply integrates rich syntactic information, such as part-of-speech tagging, constituent syntax, and dependency syntax structures. Specifically, we designed a semantic encoder and a syntactic encoder to capture the semantic-syntactic information closely related to the sentence’s underlying intent. Through a Feature Interaction Module, we effectively integrate information across different dimensions and promote a more comprehensive understanding of the relationships between aspects and opinions. We also adopted a span-based tagging scheme that generates more precise aspect sentiment triple extractions by exploring cross-level information and constraints. Experimental results on benchmark datasets derived from the SemEval challenge prove that our model significantly outperforms existing baselines.

Citations: 0

Tiramisù: making sense of multi-faceted process information through time and space

IF 3.4, Zone 3 (Computer Science)

Journal of Intelligent Information Systems Pub Date : 2024-08-14 DOI: 10.1007/s10844-024-00875-8

Anti Alman, Alessio Arleo, Iris Beerepoot, Andrea Burattin, Claudio Di Ciccio, Manuel Resinas

Abstract: Knowledge-intensive processes represent a particularly challenging scenario for process mining. The flexibility that such processes allow constitutes a hurdle as they are hard to capture in a single model. To tackle this problem, multiple visual representations of the same processes could be beneficial, each addressing different information dimensions according to the specific needs and background knowledge of the concrete process workers and stakeholders. In this paper, we propose, describe, and evaluate a framework, named Tiramisù, that leverages visual analytics for the interactive visualization of multi-faceted process information, aimed at supporting the investigation and insight generation of users in their process analysis tasks. Tiramisù is based on a multi-layer visualization methodology that includes a visual backdrop that provides context and an arbitrary number of superimposed and on-demand dimension layers. This arrangement allows our framework to display process information from different perspectives and to project this information onto a domain-friendly representation of the context in which the process unfolds. We provide an in-depth description of the approach’s founding principles, deeply rooted in visualization research, that justify our design choices for the whole framework. We demonstrate the feasibility of the framework through its application in two use-case scenarios in the context of healthcare and personal information management. Plus, we conducted qualitative evaluations with potential end users of both scenarios, gathering precious insights about the efficacy and applicability of our framework to various application domains.

Citations: 0

Learning recommendations from educational event data in higher education

IF 3.4, Zone 3 (Computer Science)

Journal of Intelligent Information Systems Pub Date : 2024-08-13 DOI: 10.1007/s10844-024-00873-w

Gyunam Park, Lukas Liss, Wil M. P. van der Aalst

Citations: 0

Temporal knowledge completion enhanced self-supervised entity alignment

IF 3.4, Zone 3 (Computer Science)

Journal of Intelligent Information Systems Pub Date : 2024-08-13 DOI: 10.1007/s10844-024-00878-5

Teng Fu, Gang Zhou

Abstract: Temporal graph entity alignment aims at finding the equivalent entity pairs across different temporal knowledge graphs (TKGs). Existing methods mainly utilize a time-aware and relationship-aware approach to embed and align. However, the existence of long-tail entities in TKGs still restricts the accuracy of alignment, as their limited neighborhood information makes it hard to obtain high-quality embeddings, and hence would impact the efficiency of entity alignment in representation space. Moreover, most previous research is supervised, with heavy dependence on seed labels for alignment, restricting its applicability in scenarios with limited resources. To tackle these challenges, we propose a Temporal Knowledge Completion enhanced Self-supervised Entity Alignment (TSEA). We argue that, with high-quality embeddings, the entities would be aligned in a self-supervised manner. To this end, TSEA is constituted of two modules. A graph completion module predicts the missing links for the long-tailed entities. With the improved graph, TSEA further incorporates a self-supervised entity alignment module to achieve unsupervised alignment. Experimental results on widely adopted benchmarks demonstrate improved performance compared to several recent baseline methods. Additional ablation experiments further corroborate the efficacy of the proposed modules.

Citations: 0

Improved machine learning technique for feature reduction and its application in spam email detection

IF 3.4, Zone 3 (Computer Science)

Journal of Intelligent Information Systems Pub Date : 2024-08-07 DOI: 10.1007/s10844-024-00870-z

Ahmed A. Ewees, Marwa A. Gaheen, Mohammed M. Alshahrani, Ahmed M. Anter, Fatma H. Ismail

Abstract: This paper introduces MPAG, a new feature selection method aimed at overcoming the limitations of the conventional Marine Predators Algorithm (MPA). The MPA may experience stagnation and become trapped in local optima during optimization. To address this challenge, we propose a refined version of the MPA, termed MPAG, which incorporates the Local Escape Operator (LEO) from the gradient-based optimizer (GBO). By leveraging the LEO operator, MPAG enhances the exploration ability of the MPA, particularly during the initial one-third of iterations. This enhancement injects more diversity into populations, thereby improving the process of search space discovery and mitigating the risk of premature convergence. The performance of MPAG is evaluated on 14 feature selection benchmark datasets, employing seven performance measures including fitness value, classification accuracy, and selected features. Our findings indicate that MPAG outperforms other algorithms in 86% of the datasets, underscoring its capability to select the most relevant features across various datasets while maintaining stability. Additionally, MPAG is evaluated using two cybersecurity applications, specifically spam detection datasets, where it demonstrates superior performance across most performance measures compared to other methods.

Citations: 0

Joint entity and relation extraction with fusion of multi-feature semantics

IF 3.4, Zone 3 (Computer Science)

Journal of Intelligent Information Systems Pub Date : 2024-08-01 DOI: 10.1007/s10844-024-00871-y

Ting Wang, Wenjie Yang, Tao Wu, Chuan Yang, Jiaying Liang, Hongyang Wang, Jia Li, Dong Xiang, Zheng Zhou

Abstract: Entity relation extraction is a key technology for extracting structured information from unstructured text and serves as the foundation for building large-scale knowledge graphs. Current joint entity relation extraction methods primarily focus on improving the recognition of overlapping triplets to enhance the overall performance of the model. However, the model still faces numerous challenges in managing intra-triplet and inter-triplet interactions, expanding the breadth of semantic encoding, and reducing information redundancy during the extraction process. These issues make it challenging for the model to achieve satisfactory performance in both normal and overlapping triple extraction. To address these challenges, this study proposes a comprehensive prediction network that includes multi-feature semantic fusion. We have developed a semantic fusion module that integrates entity mask embedding sequences, which enhance connections between entities, and context embedding sequences that provide richer semantic information, to enhance inter-triplet interactions and expand semantic encoding. Subsequently, we use a parallel decoder to simultaneously generate a set of triplets, improving the interaction between them. Additionally, we utilize an entity mask sequence to finely prune these triplets, optimizing the final set of triplets. Experimental results on the publicly available datasets NYT and WebNLG demonstrate that, with BERT as the encoder, our model outperforms the baseline model in terms of accuracy and F1 score.

Citations: 0

SESAME - self-supervised framework for extractive question answering over document collections

IF 3.4, Zone 3 (Computer Science)

Journal of Intelligent Information Systems Pub Date : 2024-07-30 DOI: 10.1007/s10844-024-00869-6

Vitor A. Batista, Diogo S. M. Gomes, Alexandre Evsukoff

Abstract: Question Answering is one of the most relevant areas in the field of Natural Language Processing, rapidly evolving with promising results due to the increasing availability of suitable datasets and the advent of new technologies, such as Generative Models. This article introduces SESAME, a Self-supervised framework for Extractive queStion Answering over docuMent collEctions. SESAME aims to enhance open-domain question answering systems (ODQA) by leveraging domain adaptation with synthetic datasets, enabling efficient question answering over private document collections with low resource usage. The framework incorporates recent advances with large language models, and an efficient hybrid method for context retrieval. We conducted several sets of experiments with the Machine Reading for Question Answering (MRQA) 2019 Shared Task datasets, FAQuAD - a Brazilian Portuguese reading comprehension dataset, Wikipedia, and Retrieval-Augmented Generation Benchmark, to demonstrate SESAME’s effectiveness. The results indicate that SESAME’s domain adaptation using synthetic data significantly improves QA performance, generalizes across different domains and languages, and competes with or surpasses state-of-the-art systems in ODQA. Finally, SESAME is an open-source tool, and all code, datasets and experimental data are available for public use in our repository.

Citations: 0

Enhancing data preparation: insights from a time series case study

IF 3.4, Zone 3 (Computer Science)

Journal of Intelligent Information Systems Pub Date : 2024-07-25 DOI: 10.1007/s10844-024-00867-8

Camilla Sancricca, Giovanni Siracusa, Cinzia Cappiello

Abstract: Data play a key role in AI systems that support decision-making processes. Data-centric AI highlights the importance of having high-quality input data to obtain reliable results. However, preparing data well for machine learning is becoming difficult due to the variety of data quality issues and available data preparation tasks. For this reason, approaches that help users in performing this demanding phase are needed. This work proposes DIANA, a framework for data-centric AI to support data exploration and preparation, suggesting suitable cleaning tasks to obtain valuable analysis results. We design an adaptive self-service environment that can handle the analysis and preparation of different types of sources, i.e., tabular and streaming data. The central component of our framework is a knowledge base that collects evidence related to the effectiveness of the data preparation actions along with the type of input data and the considered machine learning model. In this paper, we first describe the framework, the knowledge base model, and its enrichment process. Then, we show the experiments conducted to enrich the knowledge base in a particular case study: time series data streams.

Citations: 0

Multi-task learning and mutual information maximization with crossmodal transformer for multimodal sentiment analysis

IF 3.4, Zone 3 (Computer Science)

Journal of Intelligent Information Systems Pub Date : 2024-07-10 DOI: 10.1007/s10844-024-00858-9

Yang Shi, Jinglang Cai, Lei Liao

Abstract: The effectiveness of multimodal sentiment analysis hinges on the seamless integration of information from diverse modalities, where the quality of modality fusion directly influences sentiment analysis accuracy. Prior methods often rely on intricate fusion strategies, elevating computational costs and potentially yielding inaccurate multimodal representations due to distribution gaps and information redundancy across heterogeneous modalities. This paper centers on the backpropagation of loss and introduces a Transformer-based model called Multi-Task Learning and Mutual Information Maximization with Crossmodal Transformer (MMMT). Addressing the issue of inaccurate multimodal representation for MSA, MMMT effectively combines mutual information maximization with crossmodal Transformer to convey more modality-invariant information to multimodal representation, fully exploring modal commonalities. Notably, it utilizes multi-modal labels for uni-modal training, presenting a fresh perspective on multi-task learning in MSA. Comparative experiments on the CMU-MOSI and CMU-MOSEI datasets demonstrate that MMMT improves model accuracy while reducing computational burden, making it suitable for resource-constrained and real-time performance-requiring application scenarios. Additionally, ablation experiments validate the efficacy of multi-task learning and probe the specific impact of combining mutual information maximization with Transformer in MSA.

Citations: 0