Information Systems: Latest Articles

Learning to resolve inconsistencies in qualitative constraint networks
IF 3, Zone 2, Computer Science
Information Systems Pub Date : 2025-04-18 DOI: 10.1016/j.is.2025.102557
Anastasia Paparrizou, Michael Sioutis
Abstract: In this paper, we present a reinforcement learning approach for resolving inconsistencies in qualitative constraint networks (QCNs). QCNs are typically used in constraint programming to represent and reason about intuitive spatial or temporal relations like x {is inside of ∨ overlaps} y. Naturally, QCNs are not immune to uncertainty, noise, or imperfect data that may be present in information, and thus, more often than not, they are hampered by inconsistencies. We propose a multi-armed bandit approach that defines a well-suited ordering of constraints for finding a maximal satisfiable subset of them. Specifically, our learning approach interacts with a solver, and after each trial a reward is returned to measure the performance of the selected action (constraint addition). The reward function is based on the reduction of the solution space of a consistent reconstruction of the input QCN. Experimental results with different bandit policies and various rewards that are obtained by our algorithm suggest that we can do better than the state of the art in terms of both effectiveness, viz., lower number of repairs obtained for an inconsistent QCN, and efficiency, viz., faster runtime.
Volume 133, Article 102557
Citations: 0
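As a rough illustration of the bandit idea in the Paparrizou and Sioutis abstract, the sketch below treats each constraint as an arm and learns which constraints to try adding first. It is not the authors' algorithm: the "solver" is a toy stand-in in which each constraint is a set of allowed values and a subset is satisfiable when the intersection is non-empty, the policy is plain epsilon-greedy, and the reward (fraction of the solution space that survives the addition, 0 on rejection) is an assumption. The name `bandit_order` and all parameters are hypothetical.

```python
import random

def bandit_order(constraints, universe, episodes=200, epsilon=0.1, seed=0):
    """Epsilon-greedy search for a large satisfiable subset of constraints.

    Toy model (an assumption, not a QCN solver): constraints[i] is a set of
    allowed values, and a subset is satisfiable if its intersection is non-empty.
    """
    rng = random.Random(seed)
    n = len(constraints)
    value = [0.0] * n   # running mean reward per arm (one arm per constraint)
    pulls = [0] * n
    best_kept = []
    for _ in range(episodes):
        solutions = set(universe)
        remaining = list(range(n))
        kept = []
        while remaining:
            if rng.random() < epsilon:
                arm = rng.choice(remaining)                       # explore
            else:
                arm = max(remaining, key=lambda i: value[i])      # exploit
            remaining.remove(arm)
            narrowed = solutions & constraints[arm]
            # Reward: share of the solution space left after the addition,
            # which is 0 when the constraint makes the network unsatisfiable.
            reward = len(narrowed) / len(solutions)
            pulls[arm] += 1
            value[arm] += (reward - value[arm]) / pulls[arm]
            if narrowed:                                          # keep only consistent additions
                solutions = narrowed
                kept.append(arm)
        if len(kept) > len(best_kept):
            best_kept = kept
    return best_kept

constraints = [{0, 1}, {1, 2}, {5, 6}, {1, 3}]
kept = bandit_order(constraints, universe=range(8))
```

Here constraint 2 conflicts with all the others, so the largest satisfiable subset consists of constraints 0, 1, and 3; dropping constraint 2 is the single "repair".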
Incremental checking of SQL assertions in an RDBMS
IF 3, Zone 2, Computer Science
Information Systems Pub Date : 2025-04-16 DOI: 10.1016/j.is.2025.102550
Xavier Oriol, Ernest Teniente
Abstract: The notion of an SQL assertion was introduced in the SQL-92 standard to define general constraints over a relational database. Assertions can be used, for instance, to specify cross-row constraints or multitable check constraints. However, up to now, none of the current relational database management systems (RDBMSs) support SQL assertions due to the difficulty of providing an efficient solution.
To implement SQL assertions efficiently, an RDBMS requires an incremental checking mechanism. That is, given an assertion, the RDBMS should revalidate it only when a transaction changes data in a manner that could violate it, and only for the affected data. Some years ago, the deductive database community provided several incremental checking methods; however, their results have not made it into practice in RDBMSs.
In this paper, we propose an approach to efficiently implement SQL assertions in an RDBMS through an incremental revalidation technique. Such an approach is compatible with any RDBMS since it is fully based on standard SQL concepts (tables, triggers, and procedures). Our proposal uses and extends the Event Rules, an existing proposal for incremental checking in deductive databases. This extension is required to handle distributive aggregates, which pushes the expressiveness of the handled SQL assertions beyond first-order constraints. Moreover, we exploit this extension to improve the treatment of constraints involving existential variables, which are a very common kind of constraint that is difficult and expensive to handle. Finally, we show the efficiency of our approach through some experiments, and we formally prove its soundness and completeness.
Volume 133, Article 102550
Citations: 0
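To make the idea of incremental checking in the Oriol and Teniente abstract concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It does not implement the Event Rules; it only shows the general pattern of emulating an assertion with a trigger that fires on one event that could violate it and inspects only the affected data. The schema, the assertion ("no department exceeds `max_staff` employees"), and the helper `try_hire` are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (id INTEGER PRIMARY KEY, max_staff INTEGER NOT NULL);
CREATE TABLE emp  (id INTEGER PRIMARY KEY,
                   dept_id INTEGER NOT NULL REFERENCES dept(id));

-- Assertion (informally): no department has more employees than max_staff.
-- This sketch covers the INSERT event only, and checks just the department
-- of the new row. Updates to emp.dept_id or dept.max_staff could also
-- violate the assertion and would need their own triggers.
CREATE TRIGGER emp_insert_check
AFTER INSERT ON emp
WHEN (SELECT COUNT(*) FROM emp WHERE dept_id = NEW.dept_id)
     > (SELECT max_staff FROM dept WHERE id = NEW.dept_id)
BEGIN
    SELECT RAISE(ABORT, 'assertion violated: department over capacity');
END;
""")

def try_hire(emp_id, dept_id):
    """Insert an employee; return False if the trigger rejects the insert."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO emp (id, dept_id) VALUES (?, ?)",
                         (emp_id, dept_id))
        return True
    except sqlite3.DatabaseError:
        return False

with conn:
    conn.execute("INSERT INTO dept (id, max_staff) VALUES (1, 2)")

results = [try_hire(emp_id, 1) for emp_id in (10, 11, 12)]
staff_count = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
```

The third insert is rejected because it would push department 1 past its limit of two, and the check never scans departments that the transaction did not touch.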
A high-accuracy unsupervised statistical learning method for joint dangling entity detection and entity alignment
IF 3, Zone 2, Computer Science
Information Systems Pub Date : 2025-04-11 DOI: 10.1016/j.is.2025.102554
Cong Xu, Mengxin Shi, Xiang Gao, Zhongkang Yin, Xiujuan Yao, Wei Li, Jiasen Yang
Abstract: Dangling entities are common in knowledge graphs, but there is a lack of research on entity alignment involving them. Most existing studies leverage neural network methods through supervised learning. However, these data-driven methods suffer from poor interpretability and high computation overhead. In this paper, we propose a Simple Unsupervised Dangling entity detection and entity Alignment method (SUDA) without employing neural networks. Our method consists of three modules: entity embedding, dangling entity detection, and entity alignment. While the state-of-the-art Simple but Effective Unsupervised entity alignment method (SEU) is incapable of dealing with dangling entities, SUDA further extends it and addresses the bilateral dangling entities problem. Theoretical proof of our method is given. We also design a new adjacency matrix for incorporating richer entity relations. Then we construct entity similarity outlier intervals to detect dangling entities and, after removing them, align the remaining entities by solving an assignment problem. Extensive experiments demonstrate that our method outperforms existing supervised and unsupervised methods. Additionally, in the entity alignment tasks, SUDA consumes less runtime compared to neural network methods, while maintaining high efficiency, interpretability, and stability. Code is available at https://github.com/skyccong/SUDA.git.
Volume 133, Article 102554
Citations: 0
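The two-step pipeline in the SUDA abstract (flag dangling entities through similarity outlier intervals, then align the rest as an assignment problem) can be sketched as follows. This is not the SUDA code: the outlier interval here is a Tukey fence on each source entity's best similarity score, which is an assumption, the similarity matrix is made up, and the assignment is solved by brute force, which is only viable for tiny inputs. The function names are hypothetical.

```python
from itertools import permutations
from statistics import quantiles

def detect_dangling(sim):
    """Flag source entities whose best similarity is an unusually low outlier."""
    best = [max(row) for row in sim]
    q1, _, q3 = quantiles(best, n=4)
    lower_fence = q1 - 1.5 * (q3 - q1)   # Tukey fence (assumed outlier rule)
    return [i for i, score in enumerate(best) if score < lower_fence]

def align(sim, sources):
    """Assign each remaining source to a distinct target, maximizing total similarity."""
    n_targets = len(sim[0])
    best_total, best_perm = float("-inf"), None
    for perm in permutations(range(n_targets), len(sources)):
        total = sum(sim[s][t] for s, t in zip(sources, perm))
        if total > best_total:
            best_total, best_perm = total, perm
    return dict(zip(sources, best_perm))

# Made-up similarities: rows are source entities, columns are target entities.
sim = [
    [0.92, 0.10, 0.15, 0.12, 0.11],
    [0.14, 0.90, 0.13, 0.10, 0.12],
    [0.11, 0.12, 0.94, 0.13, 0.10],
    [0.10, 0.15, 0.12, 0.91, 0.14],
    [0.13, 0.11, 0.10, 0.12, 0.93],
    [0.20, 0.18, 0.15, 0.17, 0.19],   # no good counterpart: likely dangling
]
dangling = detect_dangling(sim)
matched = align(sim, [i for i in range(len(sim)) if i not in dangling])
```

For realistic graph sizes the brute-force search would be replaced by a polynomial-time assignment solver such as the Hungarian algorithm.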
Hierarchical aspect-based sentiment analysis using semantic capsuled multi-granular networks
IF 3, Zone 2, Computer Science
Information Systems Pub Date : 2025-04-03 DOI: 10.1016/j.is.2025.102556
Jeffin Gracewell, A. Arul Edwin Raj, C.T. Kalaivani, Renugadevi R
Abstract: In the ever-evolving domain of sentiment analysis, discerning intricate sentiments towards specific aspects and their sub-components within textual data has become pivotal. This paper introduces the Semantic Capsuled Hierarchical Multi-Granular Network (SCH-MGN) model, an innovative approach explicitly designed for aspect-based sentiment analysis (ABSA) challenges. The SCH-MGN model is primed to evaluate sentiments at both macro (broader topics) and micro (detailed sub-aspects) hierarchical levels, offering a comprehensive sentiment evaluation spectrum. By integrating mechanisms like the Semantic Knowledge Graph Attention Network (SKG-AN) for targeted aspect extraction, Hierarchical Embedding Layers leveraging Multilingual BERT (mBERT), and advanced neural architectures including Recurrent Neural Networks (RNNs) and Temporal Convolutional Networks (TCNs), the model ensures a nuanced sentiment interpretation. The paper provides a meticulous dissection of the model's methodology, from tokenization and embedding to detailed sentiment extraction, accentuating its capability to offer granular sentiment interpretations. Empirical illustrations validate the model's proficiency in handling compound sentiments, cementing its potential as an indispensable tool for businesses, reviewers, and analysts. This groundbreaking approach to ABSA promises to redefine the granularity with which we understand and evaluate textual sentiments in diverse domains.
Volume 132, Article 102556
Citations: 0
When GDD meets GNN: A knowledge-driven neural connection for effective entity resolution in property graphs
IF 3, Zone 2, Computer Science
Information Systems Pub Date : 2025-03-22 DOI: 10.1016/j.is.2025.102551
Junwei Hu, Michael Bewong, Selasi Kwashie, Yidi Zhang, Vincent Nofong, John Wondoh, Zaiwen Feng
Abstract: This paper studies the entity resolution (ER) problem in property graphs. ER is the task of identifying and linking different records that refer to the same real-world entity. It is commonly used in data integration, data cleansing, and other applications where it is important to have accurate and consistent data. In general, two predominant approaches exist in the literature: rule-based and learning-based methods. On the one hand, rule-based techniques are often desired due to their explainability and ability to encode domain knowledge. Learning-based methods, on the other hand, are preferred due to their effectiveness in spite of their black-box nature. In this work, we devise a hybrid ER solution, GraphER, that leverages the strengths of both systems for property graphs. In particular, we adopt graph differential dependency (GDD) for encoding the so-called record-matching rules, and employ them to guide a graph neural network (GNN) based representation learning for the task. We conduct extensive empirical evaluation of our proposal on benchmark ER datasets including 17 graph datasets and 7 relational datasets in comparison with 10 state-of-the-art (SOTA) techniques. The results show that our approach provides a significantly better solution to addressing ER in graph data, both quantitatively and qualitatively, while attaining highly competitive results on the benchmark relational datasets w.r.t. the SOTA solutions.
Volume 132, Article 102551
Citations: 0
A JSON document algebra for query optimization
IF 3, Zone 2, Computer Science
Information Systems Pub Date : 2025-03-19 DOI: 10.1016/j.is.2025.102537
Tomas Llano-Rios, Mohamed Khalefa, Antonio Badia
Abstract: Due to the popularity of JSON, several systems have been developed that store data in collections of JSON documents. Each system has developed its own query language, sometimes in an ad-hoc manner. This makes it difficult to formally define and analyze query optimization techniques. We propose an algebra tailored to JSON documents. First, we argue that JSON is different from nested relations and XML and therefore requires its own solution. Then, we propose an algebra on 3 levels: the first level defines operators to manipulate individual documents, providing an abstraction over different serializations. The second level provides operators over collections of JSON documents, while the third level also defines collection operators that are not primitive but enable direct and efficient implementation of data manipulation operations. We provide a number of properties of the algebraic operators which provide a solid basis for query optimization.
Volume 132, Article 102537
Citations: 0
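The layering described in the Llano-Rios, Khalefa, and Badia abstract (document-level operators underneath collection-level operators, with algebraic properties that justify plan rewrites) can be illustrated with a small sketch. The operator names and semantics below are hypothetical and are not the paper's algebra; they only show the kind of equivalence an optimizer could exploit, here that a selection commutes with a projection when the predicate reads only projected fields.

```python
def get_path(doc, path):
    """Document-level operator: follow a dotted path, returning None if absent."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current

def project(doc, keys):
    """Document-level operator: keep only the listed top-level keys."""
    return {k: doc[k] for k in keys if k in doc}

def select(collection, predicate):
    """Collection-level operator: keep the documents satisfying the predicate."""
    return [doc for doc in collection if predicate(doc)]

def project_all(collection, keys):
    """Collection-level operator: apply the document-level projection to each document."""
    return [project(doc, keys) for doc in collection]

def is_large(doc):
    return doc.get("total", 0) > 50   # reads only the "total" field

orders = [
    {"id": 1, "customer": {"city": "Lyon"}, "total": 40},
    {"id": 2, "customer": {"city": "Oslo"}, "total": 75},
    {"id": 3, "total": 90},           # documents need not share a schema
]

# Two equivalent plans: filter then project, or project then filter.
plan_a = project_all(select(orders, is_large), ["id", "total"])
plan_b = select(project_all(orders, ["id", "total"]), is_large)
```

Both plans return the same documents, so an optimizer is free to pick whichever ordering is cheaper for the storage layout at hand.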
The effects of data quality on machine learning performance on tabular data
IF 3, Zone 2, Computer Science
Information Systems Pub Date : 2025-03-14 DOI: 10.1016/j.is.2025.102549
Sedir Mohammed, Lukas Budach, Moritz Feuerpfeil, Nina Ihde, Andrea Nathansen, Nele Noack, Hendrik Patzlaff, Felix Naumann, Hazar Harmouch
Abstract: Modern artificial intelligence (AI) applications require large quantities of training and test data. This need creates critical challenges not only concerning the availability of such data, but also regarding its quality. For example, incomplete, erroneous, or inappropriate training data can lead to unreliable models that produce ultimately poor decisions. Trustworthy AI applications require high-quality training and test data along many quality dimensions, such as accuracy, completeness, and consistency.
We explore empirically the relationship between six data quality dimensions and the performance of 19 popular machine learning algorithms covering the tasks of classification, regression, and clustering, with the goal of explaining their performance in terms of data quality. Our experiments distinguish three scenarios based on the AI pipeline steps that were fed with polluted data: polluted training data, test data, or both. We conclude the paper with an extensive discussion of our observations.
Volume 132, Article 102549
Citations: 0
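The three experimental scenarios named in the Mohammed et al. abstract (polluted training data, polluted test data, or both) can be set up along the following lines. This is only a sketch under assumptions: it pollutes a single quality dimension, completeness, by blanking cells at random, and the function names, the `fraction` parameter, and the seeds are all hypothetical rather than taken from the paper.

```python
import random

def inject_missing(rows, fraction, seed=0):
    """Return a copy of rows with roughly `fraction` of the cells replaced by None."""
    rng = random.Random(seed)
    return [[None if rng.random() < fraction else cell for cell in row]
            for row in rows]

def make_scenarios(train, test, fraction):
    """Build the three pollution scenarios as (train, test) pairs."""
    return {
        "polluted_train": (inject_missing(train, fraction), test),
        "polluted_test": (train, inject_missing(test, fraction, seed=1)),
        "polluted_both": (inject_missing(train, fraction),
                          inject_missing(test, fraction, seed=1)),
    }

train = [[1.0, "a"], [2.0, "b"], [3.0, "c"]]
test = [[4.0, "d"], [5.0, "e"]]
scenarios = make_scenarios(train, test, fraction=0.3)
```

Each scenario would then be fed to the same learning algorithm so that differences in its score can be attributed to where the pollution entered the pipeline.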
Process mining over sensor data: Goal recognition for powered transhumeral prostheses
IF 3, Zone 2, Computer Science
Information Systems Pub Date : 2025-03-06 DOI: 10.1016/j.is.2025.102540
Zihang Su, Tianshi Yu, Artem Polyvyanyy, Ying Tan, Nir Lipovetzky, Sebastian Sardiña, Nick van Beest, Alireza Mohammadi, Denny Oetomo
Abstract: Process mining (PM)-based goal recognition (GR) techniques, which infer goals or targets based on sequences of observed actions, have shown efficacy in real-world engineering applications. This study explores the applicability of PM-based GR in identifying target poses for users employing powered transhumeral prosthetics. These prosthetics are designed to restore missing anatomical segments below the shoulder, including the hand. In this article, we aim to apply the GR techniques to identify the intended movements of users, enabling the motors on the powered transhumeral prosthesis to execute the desired motions precisely. In this way, a powered transhumeral prosthesis can assist individuals with disabilities in completing movement tasks. PM-based GR techniques were initially designed to infer goals from sequences of observed actions, where discrete event names represent actions. However, the electromyography electrodes and kinematic sensors on powered transhumeral prosthetic devices register sequences of continuous, real-valued data measurements. Therefore, we rely on methods to transform sensor data into discrete events and integrate these methods with the PM-based GR system to develop target pose recognition approaches. Two data transformation approaches are introduced. The first approach relies on the clustering of data measurements collected before the target pose is reached (the clustering approach). The second approach uses the time series of measurements collected during dynamic user movement to perform linear discriminant analysis (LDA) classification and identify discrete events (the dynamic LDA approach). These methods are evaluated through offline and human-in-the-loop (online) experiments and compared with established techniques, such as static LDA, an LDA classification based on data collected at static target poses, and GR approaches based on neural networks. Real-time human-in-the-loop experiments further validate the effectiveness of the proposed methods, demonstrating that PM-based GR using the dynamic LDA classifier achieves superior F1 score and balanced accuracy compared to state-of-the-art techniques.
Volume 132, Article 102540
Citations: 0
Resource allocation in business process executions—A systematic literature study
IF 3, Zone 2, Computer Science
Information Systems Pub Date : 2025-03-05 DOI: 10.1016/j.is.2025.102541
Luise Pufahl, Fabian Stiehle, Sven Ihde, Mathias Weske, Ingo Weber
Abstract: To achieve their goals, organizations execute business processes, which require effective allocation of resources to process activities. This results in the decision-making problem: Which resources should be allocated to which process activities? This problem significantly impacts both process efficiency and effectiveness. Over the past decades, various system-initiated (largely automated) resource allocation approaches have been developed. This study presents a comprehensive overview of this field by analyzing 61 primary studies identified through a rigorous, structured literature review covering publications from 1995 to 2023. We investigate resource allocation goals and cardinalities and describe how process models, execution data, and task attributes, as well as resource attributes, are used to specify the resource allocation problem. Additionally, the type of algorithmic solution and evaluation methods are discussed. This study shows that most approaches support 1-to-1 allocation cardinalities only, specify process-oriented goals, focus on process models, and utilize rule-based methods. Based on the results, we call for future research to define common terminology, support evidence-oriented resource allocation and adaptability, and improve reproducibility and comparability by performing benchmarking studies.
Volume 132, Article 102541
Citations: 0
Context-aware automated ICD coding: A semantic-driven approach
IF 3, Zone 2, Computer Science
Information Systems Pub Date : 2025-03-04 DOI: 10.1016/j.is.2025.102539
O.K. Reshma, N. Saleena, K.A. Abdul Nazeer
Abstract: Identifying the exact International Classification of Diseases (ICD) codes describing a patient's health condition is essential in classifying patients with similar disease conditions. Numerous studies have devised automated approaches to retrieve the ICD codes from patients' health records. However, the majority of these methodologies have considered ICD codes solely as alphanumeric codes, overlooking their descriptions and thus neglecting the inherent semantics. Also, these methodologies overlook the one-to-many semantic relationships between diagnosis and assigned ICD code descriptions. Consequently, these approaches are constrained in assigning ICD codes with meaningful context. This work addresses these limitations by capturing the semantic similarity between the diagnosis and ICD code descriptions, while utilising the inherent one-to-many relationships between them, to accurately assign ICD codes. For this, we formulate the ICD coding problem as a Semantic Text Similarity task. The proposed approach uses a siamese stacked Bi-LSTM network to learn context-aware representations of diagnoses and ICD code descriptions. We transform each patient-visit data into sentence pairs by considering the one-to-many relationships between diagnosis and assigned ICD code descriptions. Further, we compute their semantic similarity and classify them as similar or dissimilar. The proposed approach was evaluated using 5-fold cross-validation on the MIMIC-III dataset and achieved the highest evaluation metric scores (F1-score 0.66, precision 0.67, recall 0.84) compared with other sequential models. The per-label evaluation demonstrates the performance of the proposed approach for each ICD code. Furthermore, the proposed approach outperformed several existing attention-based models, demonstrating the potential use of semantics in automated ICD coding.
Volume 132, Article 102539
Citations: 0
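The step in the Reshma, Saleena, and Abdul Nazeer abstract that turns each patient visit into sentence pairs can be pictured with the sketch below. It covers only the pair construction, not the siamese Bi-LSTM, and it rests on assumptions: the record layout, the placeholder codes `C1` to `C3`, their descriptions, and the choice to use every unassigned code as a negative example are all invented for illustration and are not taken from the paper or from MIMIC-III.

```python
def build_pairs(visits, code_descriptions):
    """Expand each visit into (diagnosis, code description, label) triples.

    One diagnosis maps to many code descriptions: label 1 for codes assigned
    to the visit, label 0 for all other codes (an assumed negative sampling rule).
    """
    pairs = []
    for visit in visits:
        assigned = set(visit["codes"])
        for code, description in code_descriptions.items():
            label = 1 if code in assigned else 0
            pairs.append((visit["diagnosis"], description, label))
    return pairs

# Placeholder codes and descriptions, not real ICD entries.
code_descriptions = {
    "C1": "heart failure",
    "C2": "high blood pressure",
    "C3": "fracture of the wrist",
}
visits = [
    {"diagnosis": "shortness of breath with elevated blood pressure",
     "codes": ["C1", "C2"]},
]
pairs = build_pairs(visits, code_descriptions)
```

A similarity model would then be trained to score the label-1 pairs as similar and the label-0 pairs as dissimilar.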