Methods of Information in Medicine最新文献_第2页

Deciphering Abbreviations in Malaysian Clinical Notes Using Machine Learning. 使用机器学习破译马来西亚临床记录中的缩写。

IF 1.3 4区医学

Methods of Information in Medicine Pub Date : 2025-02-11 DOI: 10.1055/a-2521-4372

Ismat Mohd Sulaiman, Awang Bulgiba, Sameem Abdul Kareem, Abdul Aziz Latip

{"title":"Deciphering Abbreviations in Malaysian Clinical Notes Using Machine Learning.","authors":"Ismat Mohd Sulaiman, Awang Bulgiba, Sameem Abdul Kareem, Abdul Aziz Latip","doi":"10.1055/a-2521-4372","DOIUrl":"10.1055/a-2521-4372","url":null,"abstract":"Objective: This is the first Malaysian machine learning model to detect and disambiguate abbreviations in clinical notes. The model has been designed to be incorporated into MyHarmony, a natural language processing system, that extracts clinical information for health care management. The model utilizes word embedding to ensure feasibility of use, not in real-time but for secondary analysis, within the constraints of low-resource settings.Methods: A Malaysian clinical embedding, based on Word2Vec model, was developed using 29,895 electronic discharge summaries. The embedding was compared against conventional rule-based and FastText embedding on two tasks: abbreviation detection and abbreviation disambiguation. Machine learning classifiers were applied to assess performance.Results: The Malaysian clinical word embedding contained 7 million word tokens, 24,352 unique vocabularies, and 100 dimensions. For abbreviation detection, the Decision Tree classifier augmented with the Malaysian clinical embedding showed the best performance (F-score of 0.9519). For abbreviation disambiguation, the classifier with the Malaysian clinical embedding had the best performance for most of the abbreviations (F-score of 0.9903).Conclusion: Despite having a smaller vocabulary and dimension, our local clinical word embedding performed better than the larger nonclinical FastText embedding. Word embedding with simple machine learning algorithms can decipher abbreviations well. It also requires lower computational resources and is suitable for implementation in low-resource settings such as Malaysia. The integration of this model into MyHarmony will improve recognition of clinical terms, thus improving the information generated for monitoring Malaysian health care services and policymaking.","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":" ","pages":""},"PeriodicalIF":1.3,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143025162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

The Significance of Information Quality for the Secondary Use of the Information in the National Health Care Quality Registers in Finland. 信息质量对芬兰国家医疗保健质量登记信息二次利用的重要性。

IF 1.3 4区医学

Methods of Information in Medicine Pub Date : 2025-01-29 DOI: 10.1055/a-2511-7866

Anna Frondelius, Ulla-Mari Kinnunen, Vesa Jormanainen

{"title":"The Significance of Information Quality for the Secondary Use of the Information in the National Health Care Quality Registers in Finland.","authors":"Anna Frondelius, Ulla-Mari Kinnunen, Vesa Jormanainen","doi":"10.1055/a-2511-7866","DOIUrl":"10.1055/a-2511-7866","url":null,"abstract":"Background: The aim of the national health care quality registers is to monitor, assess, and improve the quality of care. The information utilized in quality registers must be of high quality to ensure that the information produced by the registers is reliable and useful. In Finland, one of the key sources of information for the quality registers is the national Kanta services.Objectives: The objective of the study was to increase understanding of the significance of information quality for the secondary use of the information in the national health care quality registers and to provide information on whether the information quality of the national Kanta services supports the information needs of the national quality registers, and how information quality should be developed.Methods: The research data were collected by interviewing six experts responsible for national health care quality registers, and it was analyzed using theory-driven qualitative content analysis based on the DeLone and McLean model.Results: Based on the results, the relevance of the information in the Kanta services met the information needs of the national quality registers. However, due to the limited amount of structured information and deficiencies in the completeness of the information, relevant information could not be fully utilized. Deficiencies in information quality posed challenges in information retrieval and hindered drawing conclusions in reporting. Challenges in information quality did not diminish the intention to use the information when information was considered relevant. Solutions to improve information quality included structuring, development of documentation practices, patient information systems and quality assurance, as well as collaboration among stakeholders.Conclusion: The Kanta services' information is relevant for the national health care quality registers, but developing the quality of the information, especially in terms of structures and completeness, is the key to fully enabling the secondary use of this information.","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":" ","pages":""},"PeriodicalIF":1.3,"publicationDate":"2025-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142957856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Cross-lingual Natural Language Processing on Limited Annotated Case/Radiology Reports in English and Japanese: Insights from the Real-MedNLP Workshop. 英语和日语有限注释病例/放射学报告的跨语言自然语言处理：Real-MedNLP 研讨会的启示。

IF 1.3 4区医学

Methods of Information in Medicine Pub Date : 2024-10-29 DOI: 10.1055/a-2405-2489

Shuntaro Yada, Yuta Nakamura, Shoko Wakamiya, Eiji Aramaki

{"title":"Cross-lingual Natural Language Processing on Limited Annotated Case/Radiology Reports in English and Japanese: Insights from the Real-MedNLP Workshop.","authors":"Shuntaro Yada, Yuta Nakamura, Shoko Wakamiya, Eiji Aramaki","doi":"10.1055/a-2405-2489","DOIUrl":"10.1055/a-2405-2489","url":null,"abstract":"Background: Textual datasets (corpora) are crucial for the application of natural language processing (NLP) models. However, corpus creation in the medical field is challenging, primarily because of privacy issues with raw clinical data such as health records. Thus, the existing clinical corpora are generally small and scarce. Medical NLP (MedNLP) methodologies perform well with limited data availability.Objectives: We present the outcomes of the Real-MedNLP workshop, which was conducted using limited and parallel medical corpora. Real-MedNLP exhibits three distinct characteristics: (1) limited annotated documents: the training data comprise only a small set (∼100) of case reports (CRs) and radiology reports (RRs) that have been annotated. (2) Bilingually parallel: the constructed corpora are parallel in Japanese and English. (3) Practical tasks: the workshop addresses fundamental tasks, such as named entity recognition (NER) and applied practical tasks.Methods: We propose three tasks: NER of ∼100 available documents (Task 1), NER based only on annotation guidelines for humans (Task 2), and clinical applications (Task 3) consisting of adverse drug effect (ADE) detection for CRs and identical case identification (CI) for RRs.Results: Nine teams participated in this study. The best systems achieved 0.65 and 0.89 F1-scores for CRs and RRs in Task 1, whereas the top scores in Task 2 decreased by 50 to 70%. In Task 3, ADE reports were detected by up to 0.64 F1-score, and CI scored up to 0.96 binary accuracy.Conclusion: Most systems adopt medical-domain-specific pretrained language models using data augmentation methods. Despite the challenge of limited corpus size in Tasks 1 and 2, recent approaches are promising because the partial match scores reached ∼0.8-0.9 F1-scores. Task 3 applications revealed that the different availabilities of external language resources affected the performance per language.","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":" ","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142114054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Deep Learning for Predicting Progression of Patellofemoral Osteoarthritis Based on Lateral Knee Radiographs, Demographic Data, and Symptomatic Assessments. 基于膝关节外侧X光片、人口统计学数据和症状评估的深度学习预测髌骨骨关节炎的进展情况

IF 1.3 4区医学

Methods of Information in Medicine Pub Date : 2024-05-01 Epub Date: 2024-04-11 DOI: 10.1055/a-2305-2115

Neslihan Bayramoglu, Martin Englund, Ida K Haugen, Muneaki Ishijima, Simo Saarakkala

{"title":"Deep Learning for Predicting Progression of Patellofemoral Osteoarthritis Based on Lateral Knee Radiographs, Demographic Data, and Symptomatic Assessments.","authors":"Neslihan Bayramoglu, Martin Englund, Ida K Haugen, Muneaki Ishijima, Simo Saarakkala","doi":"10.1055/a-2305-2115","DOIUrl":"10.1055/a-2305-2115","url":null,"abstract":"Objective: In this study, we propose a novel framework that utilizes deep learning and attention mechanisms to predict the radiographic progression of patellofemoral osteoarthritis (PFOA) over a period of 7 years.Material and methods: This study included subjects (1,832 subjects, 3,276 knees) from the baseline of the Multicenter Osteoarthritis Study (MOST). Patellofemoral joint regions of interest were identified using an automated landmark detection tool (BoneFinder) on lateral knee X-rays. An end-to-end deep learning method was developed for predicting PFOA progression based on imaging data in a five-fold cross-validation setting. To evaluate the performance of the models, a set of baselines based on known risk factors were developed and analyzed using gradient boosting machine (GBM). Risk factors included age, sex, body mass index, and Western Ontario and McMaster Universities Arthritis Index score, and the radiographic osteoarthritis stage of the tibiofemoral joint (Kellgren and Lawrence [KL] score). Finally, to increase predictive power, we trained an ensemble model using both imaging and clinical data.Results: Among the individual models, the performance of our deep convolutional neural network attention model achieved the best performance with an area under the receiver operating characteristic curve (AUC) of 0.856 and average precision (AP) of 0.431, slightly outperforming the deep learning approach without attention (AUC = 0.832, AP = 0.4) and the best performing reference GBM model (AUC = 0.767, AP = 0.334). The inclusion of imaging data and clinical variables in an ensemble model allowed statistically more powerful prediction of PFOA progression (AUC = 0.865, AP = 0.447), although the clinical significance of this minor performance gain remains unknown. The spatial attention module improved the predictive performance of the backbone model, and the visual interpretation of attention maps focused on the joint space and the regions where osteophytes typically occur.Conclusion: This study demonstrated the potential of machine learning models to predict the progression of PFOA using imaging and clinical variables. These models could be used to identify patients who are at high risk of progression and prioritize them for new treatments. However, even though the accuracy of the models were excellent in this study using the MOST dataset, they should be still validated using external patient cohorts in the future.","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":" ","pages":"1-10"},"PeriodicalIF":1.3,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11495941/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140854286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Development and Validation of a Natural Language Processing Algorithm to Pseudonymize Documents in the Context of a Clinical Data Warehouse. 开发和验证自然语言处理算法，在临床数据仓库中对文档进行匿名化处理。

IF 1.3 4区医学

Methods of Information in Medicine Pub Date : 2024-05-01 Epub Date: 2024-03-05 DOI: 10.1055/s-0044-1778693

Xavier Tannier, Perceval Wajsbürt, Alice Calliger, Basile Dura, Alexandre Mouchet, Martin Hilka, Romain Bey

{"title":"Development and Validation of a Natural Language Processing Algorithm to Pseudonymize Documents in the Context of a Clinical Data Warehouse.","authors":"Xavier Tannier, Perceval Wajsbürt, Alice Calliger, Basile Dura, Alexandre Mouchet, Martin Hilka, Romain Bey","doi":"10.1055/s-0044-1778693","DOIUrl":"10.1055/s-0044-1778693","url":null,"abstract":"Objective: The objective of this study is to address the critical issue of deidentification of clinical reports to allow access to data for research purposes, while ensuring patient privacy. The study highlights the difficulties faced in sharing tools and resources in this domain and presents the experience of the Greater Paris University Hospitals (AP-HP for Assistance Publique-Hôpitaux de Paris) in implementing a systematic pseudonymization of text documents from its Clinical Data Warehouse.Methods: We annotated a corpus of clinical documents according to 12 types of identifying entities and built a hybrid system, merging the results of a deep learning model as well as manual rules.Results and discussion: Our results show an overall performance of 0.99 of F1-score. We discuss implementation choices and present experiments to better understand the effort involved in such a task, including dataset size, document types, language models, or rule addition. We share guidelines and code under a 3-Clause BSD license.","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":" ","pages":"21-34"},"PeriodicalIF":1.3,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11495938/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140040727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Does Differentially Private Synthetic Data Lead to Synthetic Discoveries? 差异化私有合成数据会带来合成发现吗？

IF 1.3 4区医学

Methods of Information in Medicine Pub Date : 2024-05-01 Epub Date: 2024-08-13 DOI: 10.1055/a-2385-1355

Ileana Montoya Perez, Parisa Movahedi, Valtteri Nieminen, Antti Airola, Tapio Pahikkala

{"title":"Does Differentially Private Synthetic Data Lead to Synthetic Discoveries?","authors":"Ileana Montoya Perez, Parisa Movahedi, Valtteri Nieminen, Antti Airola, Tapio Pahikkala","doi":"10.1055/a-2385-1355","DOIUrl":"10.1055/a-2385-1355","url":null,"abstract":"Background: Synthetic data have been proposed as a solution for sharing anonymized versions of sensitive biomedical datasets. Ideally, synthetic data should preserve the structure and statistical properties of the original data, while protecting the privacy of the individual subjects. Differential Privacy (DP) is currently considered the gold standard approach for balancing this trade-off.Objectives: The aim of this study is to investigate how trustworthy are group differences discovered by independent sample tests from DP-synthetic data. The evaluation is carried out in terms of the tests' Type I and Type II errors. With the former, we can quantify the tests' validity, i.e., whether the probability of false discoveries is indeed below the significance level, and the latter indicates the tests' power in making real discoveries.Methods: We evaluate the Mann-Whitney U test, Student's t-test, chi-squared test, and median test on DP-synthetic data. The private synthetic datasets are generated from real-world data, including a prostate cancer dataset (n = 500) and a cardiovascular dataset (n = 70,000), as well as on bivariate and multivariate simulated data. Five different DP-synthetic data generation methods are evaluated, including two basic DP histogram release methods and MWEM, Private-PGM, and DP GAN algorithms.Conclusion: A large portion of the evaluation results expressed dramatically inflated Type I errors, especially at levels of ϵ ≤ 1. This result calls for caution when releasing and analyzing DP-synthetic data: low p-values may be obtained in statistical tests simply as a byproduct of the noise added to protect privacy. A DP Smoothed Histogram-based synthetic data generation method was shown to produce valid Type I error for all privacy levels tested but required a large original dataset size and a modest privacy budget (ϵ ≥ 5) in order to have reasonable Type II error levels.","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":" ","pages":"35-51"},"PeriodicalIF":1.3,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11495942/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141977081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Artificial Intelligence-Based Prediction of Contrast Medium Doses for Computed Tomography Angiography Using Optimized Clinical Parameter Sets. 基于人工智能的计算机断层扫描血管造影术造影剂剂量预测，使用优化的临床参数集。

IF 1.3 4区医学

Methods of Information in Medicine Pub Date : 2024-05-01 Epub Date: 2024-01-23 DOI: 10.1055/s-0044-1778694

Marja Fleitmann, Hristina Uzunova, René Pallenberg, Andreas M Stroth, Jan Gerlach, Alexander Fürschke, Jörg Barkhausen, Arpad Bischof, Heinz Handels

{"title":"Artificial Intelligence-Based Prediction of Contrast Medium Doses for Computed Tomography Angiography Using Optimized Clinical Parameter Sets.","authors":"Marja Fleitmann, Hristina Uzunova, René Pallenberg, Andreas M Stroth, Jan Gerlach, Alexander Fürschke, Jörg Barkhausen, Arpad Bischof, Heinz Handels","doi":"10.1055/s-0044-1778694","DOIUrl":"10.1055/s-0044-1778694","url":null,"abstract":"Objectives: In this paper, an artificial intelligence-based algorithm for predicting the optimal contrast medium dose for computed tomography (CT) angiography of the aorta is presented and evaluated in a clinical study. The prediction of the contrast dose reduction is modelled as a classification problem using the image contrast as the main feature.Methods: This classification is performed by random decision forests (RDF) and k-nearest-neighbor methods (KNN). For the selection of optimal parameter subsets all possible combinations of the 22 clinical parameters (age, blood pressure, etc.) are considered using the classification accuracy and precision of the KNN classifier and RDF as quality criteria. Subsequently, the results of the evaluation were optimized by means of feature transformation using regression neural networks (RNN). These were used for a direct classification based on regressed Hounsfield units as well as preprocessing for a subsequent KNN classification.Results: For feature selection, an RDF model achieved the highest accuracy of 84.42% and a KNN model achieved the best precision of 86.21%. The most important parameters include age, height, and hemoglobin. The feature transformation using an RNN considerably exceeded these values with an accuracy of 90.00% and a precision of 97.62% using all 22 parameters as input. However, also the feasibility of the parameter sets in routine clinical practice has to be considered, because some of the 22 parameters are not measured in routine clinical practice and additional measurement time of 15 to 20 minutes per patient is needed. Using the standard feature set available in clinical routine the best accuracy of 86.67% and precision of 93.18% was achieved by the RNN.Conclusion: We developed a reliable hybrid system that helps radiologists determine the optimal contrast dose for CT angiography based on patient-specific parameters.","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":" ","pages":"11-20"},"PeriodicalIF":1.3,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11495943/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139543328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Europe's Largest Research Infrastructure for Curated Medical Data Models with Semantic Annotations. 欧洲最大的带语义注释的医学数据模型研究基础设施。

IF 1.3 4区医学

Methods of Information in Medicine Pub Date : 2024-05-01 Epub Date: 2024-05-13 DOI: 10.1055/s-0044-1786839

Sarah Riepenhausen, Max Blumenstock, Christian Niklas, Stefan Hegselmann, Philipp Neuhaus, Alexandra Meidt, Cornelia Püttmann, Michael Storck, Matthias Ganzinger, Julian Varghese, Martin Dugas

{"title":"Europe's Largest Research Infrastructure for Curated Medical Data Models with Semantic Annotations.","authors":"Sarah Riepenhausen, Max Blumenstock, Christian Niklas, Stefan Hegselmann, Philipp Neuhaus, Alexandra Meidt, Cornelia Püttmann, Michael Storck, Matthias Ganzinger, Julian Varghese, Martin Dugas","doi":"10.1055/s-0044-1786839","DOIUrl":"10.1055/s-0044-1786839","url":null,"abstract":"Background: Structural metadata from the majority of clinical studies and routine health care systems is currently not yet available to the scientific community.Objective: To provide an overview of available contents in the Portal of Medical Data Models (MDM Portal).Methods: The MDM Portal is a registered European information infrastructure for research and health care, and its contents are curated and semantically annotated by medical experts. It enables users to search, view, discuss, and download existing medical data models.Results: The most frequent keyword is \"clinical trial\" (n = 18,777), and the most frequent disease-specific keyword is \"breast neoplasms\" (n = 1,943). Most data items are available in English (n = 545,749) and German (n = 109,267). Manually curated semantic annotations are available for 805,308 elements (554,352 items, 58,101 item groups, and 192,855 code list items), which were derived from 25,257 data models. In total, 1,609,225 Unified Medical Language System (UMLS) codes have been assigned, with 66,373 unique UMLS codes.Conclusion: To our knowledge, the MDM Portal constitutes Europe's largest collection of medical data models with semantically annotated elements. As such, it can be used to increase compatibility of medical datasets and can be utilized as a large expert-annotated medical text corpus for natural language processing.","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":" ","pages":"52-61"},"PeriodicalIF":1.3,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11495939/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140917387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Performance Characteristics of a Rule-Based Electronic Health Record Algorithm to Identify Patients with Gross and Microscopic Hematuria. 基于规则的电子健康记录算法识别肉眼和显微镜下血尿患者的性能特征。

IF 1.7 4区医学

Methods of Information in Medicine Pub Date : 2023-12-01 Epub Date: 2023-09-04 DOI: 10.1055/a-2165-5552

Jasmine Kashkoush, Mudit Gupta, Matthew A Meissner, Matthew E Nielsen, H Lester Kirchner, Tullika Garg

{"title":"Performance Characteristics of a Rule-Based Electronic Health Record Algorithm to Identify Patients with Gross and Microscopic Hematuria.","authors":"Jasmine Kashkoush, Mudit Gupta, Matthew A Meissner, Matthew E Nielsen, H Lester Kirchner, Tullika Garg","doi":"10.1055/a-2165-5552","DOIUrl":"10.1055/a-2165-5552","url":null,"abstract":"Background: Two million patients per year are referred to urologists for hematuria, or blood in the urine. The American Urological Association recently adopted a risk-stratified hematuria evaluation guideline to limit multi-phase computed tomography to individuals at highest risk of occult malignancy.Objectives: To understand population-level hematuria evaluations, we developed an algorithm to accurately identify hematuria cases from electronic health records (EHRs).Methods: We used International Classification of Diseases (ICD)-9/ICD-10 diagnosis codes, urine color, and urine microscopy values to identify hematuria cases and to differentiate between gross and microscopic hematuria. Using an iterative process, we refined the ICD-9 algorithm on a gold standard, chart-reviewed cohort of 3,094 hematuria cases, and the ICD-10 algorithm on a 300 patient cohort. We applied the algorithm to Geisinger patients ≥35 years (n = 539,516) and determined performance by conducting chart review (n = 500).Results: After applying the hematuria algorithm, we identified 51,500 hematuria cases and 488,016 clean controls. Of the hematuria cases, 11,435 were categorized as gross, 26,658 as microscopic, 12,562 as indeterminate, and 845 were uncategorized. The positive predictive value (PPV) of identifying hematuria cases using the algorithm was 100% and the negative predictive value (NPV) was 99%. The gross hematuria algorithm had a PPV of 100% and NPV of 99%. The microscopic hematuria algorithm had lower PPV of 78% and NPV of 100%.Conclusion: We developed an algorithm utilizing diagnosis codes and urine laboratory values to accurately identify hematuria and categorize as gross or microscopic in EHRs. Applying the algorithm will help researchers to understand patterns of care for this common condition.","PeriodicalId":49822,"journal":{"name":"Methods of Information in Medicine","volume":" ","pages":"183-192"},"PeriodicalIF":1.7,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10153429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Report from the 68th GMDS Annual Meeting: Science. Close to People. 第 68 届 GMDS 年会报告：科学。贴近人类。

IF 1.7 4区医学

Methods of Information in Medicine Pub Date : 2023-12-01 Epub Date: 2024-02-20 DOI: 10.1055/s-0043-1777733

Jonas Bienzeisler, Ariadna Perez-Garriga, Lea C Brandl, Ann-Kristin Kock-Schoppenhauer, Yasmin Hollenbenders, Maximilian Kurscheidt, Christina Schüttler

引用次数: 0