Causal Forests for Discovering Diagnostic Language in Electronic Health Records

IF 1.5 | Zone 4 (Mathematics) | JCR Q3: MATHEMATICS, INTERDISCIPLINARY APPLICATIONS
Alessandro Albano, Chiara Di Maria, Mariangela Sciandra, Antonella Plaia
Journal: Applied Stochastic Models in Business and Industry, 41(5)
Published: 2025-08-25
Publication type: Journal Article
DOI: 10.1002/asmb.70038
URL: https://onlinelibrary.wiley.com/doi/10.1002/asmb.70038
Citations: 0

Abstract

Textual analysis has gained significant interest in medical research, particularly for automated patient diagnosis based on clinical narratives. While traditional approaches often focus on associational methods, this paper explores the application of causal forests to analyze textual data from electronic health records (EHRs), aiming to identify causal relationships between specific words and the likelihood of receiving certain medical diagnoses. Utilizing the MIMIC-III dataset, we assess how linguistic factors influence diagnosis probabilities for three conditions: diabetes, hypothyroidism, and adrenal gland disorders. Our findings reveal significant causal links between certain clinical terms and diagnosis probabilities, emphasizing the potential of causal inference techniques to improve the analysis of language in clinical narratives. Additionally, we uncover heterogeneity in treatment effects, demonstrating that specific words can identify high-risk patient subgroups. This study highlights the importance of integrating causal inference in natural language processing within healthcare settings.
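The idea of heterogeneous treatment effects mentioned above can be illustrated with a deliberately simplified sketch. This is not the authors' method and not a causal forest: a causal forest learns the subgroups from the data and adjusts for confounders, whereas the code below only compares diagnosis rates between notes that contain a word and notes that do not, within subgroups fixed in advance. The records, the subgroup labels, and the function name `subgroup_effects` are all hypothetical and invented for illustration; none of the values come from MIMIC-III.

```python
from collections import defaultdict

# Hypothetical toy records: (subgroup, word_present, diagnosed).
# Invented for illustration only; these are not MIMIC-III data.
records = [
    ("age<60", 1, 1), ("age<60", 1, 0), ("age<60", 1, 0), ("age<60", 1, 0),
    ("age<60", 0, 0), ("age<60", 0, 0), ("age<60", 0, 0), ("age<60", 0, 0),
    ("age>=60", 1, 1), ("age>=60", 1, 1), ("age>=60", 1, 1), ("age>=60", 1, 0),
    ("age>=60", 0, 1), ("age>=60", 0, 0), ("age>=60", 0, 0), ("age>=60", 0, 0),
]


def subgroup_effects(rows):
    """Return, per subgroup, the diagnosis rate with the word minus the rate without it."""
    # counts[group][word_present] = [number diagnosed, number of records]
    counts = defaultdict(lambda: {0: [0, 0], 1: [0, 0]})
    for group, word_present, diagnosed in rows:
        cell = counts[group][word_present]
        cell[0] += diagnosed
        cell[1] += 1

    effects = {}
    for group, cells in counts.items():
        if cells[0][1] == 0 or cells[1][1] == 0:
            continue  # a subgroup needs both arms to allow a comparison
        rate_with = cells[1][0] / cells[1][1]
        rate_without = cells[0][0] / cells[0][1]
        effects[group] = rate_with - rate_without
    return effects


effects = subgroup_effects(records)
for group, effect in sorted(effects.items()):
    print(f"{group}: {effect:+.2f}")
```

In this toy data the word is associated with a larger increase in diagnosis rate in one subgroup than in the other, which is the kind of pattern a heterogeneity analysis looks for. Without adjustment for confounding, these differences are associational rather than causal.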

Source journal
CiteScore: 2.70
Self-citation rate: 0.00%
Articles published: 67
Review time: >12 weeks
About the journal: ASMBI - Applied Stochastic Models in Business and Industry (formerly Applied Stochastic Models and Data Analysis) was first published in 1985, publishing contributions at the interface between stochastic modelling, data analysis and their applications in business, finance, insurance, management and production. In 2007 ASMBI became the official journal of the International Society for Business and Industrial Statistics (www.isbis.org). The main objective is to publish papers, both technical and practical, presenting new results which solve real-life problems or have great potential in doing so. Mathematical rigour, innovative stochastic modelling and sound applications are the key ingredients of papers to be published, after a very selective review process. The journal is very open to new ideas, like Data Science and Big Data stemming from problems in business and industry or uncertainty quantification in engineering, as well as more traditional ones, like reliability, quality control, design of experiments, managerial processes, supply chains and inventories, insurance, econometrics, financial modelling (provided the papers are related to real problems). The journal is also interested in papers addressing the effects of business and industrial decisions on the environment, healthcare and social life. State-of-the-art computational methods are very welcome as well, when combined with sound applications and innovative models.