Developing a Tool for Identifying Clinical Risk From Free-Text Clinical Records: Natural Language Processing Study

Impact factor: 2.0
JMIR AI, Pub Date: 2025-09-22, DOI: 10.2196/64898
Natasha Biscoe, Daniel Leightley, Dominic Murphy
Citations: 0

Abstract

Developing a Tool for Identifying Clinical Risk From Free-Text Clinical Records: Natural Language Processing Study.

Background: Electronic patient records are a valuable yet underused data source; they have been explored in research using natural language processing, but not yet within a third-sector organization.

Objective: This study aimed to apply natural language processing to develop a risk identification tool capable of discerning high and low suicide risk among veterans, using electronic patient records from a United Kingdom-based veteran mental health charity.

Methods: A total of 20,342 notes were extracted for this purpose. To develop the risk tool, 70% of the records formed the training dataset, while the remaining 30% were allocated for testing and evaluation. The classification framework was devised and trained to categorize risk as a binary outcome: 1 indicating high risk and 0 indicating low risk.
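
As a rough illustration of the 70%/30% partition described in the Methods, the sketch below shuffles a set of placeholder notes and splits them. The function name `split_notes`, the fixed seed, the rounding rule, and the synthetic `(note_id, label)` pairs are all assumptions made for this example; the abstract does not state how the authors performed the split or whether it was stratified.

```python
import random


def split_notes(notes, train_fraction=0.7, seed=0):
    """Shuffle notes and split them into training and test subsets.

    Hypothetical helper: the study's actual splitting procedure is not
    described in the abstract.
    """
    shuffled = list(notes)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder (note_id, label) pairs standing in for the 20,342 notes.
# Label 1 = high risk, 0 = low risk, matching the binary outcome above.
notes = [(i, i % 2) for i in range(20342)]
train, test = split_notes(notes)
print(len(train), len(test))  # 14239 6103
```

With this rounding rule, 20,342 notes yield 14,239 training notes and 6,103 test notes; a different rounding or a stratified split would shift these counts slightly.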

Results: The efficacy of each classifier model was assessed by comparing its results with those from clinical risk assessments. A logistic regression classifier was found to perform best and was used to develop the final model. This comparison allowed for the calculation of the positive predictive value (mean 0.74, SD 0.059; 95% CI 0.70-0.77), negative predictive value (mean 0.73, SD 0.024; 95% CI 0.72-0.75), sensitivity (mean 0.75, SD 0.017; 95% CI 0.74-0.76), F1-score (mean 0.74, SD 0.033; 95% CI 0.72-0.76), and accuracy, which was measured using the Youden index (mean 0.73, SD 0.035; 95% CI 0.71-0.76).
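
The metrics named in the Results all derive from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions, including Youden's index as sensitivity + specificity - 1; the function name `risk_metrics` and the toy labels are assumptions for illustration, and this is not the authors' evaluation code. It also assumes every denominator is non-zero.

```python
def risk_metrics(y_true, y_pred):
    """Compute standard binary classification metrics (1 = high risk)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)  # positive predictive value (precision)
    npv = tn / (tn + fn)  # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    youden = sensitivity + specificity - 1  # Youden's J statistic
    return {
        "ppv": ppv,
        "npv": npv,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "youden": youden,
    }


# Toy example: clinician-assigned labels versus model predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = risk_metrics(y_true, y_pred)
print(metrics)
```

In this toy case each of PPV, NPV, sensitivity, specificity, and F1 equals 0.75, while Youden's index is 0.5. The means, SDs, and confidence intervals reported above would come from repeating such a calculation over multiple evaluations, a procedure the abstract does not detail.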

Conclusions: The risk identification tool successfully determined the correct risk category of veterans from a large sample of clinical notes. Future studies should investigate whether this tool can detect more nuanced differences in risk and be generalizable across data sources.
