Use of natural language processing techniques to predict patient selection for total hip and knee arthroplasty from radiology reports.

Impact factor 4.9; Zone 1 (Medicine); JCR Q1 (Orthopedics)
Luke Farrow, Mingjun Zhong, Lesley Anderson
Bone & Joint Journal, vol. 106-B, no. 7, pp. 688-695. Published 2024-07-01. Journal article.
DOI: 10.1302/0301-620X.106B7.BJJ-2024-0136 (https://doi.org/10.1302/0301-620X.106B7.BJJ-2024-0136)

Abstract

Aims: To examine whether natural language processing (NLP) using a clinically based large language model (LLM) could be used to predict patient selection for total hip or total knee arthroplasty (THA/TKA) from routinely available free-text radiology reports.

Methods: Data pre-processing and analyses were conducted according to the Artificial intelligence to Revolutionize the patient Care pathway in Hip and knEe aRthroplastY (ARCHERY) project protocol. This included use of de-identified Scottish regional clinical data from patients referred for consideration of THA/TKA, held in a secure data environment designed for artificial intelligence (AI) inference. Only preoperative radiology reports were included. NLP algorithms were based on the freely available GatorTron model, an LLM trained on over 82 billion words of de-identified clinical text. Two inference tasks were performed: assessment after model fine-tuning (50 epochs and three cycles of k-fold cross-validation), and external validation.
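The abstract does not say how the cross-validation folds were built, so the following is only a minimal sketch of k-fold splitting, assuming k = 3 and stratification by outcome label. The function name `stratified_k_fold` and the toy labels are hypothetical, and the fine-tuning step itself is left out.

```python
import random
from collections import defaultdict

def stratified_k_fold(labels, k=3, seed=0):
    """Yield (train_idx, test_idx) pairs; every index is tested exactly once."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so folds keep similar class ratios.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx

# Hypothetical toy labels: 1 = selected for arthroplasty, 0 = not selected.
labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
splits = list(stratified_k_fold(labels, k=3))
for fold_no, (train_idx, test_idx) in enumerate(splits, start=1):
    # A real pipeline would fine-tune the language model on train_idx
    # and score it on test_idx at this point; that step is omitted.
    print(f"fold {fold_no}: train={len(train_idx)} test={len(test_idx)}")
```

Each report is held out exactly once, so per-fold metrics can be averaged afterwards. External validation differs in that the held-out cohort is never used for training in any fold.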

Results: For THA, 5,558 patient radiology reports were included, of which 4,137 were used for model training and testing, and 1,421 for external validation. Following training, model performance demonstrated average (mean across three folds) accuracy, F1 score, and area under the receiver operating characteristic curve (AUROC) values of 0.850 (95% confidence interval (CI) 0.833 to 0.867), 0.813 (95% CI 0.785 to 0.841), and 0.847 (95% CI 0.822 to 0.872), respectively. For TKA, 7,457 patient radiology reports were included, with 3,478 used for model training and testing, and 3,152 for external validation. Performance metrics included accuracy, F1 score, and AUROC values of 0.757 (95% CI 0.702 to 0.811), 0.543 (95% CI 0.479 to 0.607), and 0.717 (95% CI 0.657 to 0.778), respectively. There was a notable deterioration in performance on external validation in both cohorts.
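For readers less familiar with the three reported metrics, the sketch below computes accuracy, F1 score, and AUROC for a binary outcome using their standard definitions. It is illustrative only: the labels and scores are made-up toy values, the 0.5 decision threshold is an assumption, and the abstract does not describe how the authors computed their metrics or confidence intervals.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUROC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = selected for surgery, 0 = not selected.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]          # model probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"accuracy={accuracy(y_true, y_pred):.3f}")
print(f"F1={f1_score(y_true, y_pred):.3f}")
print(f"AUROC={auroc(y_true, scores):.3f}")
```

Accuracy and F1 depend on the chosen threshold, whereas AUROC is computed from the ranking of scores alone, which is why the three values can diverge for the same model.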

Conclusion: Routinely available preoperative radiology reports show promise for helping to screen suitable candidates for THA, but not for TKA. The external validation results demonstrate the importance of further model testing and training when confronted with new clinical cohorts.

Source journal: Bone & Joint Journal (Orthopedics; Surgery)
CiteScore: 9.40
Self-citation rate: 10.90%
Articles published: 318
Journal description: We welcome original articles from any part of the world. The papers are assessed by members of the Editorial Board and our international panel of expert reviewers, then either accepted for publication or rejected by the Editor. We receive over 2000 submissions each year and accept about 250 for publication, many after revisions recommended by the reviewers, editors or statistical advisers. A decision usually takes between six and eight weeks. Each paper is assessed by two reviewers with a special interest in the subject covered by the paper, and also by members of the editorial team. Controversial papers will be discussed at a full meeting of the Editorial Board. Publication is between four and six months after acceptance.