Classification of Interventional Radiology Reports into Technique Categories with a Fine-Tuned Large Language Model.

Koichiro Yasaka, Takuto Nomura, Jun Kamohara, Hiroshi Hirakawa, Takatoshi Kubo, Shigeru Kiryu, Osamu Abe
Journal of imaging informatics in medicine. Published 2024-12-13. Publication type: Journal Article. DOI: 10.1007/s10278-024-01370-w (https://doi.org/10.1007/s10278-024-01370-w)
Citations: 0

Abstract

The aim of this study was to develop a fine-tuned large language model that classifies interventional radiology reports into technique categories and to compare its performance with that of human readers. This retrospective study included 3198 patients (1758 males and 1440 females; age, 62.8 ± 16.8 years) who underwent interventional radiology from January 2018 to July 2024. The training, validation, and test datasets comprised 2292, 250, and 656 patients, respectively. The input data were the texts of the clinical indication, imaging diagnosis, and image-finding sections of the interventional radiology reports. Manually classified technique categories (15 categories in total) served as reference data. The Bidirectional Encoder Representations from Transformers (BERT) model was fine-tuned using the training and validation datasets. Because the learning process is stochastic, fine-tuning was repeated 15 times. The best-performing model, defined as the one with the highest accuracy among the 15 trials, was selected for further evaluation on the independent test dataset. The reports were also classified by one radiologist (reader 1) and two radiology residents (readers 2 and 3). The accuracy and macrosensitivity (the average of each category's sensitivity) of the best-performing model on the validation dataset were 0.996 and 0.994, respectively. On the test dataset, the accuracy/macrosensitivity values were 0.988/0.980, 0.986/0.977, 0.989/0.979, and 0.988/0.980 for the best model, reader 1, reader 2, and reader 3, respectively. The model required 0.178 s per patient for classification, which was 17.5-19.9 times faster than the readers. In conclusion, the fine-tuned large language model classified interventional radiology reports into technique categories with high accuracy, similar to that of the readers, in a remarkably shorter time.
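The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes macrosensitivity is computed as the unweighted mean of per-category sensitivity (recall), as the abstract's parenthetical definition suggests. The function names, category labels, and toy predictions below are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of reports whose predicted category matches the reference."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_sensitivity(y_true, y_pred):
    """Unweighted mean of per-category sensitivity (recall)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    total = defaultdict(int)    # reference reports per category
    correct = defaultdict(int)  # correctly classified reports per category
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    per_category = [correct[c] / total[c] for c in total]
    return sum(per_category) / len(per_category)


# Hypothetical toy data: three made-up categories, eight reports.
y_true = ["embolization"] * 2 + ["drainage"] * 4 + ["biopsy"] * 2
y_pred = ["embolization", "drainage"] + ["drainage"] * 4 + ["biopsy"] * 2

print(f"accuracy          = {accuracy(y_true, y_pred):.3f}")
print(f"macro-sensitivity = {macro_sensitivity(y_true, y_pred):.3f}")
```

In this toy case one error in a small category lowers macrosensitivity more than accuracy, which is why the two figures can differ when categories are unevenly sized.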
