Fine-tuned large language model for classifying CT-guided interventional radiology reports.

IF 1.1 · Medicine, Zone 4 · JCR Q3 · RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING

Acta radiologica Pub Date : 2025-06-23 DOI:10.1177/02841851251349495

Koichiro Yasaka, Naoaki Nishimura, Takahiro Fukushima, Takatoshi Kubo, Shigeru Kiryu, Osamu Abe


Citations: 0

Abstract


Fine-tuned large language model for classifying CT-guided interventional radiology reports.

Background: Manual data curation was necessary to extract radiology reports due to the ambiguities of natural language.

Purpose: To develop a fine-tuned large language model that classifies computed tomography (CT)-guided interventional radiology reports into technique categories and to compare its performance with that of human readers.

Material and Methods: This retrospective study included patients who underwent CT-guided interventional radiology between August 2008 and November 2024. Patients were chronologically assigned to the training (n = 1142; 646 men; mean age = 64.1 ± 15.7 years), validation (n = 131; 83 men; mean age = 66.1 ± 16.1 years), and test (n = 332; 196 men; mean age = 66.1 ± 14.8 years) datasets. To establish a reference standard, reports were manually classified into categories 1 (drainage), 2 (biopsy of a lesion within tissue of fat or soft-tissue density), 3 (lung biopsy), and 4 (bone biopsy). The Bidirectional Encoder Representations from Transformers (BERT) model was fine-tuned with the training dataset, and the model with the best performance in the validation dataset was selected. The performance and the time required for classification in the test dataset were compared between the best-performing model and two readers.

Results: Categories 1/2/3/4 included 309/367/270/196, 30/42/40/19, and 75/124/78/55 patients for the training, validation, and test datasets, respectively. The model demonstrated an accuracy of 0.979 in the test dataset, which was significantly better than that of the readers (0.922-0.940) (P ≤ 0.012). The model classified reports in a 49.8-53.5-fold shorter time than the readers.

Conclusion: The fine-tuned large language model classified CT-guided interventional radiology reports into four categories, demonstrating high accuracy within a remarkably short time.
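The per-category counts reported in the Results can be cross-checked against the dataset sizes given in the Methods. The following is a minimal sketch using only the figures quoted in the abstract; the names `REPORTED`, `check_totals`, and `class_shares` are illustrative assumptions and are not taken from the authors' code.

```python
# Sanity check of the figures quoted in the abstract (illustrative only).
# Category order follows the abstract: 1 drainage, 2 soft-tissue/fat-density
# lesion biopsy, 3 lung biopsy, 4 bone biopsy.
CATEGORIES = ("drainage", "soft-tissue biopsy", "lung biopsy", "bone biopsy")

# Dataset sizes (Methods) and per-category counts (Results), as reported.
REPORTED = {
    "training": {"n": 1142, "counts": (309, 367, 270, 196)},
    "validation": {"n": 131, "counts": (30, 42, 40, 19)},
    "test": {"n": 332, "counts": (75, 124, 78, 55)},
}


def check_totals(reported):
    """Return, per dataset, whether the category counts sum to the stated n."""
    return {name: sum(d["counts"]) == d["n"] for name, d in reported.items()}


def class_shares(counts):
    """Return each category's fraction of the dataset."""
    total = sum(counts)
    return [c / total for c in counts]


if __name__ == "__main__":
    print(check_totals(REPORTED))
    for name, d in REPORTED.items():
        shares = class_shares(d["counts"])
        line = ", ".join(f"{cat}: {s:.1%}" for cat, s in zip(CATEGORIES, shares))
        print(f"{name}: {line}")
```

The category counts sum to the stated dataset size in all three splits, and category 2 is the largest class in each. The class shares give a rough sense of how far the reported accuracies sit above a majority-class guess.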


Source journal

Acta radiologica (Medicine: Nuclear Medicine)

CiteScore: 2.70
Self-citation rate: 0.00%
Publication volume: 170
Review time: 3-8 weeks

About the journal: Acta Radiologica publishes articles on all aspects of radiology, from clinical radiology to experimental work. It is known for articles based on experimental work and contrast media research, giving priority to original scientific papers. The distinguished international editorial board also invites review articles, short communications, and technical and instrumental notes.