土耳其放射学报告中命名实体识别的深度学习。

IF 1.4 4区 医学 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
Abubakar Ahmad Abdullahi, Murat Can Ganiz, Ural Koç, Muhammet Batuhan Gökhan, Ceren Aydın, Ali Bahadır Özdemir
{"title":"土耳其放射学报告中命名实体识别的深度学习。","authors":"Abubakar Ahmad Abdullahi, Murat Can Ganiz, Ural Koç, Muhammet Batuhan Gökhan, Ceren Aydın, Ali Bahadır Özdemir","doi":"10.4274/dir.2025.243100","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>The primary objective of this research is to enhance the accuracy and efficiency of information extraction from radiology reports. In addressing this objective, the study aims to develop and evaluate a deep learning framework for named entity recognition (NER).</p><p><strong>Methods: </strong>We used a synthetic dataset of 1,056 Turkish radiology reports created and labeled by the radiologists in our research team. Due to privacy concerns, actual patient data could not be used; however, the synthetic reports closely mimic genuine reports in structure and content. We employed the four-stage DYGIE++ model for the experiments. First, we performed token encoding using four bidirectional encoder representations from transformers (BERT) models: BERTurk, BioBERTurk, PubMedBERT, and XLM-RoBERTa. Second, we introduced adaptive span enumeration, considering the word count of a sentence in Turkish. Third, we adopted span graph propagation to generate a multidirectional graph crucial for coreference resolution. Finally, we used a two-layered feed-forward neural network to classify the named entity.</p><p><strong>Results: </strong>The experiments conducted on the labeled dataset showcase the approach's effectiveness. The study achieved an F1 score of 80.1 for the NER task, with the BioBERTurk model, which is pre-trained on Turkish Wikipedia, radiology reports, and biomedical texts, proving to be the most effective of the four BERT models used in the experiment.</p><p><strong>Conclusion: </strong>We show how different dataset labels affect the model's performance. The results demonstrate the model's ability to handle the intricacies of Turkish radiology reports, providing a detailed analysis of precision, recall, and F1 scores for each label. Additionally, this study compares its findings with related research in other languages.</p><p><strong>Clinical significance: </strong>Our approach provides clinicians with more precise and comprehensive insights to improve patient care by extracting relevant information from radiology reports. This innovation in information extraction streamlines the diagnostic process and helps expedite patient treatment decisions.</p>","PeriodicalId":11341,"journal":{"name":"Diagnostic and interventional radiology","volume":" ","pages":""},"PeriodicalIF":1.4000,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Deep learning for named entity recognition in Turkish radiology reports.\",\"authors\":\"Abubakar Ahmad Abdullahi, Murat Can Ganiz, Ural Koç, Muhammet Batuhan Gökhan, Ceren Aydın, Ali Bahadır Özdemir\",\"doi\":\"10.4274/dir.2025.243100\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Purpose: </strong>The primary objective of this research is to enhance the accuracy and efficiency of information extraction from radiology reports. In addressing this objective, the study aims to develop and evaluate a deep learning framework for named entity recognition (NER).</p><p><strong>Methods: </strong>We used a synthetic dataset of 1,056 Turkish radiology reports created and labeled by the radiologists in our research team. Due to privacy concerns, actual patient data could not be used; however, the synthetic reports closely mimic genuine reports in structure and content. We employed the four-stage DYGIE++ model for the experiments. First, we performed token encoding using four bidirectional encoder representations from transformers (BERT) models: BERTurk, BioBERTurk, PubMedBERT, and XLM-RoBERTa. Second, we introduced adaptive span enumeration, considering the word count of a sentence in Turkish. Third, we adopted span graph propagation to generate a multidirectional graph crucial for coreference resolution. Finally, we used a two-layered feed-forward neural network to classify the named entity.</p><p><strong>Results: </strong>The experiments conducted on the labeled dataset showcase the approach's effectiveness. The study achieved an F1 score of 80.1 for the NER task, with the BioBERTurk model, which is pre-trained on Turkish Wikipedia, radiology reports, and biomedical texts, proving to be the most effective of the four BERT models used in the experiment.</p><p><strong>Conclusion: </strong>We show how different dataset labels affect the model's performance. The results demonstrate the model's ability to handle the intricacies of Turkish radiology reports, providing a detailed analysis of precision, recall, and F1 scores for each label. Additionally, this study compares its findings with related research in other languages.</p><p><strong>Clinical significance: </strong>Our approach provides clinicians with more precise and comprehensive insights to improve patient care by extracting relevant information from radiology reports. This innovation in information extraction streamlines the diagnostic process and helps expedite patient treatment decisions.</p>\",\"PeriodicalId\":11341,\"journal\":{\"name\":\"Diagnostic and interventional radiology\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":1.4000,\"publicationDate\":\"2025-02-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Diagnostic and interventional radiology\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.4274/dir.2025.243100\",\"RegionNum\":4,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Diagnostic and interventional radiology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.4274/dir.2025.243100","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
引用次数: 0

摘要

目的:本研究的主要目的是提高从放射学报告中提取信息的准确性和效率。为了实现这一目标,本研究旨在开发和评估用于命名实体识别(NER)的深度学习框架。方法:我们使用了由我们研究团队的放射科医生创建和标记的1,056份土耳其放射学报告的合成数据集。出于隐私考虑,不能使用患者的实际数据;然而,合成报告在结构和内容上与真实报告非常相似。我们采用四阶段DYGIE++模型进行实验。首先,我们使用来自转换器(BERT)模型的四种双向编码器表示执行令牌编码:BERTurk、BioBERTurk、PubMedBERT和XLM-RoBERTa。其次,考虑到土耳其语句子的字数,我们引入了自适应跨度枚举。第三,我们采用跨度图传播来生成对共参考分辨率至关重要的多向图。最后,采用两层前馈神经网络对命名实体进行分类。结果:在标记数据集上进行的实验显示了该方法的有效性。该研究在NER任务中获得了80.1的F1分,其中BioBERTurk模型在土耳其语维基百科、放射学报告和生物医学文本上进行了预训练,被证明是实验中使用的四个BERT模型中最有效的。结论:我们展示了不同的数据集标签如何影响模型的性能。结果证明了该模型处理土耳其放射学报告的复杂性的能力,为每个标签提供了精度、召回率和F1分数的详细分析。此外,本研究还将其研究结果与其他语言的相关研究进行了比较。临床意义:我们的方法通过从放射学报告中提取相关信息,为临床医生提供更精确和全面的见解,以改善患者的护理。这种信息提取方面的创新简化了诊断过程,并有助于加快患者的治疗决策。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Deep learning for named entity recognition in Turkish radiology reports.

Purpose: The primary objective of this research is to enhance the accuracy and efficiency of information extraction from radiology reports. In addressing this objective, the study aims to develop and evaluate a deep learning framework for named entity recognition (NER).

Methods: We used a synthetic dataset of 1,056 Turkish radiology reports created and labeled by the radiologists in our research team. Due to privacy concerns, actual patient data could not be used; however, the synthetic reports closely mimic genuine reports in structure and content. We employed the four-stage DYGIE++ model for the experiments. First, we performed token encoding using four bidirectional encoder representations from transformers (BERT) models: BERTurk, BioBERTurk, PubMedBERT, and XLM-RoBERTa. Second, we introduced adaptive span enumeration, considering the word count of a sentence in Turkish. Third, we adopted span graph propagation to generate a multidirectional graph crucial for coreference resolution. Finally, we used a two-layered feed-forward neural network to classify the named entity.

Results: The experiments conducted on the labeled dataset showcase the approach's effectiveness. The study achieved an F1 score of 80.1 for the NER task, with the BioBERTurk model, which is pre-trained on Turkish Wikipedia, radiology reports, and biomedical texts, proving to be the most effective of the four BERT models used in the experiment.

Conclusion: We show how different dataset labels affect the model's performance. The results demonstrate the model's ability to handle the intricacies of Turkish radiology reports, providing a detailed analysis of precision, recall, and F1 scores for each label. Additionally, this study compares its findings with related research in other languages.

Clinical significance: Our approach provides clinicians with more precise and comprehensive insights to improve patient care by extracting relevant information from radiology reports. This innovation in information extraction streamlines the diagnostic process and helps expedite patient treatment decisions.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Diagnostic and interventional radiology
Diagnostic and interventional radiology Medicine-Radiology, Nuclear Medicine and Imaging
自引率
4.80%
发文量
0
期刊介绍: Diagnostic and Interventional Radiology (Diagn Interv Radiol) is the open access, online-only official publication of Turkish Society of Radiology. It is published bimonthly and the journal’s publication language is English. The journal is a medium for original articles, reviews, pictorial essays, technical notes related to all fields of diagnostic and interventional radiology.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信