Automatic Detection of Distant Metastasis Mentions in Radiology Reports in Spanish.

Impact factor: 3.3 · JCR quartile: Q2 (Oncology)
Ricardo Ahumada, Jocelyn Dunstan, Matías Rojas, Sergio Peñafiel, Inti Paredes, Pablo Báez
Journal: JCO Clinical Cancer Informatics
Published: 2024-01-01 (Journal Article)
DOI: https://doi.org/10.1200/CCI.23.00130
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10793975/pdf/

Abstract

Purpose: A critical task in oncology is extracting information related to cancer metastasis from electronic health records. Metastasis-related information is crucial for planning treatment, evaluating patient prognoses, and cancer research. However, the unstructured way in which findings of distant metastasis are often written in radiology reports makes it difficult to extract information automatically. The main aim of this study was to extract distant metastasis findings from free-text imaging and nuclear medicine reports to classify the patient status according to the presence or absence of distant metastasis.

Materials and methods: We created a corpus annotated for distant metastasis from positron emission tomography-computed tomography (PET-CT) and computed tomography (CT) reports of patients with prostate, colorectal, and breast cancers. Entities were labeled M1 or M0 according to whether they described metastasis affirmatively or negatively. We identified entities with a named entity recognition model based on a bidirectional long short-term memory (BiLSTM) network and conditional random fields (CRF). The detected mentions were then used to classify whole reports as M1 or M0.
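
The step from tagged mentions to a report-level label can be sketched as follows. This is a minimal illustration, not the authors' released code: the BIO tag scheme, the function names `extract_mentions` and `classify_report`, and the aggregation rule (a report is M1 if any M1 mention is found, otherwise M0) are all assumptions, since the abstract does not state how mentions are combined.

```python
from typing import List, Tuple

Mention = Tuple[str, str]  # (label, mention text)


def extract_mentions(tokens: List[str], tags: List[str]) -> List[Mention]:
    """Group token-level BIO tags (assumed scheme) into labeled mentions."""
    mentions: List[Mention] = []
    label = None
    span: List[str] = []

    def close() -> None:
        # Store the mention being built, if any, and reset the state.
        nonlocal label, span
        if label is not None:
            mentions.append((label, " ".join(span)))
        label, span = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            close()
            label, span = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            span.append(token)
        else:
            # "O" tag or an inconsistent "I-" tag ends the current mention.
            close()
    close()
    return mentions


def classify_report(mentions: List[Mention]) -> str:
    """Assumed rule: any M1 mention makes the whole report M1, else M0."""
    return "M1" if any(label == "M1" for label, _ in mentions) else "M0"


# Hypothetical tagger output for one sentence of a report.
tokens = ["Sin", "evidencia", "de", "metástasis", "óseas", "."]
tags = ["B-M0", "I-M0", "I-M0", "I-M0", "I-M0", "O"]
mentions = extract_mentions(tokens, tags)
print(mentions, classify_report(mentions))
```

Under this assumed rule, a report with no detected mention defaults to M0; a real pipeline might instead treat such reports as undetermined.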

Results: The model detected distant metastasis mentions with a weighted average F1 score of 0.84. Whole reports were classified with an F1 score of 0.92 for M0 documents and 0.90 for M1 documents.
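
For readers unfamiliar with the metric, a weighted average F1 score is usually computed by taking the F1 score of each class and averaging them with weights proportional to each class's number of true instances (its support). The sketch below assumes that standard definition; the counts are invented for illustration and are not data from the study.

```python
from typing import Dict, Tuple


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Average per-class F1, weighting each class by its support (tp + fn)."""
    total_support = sum(tp + fn for tp, _, fn in counts.values())
    if total_support == 0:
        return 0.0
    return sum(
        f1_score(tp, fp, fn) * (tp + fn) for tp, fp, fn in counts.values()
    ) / total_support


# Illustrative (tp, fp, fn) counts per label; NOT results from the paper.
example_counts = {"M1": (8, 2, 2), "M0": (9, 1, 1)}
print(round(weighted_f1(example_counts), 2))  # approximately 0.85
```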

Conclusion: These results show that the model is useful for detecting distant metastasis findings in three types of cancer and for the subsequent classification of reports. The contribution of this study is the generation of structured distant metastasis information from free-text imaging reports in Spanish. In addition, the manually annotated corpus, annotation guidelines, and code are freely released to the research community.

Source journal metrics: CiteScore 6.20 · self-citation rate 4.80% · articles published 190