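A series of natural language processing for predicting tumor response evaluation and survival curve from electronic health records.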

Impact factor 3.3 | Zone 3 (Medicine) | JCR Q2 (Medical Informatics)
Toshiki Takeuchi, Hidehito Horinouchi, Ken Takasawa, Masami Mukai, Ken Masuda, Yuki Shinno, Yusuke Okuma, Tatsuya Yoshida, Yasushi Goto, Noboru Yamamoto, Yuichiro Ohe, Mototaka Miyake, Hirokazu Watanabe, Masahiko Kusumoto, Takashi Aoki, Kunihiro Nishimura, Ryuji Hamamoto
BMC Medical Informatics and Decision Making, 25(1):85. Journal Article. Published 2025-02-17.
DOI: 10.1186/s12911-025-02928-6 (https://doi.org/10.1186/s12911-025-02928-6)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11834625/pdf/
Citations: 0

Abstract

A series of natural language processing for predicting tumor response evaluation and survival curve from electronic health records.

Background: Clinical information held in unstructured electronic health records (EHRs) has the potential to advance cancer research. The National Cancer Center Hospital (NCCH) is widely recognized as a leading institution for the treatment of thoracic malignancies in Japan. Information on medical treatment, particularly tumor characteristics, tumor response evaluation, and adverse events, has been compiled from EHRs into the databases of each NCCH department. However, there have been few opportunities for integrated analysis of data across the hospital and the research institute.

Methods: We developed a natural language processing method for predicting tumor response evaluations and survival curves for drug therapy from the EHRs of lung cancer patients. First, we developed a rule-based algorithm that predicts treatment duration using a dictionary of anticancer drugs and regimens used for lung cancer treatment. We then applied supervised learning to the radiology reports from each treatment period and constructed a classification model that predicts the tumor response evaluation for the anticancer drugs and the date on which progressive disease (PD) was determined. The predicted response and PD date can be used to draw a progression-free survival curve.
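
The rule-based first step can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give their rules, so the tiny drug dictionary, the 60-day gap threshold, and the function names `find_regimen` and `treatment_periods` are all assumptions made for illustration. The idea shown is only that dated notes are matched against a drug dictionary and consecutive mentions of the same regimen are merged into one treatment period.

```python
from datetime import date, timedelta

# Hypothetical mini-dictionary (not the authors'): drug mention -> regimen label.
REGIMEN_DICTIONARY = {
    "pembrolizumab": "pembrolizumab",
    "carboplatin": "carboplatin+pemetrexed",
    "pemetrexed": "carboplatin+pemetrexed",
    "docetaxel": "docetaxel",
}

# Assumed rule: a gap longer than this between mentions starts a new period.
MAX_GAP = timedelta(days=60)


def find_regimen(note_text):
    """Return the regimen of the first dictionary drug found in the note, else None."""
    lowered = note_text.lower()
    for drug, regimen in REGIMEN_DICTIONARY.items():
        if drug in lowered:
            return regimen
    return None


def treatment_periods(notes):
    """Group one patient's (date, text) notes into (regimen, start, end) periods."""
    periods = []
    for day, text in sorted(notes):
        regimen = find_regimen(text)
        if regimen is None:
            continue  # note mentions no known drug
        if periods:
            last_regimen, start, end = periods[-1]
            if last_regimen == regimen and day - end <= MAX_GAP:
                # Same regimen, close in time: extend the current period.
                periods[-1] = (regimen, start, day)
                continue
        periods.append((regimen, day, day))
    return periods


EXAMPLE_NOTES = [
    (date(2020, 1, 10), "Started carboplatin and pemetrexed, cycle 1."),
    (date(2020, 2, 1), "Pemetrexed cycle 2 given."),
    (date(2020, 3, 1), "No chemotherapy today; CT ordered."),
    (date(2020, 5, 15), "Switched to docetaxel."),
]

if __name__ == "__main__":
    for regimen, start, end in treatment_periods(EXAMPLE_NOTES):
        print(regimen, start, end)
```

Radiology reports dated within each returned period would then be the input to the supervised classifier described above; that step is not sketched here.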

Results: We used the EHRs of 716 lung cancer treatments at the NCCH, with structured data on those cases serving as labels for training and testing the supervised learning models. The structured data were manually curated by physicians and clinical research coordinators (CRCs). We examined the results and performance of the proposed method. The accuracy of individual predictions of tumor response evaluation and PD date was not particularly high. However, the final predicted survival curves were close to the actual survival curves.
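
To make the survival-curve step concrete, the sketch below computes a Kaplan-Meier estimate from per-treatment durations and event flags. The abstract does not say which estimator the authors used, so Kaplan-Meier is an assumption here, and the function names `kaplan_meier` and `survival_at` and the toy data are hypothetical. Running the same function on predicted and on curated PD dates would give the two curves to compare.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with at least one event.

    durations: time from treatment start to PD or to last follow-up (e.g. days).
    events: True if PD was observed at that time, False if the case was censored.
    """
    observations = sorted(zip(durations, events))
    at_risk = len(observations)
    survival = 1.0
    curve = []
    i = 0
    while i < len(observations):
        t = observations[i][0]
        progressions = 0
        removed = 0
        # Collect every observation that shares this time point.
        while i < len(observations) and observations[i][0] == t:
            progressions += int(observations[i][1])
            removed += 1
            i += 1
        if progressions:
            survival *= 1 - progressions / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored cases both leave the risk set
    return curve


def survival_at(curve, t):
    """Read the step function: survival probability at time t."""
    probability = 1.0
    for time, value in curve:
        if time > t:
            break
        probability = value
    return probability


if __name__ == "__main__":
    # Toy data, not from the paper.
    days = [30, 60, 60, 90, 120]
    progressed = [True, True, False, True, False]
    for time, probability in kaplan_meier(days, progressed):
        print(time, round(probability, 3))
```

Because the curve aggregates over all treatments, errors in individual PD dates can partly offset one another, which is one plausible reading of why the predicted curves stayed close to the actual ones despite modest per-case accuracy.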

Conclusions: Although it is difficult to build a fully automated system with our method, we believe that it performs well enough to support physicians and CRCs in constructing the database and to provide clinical information that helps researchers identify opportunities for clinical studies.

Source journal
CiteScore: 7.20
Self-citation rate: 5.70%
Articles published: 297
Review time: 1 month
About the journal: BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.