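Automatic Abstraction of Computed Tomography Imaging Indication Using Natural Language Processing for Evaluation of Surveillance Patterns in Long-Term Lung Cancer Survivors.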

Impact factor: 2.8; JCR quartile: Q2 (Oncology)

JCO Clinical Cancer Informatics. Pub Date: 2025-07-01; Epub Date: 2025-07-23; DOI: 10.1200/CCI-24-00279

Aparajita Khan, Eunji Choi, Chloe Su, Anna Graber-Naidich, Solomon Henry, Mina L Satoyoshi, Archana Bhat, Allison W Kurian, Su-Ying Liang, Joel Neal, Michael Gould, Ann Leung, Heather A Wakelee, Leah M Backhus, Curtis Langlotz, Julie Wu, Summer S Han


Citations: 0

Abstract


Automatic Abstraction of Computed Tomography Imaging Indication Using Natural Language Processing for Evaluation of Surveillance Patterns in Long-Term Lung Cancer Survivors.


Purpose: Although computed tomography (CT) is routinely used to monitor patients with lung cancer (LC), real-world evaluations of the impact of CT surveillance on overall survival (OS) have been inconsistent. A major confounder is the absence of recorded imaging indications: patients undergo CT scans for purposes beyond surveillance, such as evaluation of symptoms (eg, cough) that are linked to poor survival. We propose a novel natural language processing model to predict CT imaging indications (surveillance vs. others).

Methods: We used electronic health records of 585 long-term LC survivors (≥5 years) at Stanford, followed for up to 22 years. Their 3,362 post-5-year CT reports (1,672 of them manually annotated) were used for modeling, which integrated structured variables (eg, CT intervals) with key-phrase analysis of radiology reports. As in previous studies, a naïve analysis compared OS between patients with a post-5-year CT for any indication (including symptoms) and those without a post-5-year CT. Using model-predicted indications, we then conducted exploratory analyses comparing OS between patients with a post-5-year surveillance CT and those without.
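The combination of a structured variable with key-phrase analysis can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the phrase list, the 182-day cutoff used to approximate six months, and the function names are all assumptions, since the abstract does not give those details.

```python
import re
from datetime import date

# Hypothetical key phrases; the abstract does not list the phrases actually used.
SURVEILLANCE_PHRASES = ["surveillance", "routine follow-up", "annual follow-up"]

# Hypothetical cutoff approximating six months, expressed in days.
LONG_INTERVAL_DAYS = 182


def count_key_phrases(report_text, phrases=SURVEILLANCE_PHRASES):
    """Count case-insensitive occurrences of the given phrases in a report."""
    text = report_text.lower()
    return sum(len(re.findall(re.escape(p), text)) for p in phrases)


def build_features(report_text, scan_date, previous_scan_date):
    """Combine one structured variable with one text-derived variable."""
    interval_days = (scan_date - previous_scan_date).days
    return {
        # 1 if the previous CT was at least about six months earlier, else 0.
        "long_interval": int(interval_days >= LONG_INTERVAL_DAYS),
        # Number of surveillance-related phrase matches in the report text.
        "surveillance_phrase_count": count_key_phrases(report_text),
    }


# Made-up report text and dates, for illustration only.
features = build_features(
    "Indication: annual follow-up, lung cancer surveillance.",
    scan_date=date(2020, 6, 1),
    previous_scan_date=date(2019, 6, 3),
)
print(features)
```

Feature dictionaries of this kind could then be passed to any binary classifier that separates surveillance scans from other scans; the abstract does not say which classifier the authors used.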

Results: The model showed high discrimination (AUC, 0.86). Key predictors included a longer interval (≥6 months) from the previous CT (odds ratio [OR], 5.50; P < .001) and surveillance-related key phrases (OR, 1.37; P = .03). Propensity-adjusted survival analysis indicated better OS for patients with any post-5-year surveillance CT than for those without (adjusted hazard ratio, 0.60; P = .016). By contrast, no significant survival difference was found (P = .53) between patients with any post-5-year CT and those without.
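To make the reported quantities concrete, the sketch below shows how an odds ratio relates to a log-odds coefficient, how it shifts a probability, and how an AUC is computed from scores and labels. It assumes the odds ratios come from a logistic-type model, which the abstract does not state explicitly. The 0.30 baseline probability and the toy scores and labels are invented for illustration and are not study data.

```python
import math


def odds_ratio_to_coefficient(odds_ratio):
    """In a logistic model, a coefficient is the natural log of its odds ratio."""
    return math.log(odds_ratio)


def shift_probability(baseline_probability, odds_ratio):
    """Multiply the baseline odds by an odds ratio and convert back to a probability."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count one half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Odds ratios as reported in the abstract.
OR_LONG_INTERVAL = 5.50
OR_KEY_PHRASES = 1.37

print(round(odds_ratio_to_coefficient(OR_LONG_INTERVAL), 3))
print(round(odds_ratio_to_coefficient(OR_KEY_PHRASES), 3))

# Hypothetical baseline probability of a scan being surveillance (not from the study).
print(round(shift_probability(0.30, OR_LONG_INTERVAL), 3))

# Toy scores and labels (1 = surveillance, 0 = other), not study data.
toy_scores = [0.9, 0.8, 0.4, 0.35, 0.2]
toy_labels = [1, 1, 0, 1, 0]
print(round(auc(toy_scores, toy_labels), 3))
```

An AUC of 0.86 means that, for a randomly chosen surveillance scan and a randomly chosen non-surveillance scan, the model assigns the higher score to the surveillance scan about 86% of the time.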

Conclusion: Our model abstracted CT indications from real-world data with high discrimination. Exploratory analyses that accounted for indications revealed an imaging-OS association that the naïve analysis had obscured, highlighting the model's potential for future real-world studies.
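The difference between the naïve and the indication-aware comparisons comes down to how the exposure group is defined. The sketch below labels patients under both definitions using made-up records. The patient identifiers, the indication labels, and the function name are hypothetical, and the abstract does not specify how patients with only non-surveillance CT were handled in the exploratory analysis, so the sketch only flags them.

```python
from collections import defaultdict

# Hypothetical records: (patient_id, predicted_indication) for post-5-year CT scans.
ct_records = [
    ("A", "surveillance"),
    ("A", "other"),
    ("B", "other"),
    ("C", "surveillance"),
]

# Hypothetical cohort; patient "D" has no post-5-year CT at all.
cohort = ["A", "B", "C", "D"]


def assign_groups(cohort, ct_records):
    """Label each patient under the naive and the indication-aware definitions."""
    indications = defaultdict(set)
    for patient_id, indication in ct_records:
        indications[patient_id].add(indication)
    groups = {}
    for patient_id in cohort:
        seen = indications.get(patient_id, set())
        groups[patient_id] = {
            # Naive definition: any post-5-year CT, whatever the indication.
            "any_ct": bool(seen),
            # Indication-aware definition: at least one CT predicted as surveillance.
            "surveillance_ct": "surveillance" in seen,
        }
    return groups


groups = assign_groups(cohort, ct_records)
for patient_id, flags in groups.items():
    print(patient_id, flags)
```

In this toy data, patient "B" is exposed under the naïve definition but not under the surveillance definition. Symptom-driven scans of this kind are the ones the abstract describes as linked to poor survival.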


Source journal

JCO Clinical Cancer Informatics (Oncology)

CiteScore: 6.20

Self-citation rate: 4.80%

Articles published: 190