Evaluation of lung cancer risk prediction models to select lung cancer screening participants in China: a real-world analysis in regional healthcare big data, Yinzhou, China

IF 7.6 · CAS Zone 1 (Medicine) · JCR Q1 (Health Care Sciences & Services)

The Lancet Regional Health: Western Pacific · Pub Date: 2025-02-01 · DOI: 10.1016/j.lanwpc.2024.101354

Ziqing Ye, Yongyue Wei

Volume 55, Article 101354. Full text: https://www.sciencedirect.com/science/article/pii/S2666606524003481


Abstract

Background

Many lung cancer prediction models have been developed worldwide, but few have been validated in Chinese populations. This study evaluates the feasibility and efficacy of 17 global lung cancer risk prediction models when applied to Chinese healthcare big data.

Methods

The study included individuals with information recorded in the Yinzhou regional health care database from January 1, 2010 to December 31, 2021. Seventeen lung cancer risk prediction models (Bach, Spitz, Hoggart, PLCO_m2012, Korean Men, PLCO_all2014, Pittsburgh Predictor, LLPi, LCRAT, HUNT, JPHC, Reduced HUNT, LLPv3, LCRS, OWL, UCL-I, Shanghai-LCM) were evaluated for their performance in the overall population and in subgroups. Discrimination was assessed using Harrell's C-index and the time-dependent area under the curve (AUC). Calibration was evaluated using the expected-to-observed ratio (EOR) and calibration curves. The models were then recalibrated in the Yinzhou population, and the calibration of the recalibrated models was evaluated.
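The two headline metrics named above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the toy data, and the simplifications (pairs with tied follow-up times are skipped, and the observed count ignores censoring before the prediction horizon) are all assumptions made for illustration.

```python
from itertools import combinations


def harrell_c_index(times, events, risks):
    """Harrell's C: share of comparable pairs in which the subject who
    developed the outcome earlier was assigned the higher predicted risk.
    Ties in predicted risk count as half-concordant."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: tied follow-up times are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored, pair is not comparable
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0
        elif risks[first] == risks[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def expected_to_observed_ratio(risks, events):
    """EOR: sum of predicted risks divided by the observed event count.
    Values above 1 indicate overestimation, below 1 underestimation."""
    observed = sum(events)
    if observed == 0:
        raise ValueError("no observed events")
    return sum(risks) / observed


# Hypothetical toy cohort: follow-up years, event flag (1 = lung cancer
# diagnosed, 0 = censored), and model-predicted risk over the horizon.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_risks = [0.5, 0.5, 0.25, 0.125, 0.125]

toy_c = harrell_c_index(toy_times, toy_events, toy_risks)
toy_eor = expected_to_observed_ratio(toy_risks, toy_events)
print(f"C-index = {toy_c:.3f}, EOR = {toy_eor:.3f}")
```

In this toy cohort the model ranks subjects well (a C-index near 1) yet predicts only half as many cases as occurred (an EOR of 0.5), which shows why discrimination and calibration are reported separately.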

Findings

A total of 907,200 participants were included in the analysis, comprising 69,263 smokers and 837,937 non-smokers. Of the 17 models initially considered, only 6 (Bach, Hoggart, Pittsburgh Predictor, JPHC, Reduced HUNT, UCL-I) had complete predictor data available in the Yinzhou regional health care database. Models that predicted risk over a ten-year period (Bach, JPHC, LCRS, and Shanghai-LCM) achieved C-indices and AUCs of 0.75 or greater among ever smokers. Most models overestimated incidence risk among ever smokers and underestimated it among never smokers. The JPHC and LCRS models had the best calibration curves and the best EORs, whereas the other prediction models had suboptimal calibration. After recalibration, calibration improved for all models, and the JPHC and LCRS models remained the best calibrated.

Interpretation

Only six models can be directly applied to the Yinzhou regional health care database. The JPHC model, developed for the Japanese population, and the LCRS model, developed from the China Kadoorie Biobank (CKB), performed better in the Chinese population than the other models.

Funding

This work was supported by the National Natural Science Foundation of China (82473728 to Y.W.) and the Medical and Health Science and Technology Project of Zhejiang Province, China.

Source journal

The Lancet Regional Health: Western Pacific (Medicine: Pediatrics, Perinatology and Child Health)

CiteScore: 8.80
Self-citation rate: 2.80%
Articles published: 305
Review time: 11 weeks

About the journal: The Lancet Regional Health – Western Pacific, a gold open access journal, is an integral part of The Lancet's global initiative advocating for healthcare quality and access worldwide. It aims to advance clinical practice and health policy in the Western Pacific region, contributing to enhanced health outcomes. The journal publishes high-quality original research shedding light on clinical practice and health policy in the region. It also includes reviews, commentaries, and opinion pieces covering diverse regional health topics, such as infectious diseases, non-communicable diseases, child and adolescent health, maternal and reproductive health, aging health, mental health, the health workforce and systems, and health policy.