A deep learning-based model to estimate pulmonary function from chest x-rays: multi-institutional model development and validation study in Japan

IF 24.1 · Region 1, Medicine · JCR Q1, MEDICAL INFORMATICS

Lancet Digital Health Pub Date : 2024-08-01 DOI:10.1016/S2589-7500(24)00113-4

Volume 6, Issue 8, Pages e580-e588. Journal Article. https://www.sciencedirect.com/science/article/pii/S2589750024001134

Citations: 0

Abstract

Background

Chest x-ray is a basic, cost-effective, and widely available imaging method that is used for static assessments of organic diseases and anatomical abnormalities, but its ability to estimate dynamic measurements such as pulmonary function is unknown. We aimed to estimate two major pulmonary functions from chest x-rays.

Methods

In this retrospective model development and validation study, we trained, validated, and externally tested a deep learning-based artificial intelligence (AI) model to estimate forced vital capacity (FVC) and forced expiratory volume in 1 s (FEV₁) from chest x-rays. We included consecutively collected results of spirometry and any associated chest x-rays that had been obtained between July 1, 2003, and Dec 31, 2021, from five institutions in Japan (labelled institutions A–E). Eligible x-rays had been acquired within 14 days of spirometry and were labelled with the FVC and FEV₁. X-rays from three institutions (A–C) were used for training, validation, and internal testing, with the testing dataset being independent of the training and validation datasets, and then x-rays from the two other institutions (D and E) were used for independent external testing. Performance for estimating FVC and FEV₁ was evaluated by calculating the Pearson's correlation coefficient (r), intraclass correlation coefficient (ICC), mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE) compared with the results of spirometry.
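The five agreement metrics named above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `agreement_metrics` and the sample values are hypothetical, and because the abstract does not state which ICC form was used, the sketch assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement). The 99% confidence intervals reported in the study are not computed here.

```python
import numpy as np


def agreement_metrics(measured, estimated):
    """Compare model estimates with spirometry values (both in litres).

    Hypothetical helper; ICC(2,1) is an assumption, since the abstract
    does not specify the ICC form.
    """
    y = np.asarray(measured, dtype=float)
    p = np.asarray(estimated, dtype=float)
    err = p - y

    mse = float(np.mean(err ** 2))          # mean square error, L^2
    rmse = mse ** 0.5                       # root mean square error, L
    mae = float(np.mean(np.abs(err)))       # mean absolute error, L
    r = float(np.corrcoef(y, p)[0, 1])      # Pearson's correlation coefficient

    # Two-way ANOVA decomposition: rows are x-rays, columns are the two
    # "raters" (spirometry and the model).
    data = np.column_stack([y, p])
    n, k = data.shape
    grand = data.mean()
    ss_rows = k * np.sum((data.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((data.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((data - grand) ** 2) - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    icc = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

    return {"r": r, "icc": float(icc), "mse": mse, "rmse": rmse, "mae": mae}


# Made-up FVC values in litres, for illustration only.
example = agreement_metrics([2.0, 3.0, 4.0, 5.0], [2.1, 2.9, 4.2, 4.8])
```

Pearson's r measures only linear association, whereas the ICC form assumed here also penalises systematic offsets between the model and spirometry, which is one reason to report both.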

Findings

We included 141 734 x-ray and spirometry pairs from 81 902 patients from the five institutions. The training, validation, and internal test datasets included 134 307 x-rays from 75 768 patients (37 718 [50%] female, 38 050 [50%] male; mean age 56 years [SD 18]), and the external test datasets included 2137 x-rays from 1861 patients (742 [40%] female, 1119 [60%] male; mean age 65 years [SD 17]) from institution D and 5290 x-rays from 4273 patients (1972 [46%] female, 2301 [54%] male; mean age 63 years [SD 17]) from institution E. External testing for FVC yielded r values of 0·91 (99% CI 0·90–0·92) for institution D and 0·90 (0·89–0·91) for institution E, ICC of 0·91 (99% CI 0·90–0·92) and 0·89 (0·88–0·90), MSE of 0·17 L² (99% CI 0·15–0·19) and 0·17 L² (0·16–0·19), RMSE of 0·41 L (99% CI 0·39–0·43) and 0·41 L (0·39–0·43), and MAE of 0·31 L (99% CI 0·29–0·32) and 0·31 L (0·30–0·32). External testing for FEV₁ yielded r values of 0·91 (99% CI 0·90–0·92) for institution D and 0·91 (0·90–0·91) for institution E, ICC of 0·90 (99% CI 0·89–0·91) and 0·90 (0·90–0·91), MSE of 0·13 L² (99% CI 0·12–0·15) and 0·11 L² (0·10–0·12), RMSE of 0·37 L (99% CI 0·35–0·38) and 0·33 L (0·32–0·35), and MAE of 0·28 L (99% CI 0·27–0·29) and 0·25 L (0·25–0·26).
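Because RMSE is by definition the square root of MSE, the reported pairs can be cross-checked against each other. The sketch below is an illustration only: the helper name `consistent` is hypothetical, and it assumes each published figure was rounded to two decimal places. The numbers are the external-test point estimates quoted above.

```python
import math


def consistent(mse_reported, rmse_reported, digits=2):
    """Return True if a rounded RMSE is compatible with a rounded MSE.

    Hypothetical helper: assumes both values were rounded to `digits`
    decimal places, so each stands for an interval of width 10**-digits.
    """
    half = 0.5 * 10 ** (-digits)
    # Range of sqrt(MSE) over every MSE that rounds to the reported value.
    low = math.sqrt(max(mse_reported - half, 0.0))
    high = math.sqrt(mse_reported + half)
    # Compatible if that range overlaps the RMSE rounding interval.
    return low <= rmse_reported + half and high >= rmse_reported - half


# (measure, institution, MSE in L^2, RMSE in L) from the Findings above.
reported = [
    ("FVC", "D", 0.17, 0.41),
    ("FVC", "E", 0.17, 0.41),
    ("FEV1", "D", 0.13, 0.37),
    ("FEV1", "E", 0.11, 0.33),
]

checks = {(name, site): consistent(mse, rmse) for name, site, mse, rmse in reported}
```

Under the rounding assumption all four pairs are compatible. For example, an MSE anywhere between 0·125 and 0·135 L² gives an RMSE between roughly 0·354 and 0·367 L, which overlaps the values that round to 0·37 L.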

Interpretation

This deep learning model allowed estimation of FVC and FEV₁ from chest x-rays, showing high agreement with spirometry. The model offers an alternative to spirometry for assessing pulmonary function, which is especially useful for patients who are unable to undergo spirometry, and might enhance the customisation of CT imaging protocols based on insights gained from chest x-rays, improving the diagnosis and management of lung diseases. Future studies should investigate the performance of this AI model in combination with clinical information to enable more appropriate and targeted use.

Funding

None.

Source journal

Lancet Digital Health Multiple-

CiteScore

41.20

Self-citation rate

1.60%

Articles published

232

Review time

13 weeks

About the journal: The Lancet Digital Health publishes important, innovative, and practice-changing research on any topic connected with digital technology in clinical medicine, public health, and global health. The journal’s open access content crosses subject boundaries, building bridges between health professionals and researchers. By bringing together the most important advances in this multidisciplinary field, The Lancet Digital Health is the most prominent publishing venue in digital health. We publish a range of content types including Articles, Review, Comment, and Correspondence, contributing to promoting digital technologies in health practice worldwide.