Reproducibility in Radiomics: A Comparison of Feature Extraction Methods and Two Independent Datasets.

Impact factor 2.5; Zone 4 (multidisciplinary journals); JCR Q2 (CHEMISTRY, MULTIDISCIPLINARY)
Hannah Mary T Thomas, Helen Y C Wang, Amal Joseph Varghese, Ellen M Donovan, Chris P South, Helen Saxby, Andrew Nisbet, Vineet Prakash, Balu Krishna Sasidharan, Simon Pradeep Pavamani, Devakumar Devadhas, Manu Mathew, Rajesh Gunasingam Isiah, Philip M Evans
Journal: Applied Sciences-Basel. Publication date: 2024-02-20. DOI: 10.3390/app13127291 (https://doi.org/10.3390/app13127291). Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7615943/pdf/
Citation count: 0

Abstract

Radiomics involves extracting information from medical images that is not visible to the human eye. There is evidence that the resulting features can be used for treatment stratification and outcome prediction. However, the reproducibility of results between studies remains much debated. This paper studies the reproducibility of CT texture features used in radiomics by comparing two feature extraction implementations, the MATLAB toolkit and Pyradiomics, applied to two independent datasets of patient CT scans: (i) the open-access RIDER dataset, containing repeat CT scans taken 15 min apart (RIDER Scan 1 and Scan 2, respectively) for 31 patients treated for lung cancer; and (ii) the open-access HN1 dataset, containing 137 patients treated for head and neck cancer. The gross tumor volume (GTV), manually outlined by an experienced observer and available in both datasets, was used. The 43 radiomics features common to MATLAB and Pyradiomics were calculated using two intensity-level quantization methods, with and without an intensity threshold. For every combination of quantization parameters, cases were ranked by each feature and Spearman's rank correlation coefficient, rs, was calculated. A feature was considered reproducible when it correlated highly in the RIDER dataset and also in the HN1 dataset, and vice versa. A total of 29 of the 43 reported stable features were highly reproducible between the MATLAB and Pyradiomics implementations, with consistently high correlation in rank ordering for RIDER Scan 1 and RIDER Scan 2 (rs > 0.8). Of the 43 reported features, 18 were common to the RIDER and HN1 datasets, suggesting they may be agnostic to disease site. Useful radiomics features should be selected based on reproducibility. This study identified a set of features that meet this requirement and validated the methodology for evaluating reproducibility between datasets.
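The rank-correlation criterion described in the abstract can be illustrated with a short sketch. This is not the authors' code: the feature names (`feature_a`, `feature_b`), the toy values, and the helper functions are hypothetical, and the sketch assumes that "highly correlated" means rs > 0.8 between the two implementations across cases, with a feature counted as reproducible when that holds in both datasets.

```python
from typing import Dict, List, Sequence, Set


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rs(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rs, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rs is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


def highly_correlated(
    impl_a: Dict[str, Sequence[float]],
    impl_b: Dict[str, Sequence[float]],
    threshold: float = 0.8,
) -> Set[str]:
    """Features whose per-case values rank alike in both implementations."""
    shared = impl_a.keys() & impl_b.keys()
    return {f for f in shared if spearman_rs(impl_a[f], impl_b[f]) > threshold}


# Hypothetical toy values: one number per case, per feature, per implementation.
rider_matlab = {"feature_a": [1.0, 2.0, 3.0, 4.0, 5.0],
                "feature_b": [1.0, 2.0, 3.0, 4.0, 5.0]}
rider_pyrad = {"feature_a": [10.0, 20.0, 30.0, 40.0, 50.0],
               "feature_b": [5.0, 1.0, 4.0, 2.0, 3.0]}
hn1_matlab = {"feature_a": [3.0, 1.0, 2.0, 4.0],
              "feature_b": [1.0, 2.0, 3.0, 4.0]}
hn1_pyrad = {"feature_a": [30.0, 12.0, 25.0, 41.0],
             "feature_b": [2.0, 1.0, 4.0, 3.0]}

# Assumed reading of the criterion: highly correlated in BOTH datasets.
reproducible = (highly_correlated(rider_matlab, rider_pyrad)
                & highly_correlated(hn1_matlab, hn1_pyrad))
print(sorted(reproducible))
```

Because rs depends only on rank order, a feature whose absolute values differ between implementations (for example through a constant scale factor) can still score rs = 1, which is why rank correlation suits a comparison between toolkits.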

Source journal
Applied Sciences-Basel (categories: CHEMISTRY, MULTIDISCIPLINARY; MATERIALS SCIENCE, MULTIDISCIPLINARY)
CiteScore: 5.30
Self-citation rate: 11.10%
Articles published: 10882
Journal description: Applied Sciences (ISSN 2076-3417) provides an advanced forum on all aspects of applied natural sciences. It publishes reviews, research papers and communications. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. There is no restriction on the length of the papers. The full experimental details must be provided so that the results can be reproduced. Electronic files and software regarding the full details of the calculation or experimental procedure, if unable to be published in a normal way, can be deposited as supplementary electronic material.