Improving malignancy prediction through feature selection informed by nodule size ranges in NLST.

Dmitry Cherezov, Samuel Hawkins, Dmitry Goldgof, Lawrence Hall, Yoganand Balagurunathan, Robert J Gillies, Matthew B Schabath
Conference proceedings. IEEE International Conference on Systems, Man, and Cybernetics, vol. 2016, pp. 001939-1944. Published 2016-10-01 (Epub 2017/2/9). DOI: 10.1109/SMC.2016.7844523. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6251413/pdf/nihms-994650.pdf

Abstract

Computed tomography (CT) is widely used during diagnosis and treatment of Non-Small Cell Lung Cancer (NSCLC). Current computer-aided diagnosis (CAD) models, designed to classify nodules as malignant or benign, make their decisions using image features chosen by feature selectors. In this paper, we investigate automated selection of different image features informed by different nodule size ranges to increase the overall accuracy of the classification. The NLST dataset is one of the largest available datasets on CT screening for NSCLC. We used 261 cases as a training dataset and 237 cases as a test dataset. The nodule size, which may indicate biological variability, can vary substantially. For example, the training set contains nodules with diameters ranging from a couple of millimeters up to a couple of dozen millimeters. The premise is that benign and malignant nodules have different radiomic quantitative descriptors related to size. After splitting the training and testing datasets into three subsets based on the longest nodule diameter (LD) parameter, accuracy improved from 74.68% to 81.01% and the AUC improved from 0.69 to 0.79. We show that if AUC is the main factor in choosing parameters, then accuracy improved from 72.57% to 77.5% and AUC improved from 0.78 to 0.82. Additionally, we show the impact of an oversampling technique for the minority cancer class: in some particular cases, the AUC improved from 0.82 to 0.87.
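
The size-stratified idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The LD cut points (8 mm and 16 mm), the feature names `texture` and `shape`, the class-separation score used for ranking, and the toy cases are all assumptions made for illustration. The abstract does not state which size ranges, feature selectors, classifiers, or oversampling method were actually used, so plain random duplication stands in for the oversampling step.

```python
import random
from statistics import mean, pstdev

# Hypothetical LD cut points in millimeters; the abstract does not give the real ranges.
LD_CUTS_MM = (8.0, 16.0)

def size_bin(ld_mm, cuts=LD_CUTS_MM):
    """Return 0, 1, or 2 for a small, medium, or large nodule."""
    return sum(ld_mm >= c for c in cuts)

def split_by_size(cases, cuts=LD_CUTS_MM):
    """Group cases (dicts with 'ld', 'features', 'label') into three size subsets."""
    subsets = {0: [], 1: [], 2: []}
    for case in cases:
        subsets[size_bin(case["ld"], cuts)].append(case)
    return subsets

def rank_features(cases, top_k=1):
    """Pick the top_k features by a simple class-separation score.

    The score (absolute difference of class means over the pooled standard
    deviation) is a stand-in for whatever feature selector is really used.
    """
    scores = {}
    for name in cases[0]["features"]:
        pos = [c["features"][name] for c in cases if c["label"] == 1]
        neg = [c["features"][name] for c in cases if c["label"] == 0]
        if not pos or not neg:
            scores[name] = 0.0  # cannot score a feature without both classes
            continue
        spread = pstdev(pos + neg) or 1.0  # avoid division by zero
        scores[name] = abs(mean(pos) - mean(neg)) / spread
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

def oversample_minority(cases, seed=0):
    """Randomly duplicate minority-class cases until the classes are balanced."""
    rng = random.Random(seed)
    pos = [c for c in cases if c["label"] == 1]
    neg = [c for c in cases if c["label"] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        return list(cases)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(cases) + extra

# Toy cases (label 1 = malignant, 0 = benign); values are invented for the example.
toy_cases = [
    {"ld": 4.0, "features": {"texture": 0.1, "shape": 5.0}, "label": 0},
    {"ld": 5.0, "features": {"texture": 0.2, "shape": 5.1}, "label": 0},
    {"ld": 6.0, "features": {"texture": 0.9, "shape": 5.0}, "label": 1},
    {"ld": 12.0, "features": {"texture": 0.5, "shape": 2.0}, "label": 0},
    {"ld": 20.0, "features": {"texture": 0.5, "shape": 9.0}, "label": 1},
]

subsets = split_by_size(toy_cases)
# Select features separately within each size range that contains any cases.
selected = {b: rank_features(s) for b, s in subsets.items() if s}
balanced_small = oversample_minority(subsets[0])
```

In this sketch, each size subset gets its own selected feature list, so a classifier trained per subset could rely on different descriptors for small and large nodules, which is the premise the abstract describes. Oversampling should be applied to training subsets only, never to test data.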
