Generative augmentation to improve lung nodules detection in resource-limited settings

JCR quartile: Q3 (Mathematics)
N. Gusarova, A. Lobantsev, A. Vatian, Anton Klochrov, M. Kabyshev, A. Shalyto, A. Tatarinova, T. Treshkur, Min Li
DOI: 10.31799/1684-8853-2020-6-60-69 (https://doi.org/10.31799/1684-8853-2020-6-60-69)
Journal: Informatsionno-Upravliaiushchie Sistemy
Published: 2020-12-15
Publication type: Journal Article
Open access: no
Citations: 0

Abstract

Introduction: Lung cancer is one of the most formidable cancers. Neural network technologies are promising for its diagnosis, but datasets collected in real clinical practice cannot cover the full variety of lung cancer manifestations. Purpose: To assess whether generative augmentation of available datasets can improve the classification of pulmonary nodules under resource constraints. Methods: We used part of the LIDC-IDRI dataset, the StyleGAN architecture to generate artificial lung nodules, and the VGG11 model as a classifier. We generated pulmonary nodules with the proposed pipeline and invited four experts to evaluate them visually. We formed four experimental datasets with different types of augmentation, including the use of synthesized data, and compared the classification performance of the VGG11 network trained on each dataset. Results: Ten generated nodules from each group of characteristics were presented for assessment. In all cases the expert assessments were positive, with a Fleiss' kappa coefficient κ = 0.6–0.9. The proposed generative augmentation approach gave the best values, ROC AUC = 0.9604 and PR AUC = 0.9625. Discussion: The obtained efficiency metrics are superior to baseline results obtained with comparably small training datasets and slightly below the best results achieved with much more powerful computational resources. We have thus shown that a combination of StyleGAN and VGG11, which requires neither large computing resources nor a large initial training dataset, can be used effectively to augment an unbalanced dataset.
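
The Results quote two kinds of figures: inter-rater agreement (Fleiss' kappa) and classifier quality (ROC AUC). As a rough illustration of how such numbers are computed, here is a minimal pure-Python sketch. It is not the authors' code. The function names, the table layout (one row per rated item, one column per category) and the toy ratings are assumptions made for this example and are not data from the paper. PR AUC is not covered here.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa; counts[i][j] = number of raters who put item i in category j."""
    n_items = len(counts)
    if n_items == 0:
        raise ValueError("need at least one rated item")
    n_categories = len(counts[0])
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per item")
    if any(len(row) != n_categories or sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same categories and rater count")

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Chance agreement from the overall proportion of each category.
    total = n_items * n_raters
    p_e = sum(
        (sum(row[j] for row in counts) / total) ** 2 for j in range(n_categories)
    )
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_e) / (1.0 - p_e)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ratings: 5 generated nodules, 4 raters, columns = (realistic, not realistic).
toy_counts = [[4, 0], [3, 1], [4, 0], [0, 4], [1, 3]]
toy_kappa = fleiss_kappa(toy_counts)

# Hypothetical classifier scores for 4 nodules (label 1 = positive class).
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise ROC AUC is quadratic in the number of samples, which is fine for a small check like this one. For larger evaluation sets a rank-based or library implementation would be the usual choice.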
Source journal
Informatsionno-Upravliaiushchie Sistemy (Mathematics: Control and Optimization)
CiteScore: 1.40
Self-citation rate: 0.00%
Articles published: 35