A hybrid learning method for distinguishing lung adenocarcinoma and squamous cell carcinoma

IF 1.7 | Zone 4 (Computer Science) | JCR Q3 (Computer Science, Information Systems)
Anil Kumar Swain, A. Swetapadma, Jitendra Kumar Rout, Bunil Kumar Balabantaray
Data Technologies and Applications. Journal article, published 2023-05-19. DOI: 10.1108/dta-10-2022-0384 (https://doi.org/10.1108/dta-10-2022-0384)
Citations: 0

Abstract

Purpose: The objective of the proposed work is to identify the most commonly occurring non–small cell carcinoma types, such as adenocarcinoma and squamous cell carcinoma, within the human population. A second objective is to reduce the false positive rate during classification.

Design/methodology/approach: A hybrid method using convolutional neural networks (CNNs), extreme gradient boosting (XGBoost) and long short-term memory networks (LSTMs) is proposed to distinguish between lung adenocarcinoma and squamous cell carcinoma. A CNN with three convolution layers and three max-pooling layers extracts features from non–small cell lung carcinoma images. The XGBoost algorithm then selects a few important features from the extracted features as the optimal feature set. Finally, an LSTM classifies the carcinoma type. The accuracy of the proposed method is 99.57 per cent, and the false positive rate is 0.427 per cent.

Findings: The proposed CNN–XGBoost–LSTM hybrid method has significantly improved the results in distinguishing between adenocarcinoma and squamous cell carcinoma. The importance of the method can be outlined as follows: it has a very low false positive rate of 0.427 per cent; it has very high accuracy, i.e. 99.57 per cent; CNN-based features provide accurate results in classifying lung carcinoma; and it has the potential to serve as an aid for doctors.

Practical implications: It can be used by doctors as a secondary tool for the analysis of non–small cell lung cancers.

Social implications: It can help rural doctors refer patients to specialized doctors for further analysis of lung cancer.

Originality/value: A hybrid method using CNN, XGBoost and LSTM is proposed to distinguish between lung adenocarcinoma and squamous cell carcinoma. A CNN with three convolution layers and three max-pooling layers extracts features from the non–small cell lung carcinoma images, XGBoost selects a few important features from them as the optimal feature set, and an LSTM performs the final classification of carcinoma types.
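
The abstract reports an accuracy and a false positive rate but does not state how they were computed. As a minimal sketch, assuming the standard binary definitions (accuracy = (TP + TN) / total, false positive rate = FP / (FP + TN)), the two figures could be derived from predictions as shown below. The label encoding (0 = adenocarcinoma, 1 = squamous cell carcinoma), the choice of positive class, and all function names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (the positive class is an assumption)."""
    tp: int  # positive samples predicted as positive
    fp: int  # negative samples predicted as positive
    tn: int  # negative samples predicted as negative
    fn: int  # positive samples predicted as negative


def count_outcomes(y_true, y_pred, positive=1):
    """Tally TP/FP/TN/FN for two equal-length label sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def accuracy(c):
    """Fraction of all samples that were classified correctly."""
    total = c.tp + c.fp + c.tn + c.fn
    if total == 0:
        raise ValueError("no samples")
    return (c.tp + c.tn) / total


def false_positive_rate(c):
    """Fraction of truly negative samples that were predicted positive."""
    negatives = c.fp + c.tn
    if negatives == 0:
        raise ValueError("no negative samples")
    return c.fp / negatives


# Toy labels for illustration only; these are not data from the paper.
# Hypothetical encoding: 0 = adenocarcinoma, 1 = squamous cell carcinoma.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

counts = count_outcomes(y_true, y_pred, positive=1)
print(f"accuracy = {accuracy(counts):.2%}")
print(f"false positive rate = {false_positive_rate(counts):.2%}")
```

With two classes, swapping which class is treated as positive changes the false positive rate but not the accuracy, so the positive class should be fixed before comparing figures across methods.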
Source journal: Data Technologies and Applications
Category: Social Sciences, Library and Information Sciences
CiteScore: 3.80
Self-citation rate: 6.20%
Articles published: 29
About the journal: Previously published as Program. Online from 2018. Subject area: Information & Knowledge Management, Library Studies.