Examining feature extraction and classification modules in machine learning for diagnosis of low-dose computed tomographic screening-detected in vivo lesions.

Journal of Medical Imaging Pub Date: 2024-07-01 Epub Date: 2024-07-09 DOI: 10.1117/1.JMI.11.4.044501
Daniel D Liang, David D Liang, Marc J Pomeroy, Yongfeng Gao, Licheng R Kuo, Lihong C Li

Abstract

Purpose: Medical imaging-based machine learning (ML) for computer-aided diagnosis of in vivo lesions consists of two basic modules: (i) feature extraction from non-invasively acquired medical images and (ii) feature classification to predict the malignancy of lesions detected or localized in those images. This study investigates the individual performance of each module in diagnosing lesions detected by low-dose computed tomography (CT) screening, namely pulmonary nodules and colorectal polyps.

Approach: Three feature extraction methods were investigated. The first uses the mathematical descriptor of the gray-level co-occurrence image texture measure to extract Haralick image texture features (HFs). The second uses a convolutional neural network (CNN) architecture to extract deep learning (DL) image abstractive features (DFs). The third uses the interactions between lesion tissues and the X-ray energy of CT to extract tissue-energy specific characteristic features (TFs). All three categories of extracted features were classified by a random forest (RF) classifier and compared with the DL-CNN method, which reads the images, extracts the DFs, and classifies them in an end-to-end manner. Diagnostic performance, that is, the prediction of lesion malignancy, was measured by the area under the receiver operating characteristic curve (AUC). Three lesion image datasets were used, with the lesions' tissue pathology reports serving as the learning labels.
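To make the first extraction method more concrete, the following is a minimal sketch of how a gray-level co-occurrence matrix and a few Haralick-style texture measures can be computed. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not give the gray-level quantization, the pixel offsets, the feature subset, or whether the computation is done in 2D or 3D. The function names `glcm` and `haralick_subset`, the 2D patch, and the single horizontal offset are all hypothetical choices made for this example.

```python
from math import fsum


def glcm(image, levels, offset=(0, 1), symmetric=True):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    image  : 2D list of integer gray levels in range(levels)
    offset : (row step, column step) between the two pixels of a pair
    """
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    # Count the pair in both directions.
                    counts[j][i] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[n / total for n in row] for row in counts]


def haralick_subset(p):
    """Three Haralick-style measures computed from a normalized GLCM p."""
    levels = len(p)
    cells = [(i, j, p[i][j]) for i in range(levels) for j in range(levels)]
    return {
        "contrast": fsum((i - j) ** 2 * v for i, j, v in cells),
        "angular_second_moment": fsum(v * v for _, _, v in cells),
        "inverse_difference_moment": fsum(
            v / (1 + (i - j) ** 2) for i, j, v in cells
        ),
    }


# Hypothetical 4x4 patch with four gray levels (not data from the study).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = haralick_subset(glcm(patch, levels=4))
```

In a setup like the one described, a feature vector of this kind would be computed per lesion and passed to a separately trained classifier, which is what allows the extraction and classification modules to be evaluated independently.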

Results: Experiments on the three datasets produced AUC values of 0.724 to 0.878 for the HFs, 0.652 to 0.965 for the DFs, and 0.985 to 0.996 for the TFs, compared with 0.694 to 0.964 for the end-to-end DL-CNN. These outcomes indicate that the RF classifier performed comparably to the DL-CNN classification module and that extracting tissue-energy specific characteristic features dramatically improved the AUC.
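For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen malignant lesion receives a higher classifier score than a randomly chosen benign one. The sketch below computes it directly from that definition, counting ties as one half. The function name `roc_auc` and the labels and scores are hypothetical and chosen only for illustration; they are not data from the study, and the abstract does not say which software the authors used to compute AUC.

```python
def roc_auc(labels, scores):
    """AUC from its pairwise definition.

    labels : 1 for malignant (positive), 0 for benign (negative)
    scores : classifier outputs, higher meaning more likely malignant
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0   # positive case ranked above negative case
            elif sp == sn:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores, for illustration only.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = roc_auc(example_labels, example_scores)
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch; a rank-based formulation gives the same value more efficiently on large datasets.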

Conclusions: The feature extraction module is more important than the feature classification module. Extraction of tissue-energy specific characteristic features is more important than extraction of image abstractive and characteristic features.
