Examining feature extraction and classification modules in machine learning for diagnosis of low-dose computed tomographic screening-detected in vivo lesions.
Daniel D Liang, David D Liang, Marc J Pomeroy, Yongfeng Gao, Licheng R Kuo, Lihong C Li
{"title":"Examining feature extraction and classification modules in machine learning for diagnosis of low-dose computed tomographic screening-detected <i>in vivo</i> lesions.","authors":"Daniel D Liang, David D Liang, Marc J Pomeroy, Yongfeng Gao, Licheng R Kuo, Lihong C Li","doi":"10.1117/1.JMI.11.4.044501","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>Medical imaging-based machine learning (ML) for computer-aided diagnosis of <i>in vivo</i> lesions consists of two basic components or modules of (i) feature extraction from non-invasively acquired medical images and (ii) feature classification for prediction of malignancy of lesions detected or localized in the medical images. This study investigates their individual performances for diagnosis of low-dose computed tomography (CT) screening-detected lesions of pulmonary nodules and colorectal polyps.</p><p><strong>Approach: </strong>Three feature extraction methods were investigated. One uses the mathematical descriptor of gray-level co-occurrence image texture measure to extract the Haralick image texture features (HFs). One uses the convolutional neural network (CNN) architecture to extract deep learning (DL) image abstractive features (DFs). The third one uses the interactions between lesion tissues and X-ray energy of CT to extract tissue-energy specific characteristic features (TFs). All the above three categories of extracted features were classified by the random forest (RF) classifier with comparison to the DL-CNN method, which reads the images, extracts the DFs, and classifies the DFs in an end-to-end manner. The ML diagnosis of lesions or prediction of lesion malignancy was measured by the area under the receiver operating characteristic curve (AUC). Three lesion image datasets were used. The lesions' tissue pathological reports were used as the learning labels.</p><p><strong>Results: </strong>Experiments on the three datasets produced AUC values of 0.724 to 0.878 for the HFs, 0.652 to 0.965 for the DFs, and 0.985 to 0.996 for the TFs, compared to the DL-CNN of 0.694 to 0.964. These experimental outcomes indicate that the RF classifier performed comparably to the DL-CNN classification module and the extraction of tissue-energy specific characteristic features dramatically improved AUC value.</p><p><strong>Conclusions: </strong>The feature extraction module is more important than the feature classification module. Extraction of tissue-energy specific characteristic features is more important than extraction of image abstractive and characteristic features.</p>","PeriodicalId":1,"journal":{"name":"Accounts of Chemical Research","volume":null,"pages":null},"PeriodicalIF":16.4000,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11234229/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Accounts of Chemical Research","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1117/1.JMI.11.4.044501","RegionNum":1,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/7/9 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"CHEMISTRY, MULTIDISCIPLINARY","Score":null,"Total":0}
引用次数: 0
Abstract
Purpose: Medical imaging-based machine learning (ML) for computer-aided diagnosis of in vivo lesions consists of two basic components or modules of (i) feature extraction from non-invasively acquired medical images and (ii) feature classification for prediction of malignancy of lesions detected or localized in the medical images. This study investigates their individual performances for diagnosis of low-dose computed tomography (CT) screening-detected lesions of pulmonary nodules and colorectal polyps.
Approach: Three feature extraction methods were investigated. One uses the mathematical descriptor of gray-level co-occurrence image texture measure to extract the Haralick image texture features (HFs). One uses the convolutional neural network (CNN) architecture to extract deep learning (DL) image abstractive features (DFs). The third one uses the interactions between lesion tissues and X-ray energy of CT to extract tissue-energy specific characteristic features (TFs). All the above three categories of extracted features were classified by the random forest (RF) classifier with comparison to the DL-CNN method, which reads the images, extracts the DFs, and classifies the DFs in an end-to-end manner. The ML diagnosis of lesions or prediction of lesion malignancy was measured by the area under the receiver operating characteristic curve (AUC). Three lesion image datasets were used. The lesions' tissue pathological reports were used as the learning labels.
Results: Experiments on the three datasets produced AUC values of 0.724 to 0.878 for the HFs, 0.652 to 0.965 for the DFs, and 0.985 to 0.996 for the TFs, compared to the DL-CNN of 0.694 to 0.964. These experimental outcomes indicate that the RF classifier performed comparably to the DL-CNN classification module and the extraction of tissue-energy specific characteristic features dramatically improved AUC value.
Conclusions: The feature extraction module is more important than the feature classification module. Extraction of tissue-energy specific characteristic features is more important than extraction of image abstractive and characteristic features.
期刊介绍:
Accounts of Chemical Research presents short, concise and critical articles offering easy-to-read overviews of basic research and applications in all areas of chemistry and biochemistry. These short reviews focus on research from the author’s own laboratory and are designed to teach the reader about a research project. In addition, Accounts of Chemical Research publishes commentaries that give an informed opinion on a current research problem. Special Issues online are devoted to a single topic of unusual activity and significance.
Accounts of Chemical Research replaces the traditional article abstract with an article "Conspectus." These entries synopsize the research affording the reader a closer look at the content and significance of an article. Through this provision of a more detailed description of the article contents, the Conspectus enhances the article's discoverability by search engines and the exposure for the research.