Subrata Bhattacharjee, Yeong-Byn Hwang, Rashadul Islam Sumon, H. Rahman, Dong-Woo Hyeon, Damin Moon, Kouayep Sonia Carole, Hee-Cheol Kim, Heung-Kook Choi
Cluster Analysis: Unsupervised Classification for Identifying Benign and Malignant Tumors on Whole Slide Image of Prostate Cancer
2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS), published 2022-12-05
DOI: 10.1109/IPAS55744.2022.10052952 (https://doi.org/10.1109/IPAS55744.2022.10052952)
Citation count: 0
Abstract
Cluster analysis is widely used in many fields, including psychology, biology, statistics, pattern recognition, information retrieval, machine learning, and data mining. Diagnosing histopathological images of prostate cancer is a routine task for pathologists, yet analyzing the formation of glands and tumors according to the Gleason grading system remains challenging. An unsupervised computer-aided diagnosis (CAD) technique could therefore greatly ease pathologists' workloads. In this study, unsupervised classification was performed to differentiate malignant (cancerous) from benign (non-cancerous) tumors. The technique groups objects (i.e., individuals, entities, patterns, or cases) into meaningful clusters and identifies useful patterns. Radiomic features were extracted for cluster analysis using the gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), and gray-level size zone matrix (GLSZM) techniques. Five clustering techniques were used for the unsupervised classification: K-means, K-medoids, Agglomerative Hierarchical (AH), Gaussian mixture model (GMM), and Spectral clustering. The quality of the clustering algorithms was assessed using the Purity, Silhouette, Adjusted Rand, Fowlkes-Mallows, and Calinski-Harabasz (CH) scores. Finally, the best-performing algorithm (K-means) was applied to predict and annotate the cancerous regions in the whole slide image (WSI) for comparison with the pathologist's annotation.
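
The comparison of clustering algorithms described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses synthetic blobs as a stand-in for the GLCM/GLRLM/GLSZM feature vectors, and fixes the number of clusters at two (benign vs. malignant). The helper names `purity_score` and `evaluate_clusterers` are hypothetical. K-medoids is omitted because it is not part of core scikit-learn, and the feature extraction step is not shown.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans, SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    adjusted_rand_score,
    calinski_harabasz_score,
    fowlkes_mallows_score,
    silhouette_score,
)
from sklearn.metrics.cluster import contingency_matrix
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def purity_score(y_true, y_pred):
    """Fraction of samples that belong to the majority class of their cluster."""
    table = contingency_matrix(y_true, y_pred)  # rows: classes, columns: clusters
    return table.max(axis=0).sum() / table.sum()


def evaluate_clusterers(X, y_true, n_clusters=2, seed=0):
    """Fit several clustering models and score each one (hypothetical helper)."""
    X = StandardScaler().fit_transform(X)  # put all features on a common scale
    models = {
        "K-means": KMeans(n_clusters=n_clusters, n_init=10, random_state=seed),
        "Agglomerative": AgglomerativeClustering(n_clusters=n_clusters),
        "GMM": GaussianMixture(n_components=n_clusters, random_state=seed),
        "Spectral": SpectralClustering(n_clusters=n_clusters, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        labels = model.fit_predict(X)
        results[name] = {
            # External scores: compare cluster labels with reference labels.
            "purity": purity_score(y_true, labels),
            "adjusted_rand": adjusted_rand_score(y_true, labels),
            "fowlkes_mallows": fowlkes_mallows_score(y_true, labels),
            # Internal scores: use only the features and the cluster labels.
            "silhouette": silhouette_score(X, labels),
            "calinski_harabasz": calinski_harabasz_score(X, labels),
        }
    return results


if __name__ == "__main__":
    # Synthetic stand-in for texture feature vectors (assumption, not study data).
    X, y = make_blobs(n_samples=200, n_features=6, centers=2,
                      cluster_std=1.5, random_state=0)
    for name, scores in evaluate_clusterers(X, y).items():
        print(name, {k: round(float(v), 3) for k, v in scores.items()})
```

Purity, Adjusted Rand, and Fowlkes-Mallows need reference labels (for example, pathologist annotations), whereas Silhouette and Calinski-Harabasz are computed from the features and cluster assignments alone, so the two groups answer different questions about clustering quality.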