Prediction of blood cancer using leukemia gene expression data and sparsity-based gene selection methods

Iranian Journal of Pediatric Hematology & Oncology, Pub Date: 2023-01-03, DOI: 10.18502/ijpho.v13i1.11629

S. Mehrabani, Morteza Zangeneh Soroush, Negin Kheiri, R. Sheikhpour, M. Bahrami

Citations: 0

Abstract

Background: DNA microarray technology simultaneously measures the expression of thousands of genes and can be used to detect cancer types and cancer biomarkers. This study aimed to predict blood cancer using leukemia gene expression data and a robust ℓ2,p-norm sparsity-based gene selection method.

Materials and Methods: In this descriptive study, microarray gene expression data from 72 patients with acute myeloid leukemia (AML) or acute lymphoblastic leukemia (ALL) were used. To remove redundant genes and identify the genes most important for predicting AML and ALL, a robust ℓ2,p-norm (0 < p ≤ 1) sparsity-based gene selection method was applied, with the parameter p set to 1/4, 1/2, 3/4, and 1. The selected genes were then passed to random forest (RF) and support vector machine (SVM) classifiers to predict AML and ALL.

Results: The RF and SVM classifiers correctly classified all AML and ALL samples. The RF classifier reached a performance of 100% using 10 genes selected by the ℓ2,1/2-norm and ℓ2,1-norm sparsity-based gene selection methods, and the SVM classifier reached a performance of 100% using 10 genes selected by the ℓ2,1/2-norm method. Seven genes were identified under all four values of p in the ℓ2,p-norm method as the most important genes for classifying AML and ALL, and the gene described as “PRTN3 Proteinase 3 (serine proteinase, neutrophil, Wegener granulomatosis autoantigen)” was identified as the most important gene.

Conclusion: The results indicate that blood cancer can be predicted from leukemia microarray gene expression data using the robust ℓ2,p-norm sparsity-based gene selection method together with classification algorithms. Examining the expression levels of the genes identified in this study may be useful for predicting leukemia.
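The abstract does not spell out how the ℓ2,p-norm leads to a gene ranking, so the following is only a minimal sketch under stated assumptions. It uses the standard definition of the ℓ2,p-norm of a weight matrix W with one row per gene, (Σ_i ‖w_i‖_2^p)^(1/p), which is a quasi-norm when p < 1, and it assumes the common convention that genes are ranked by the Euclidean norm of their row, so that rows driven to zero correspond to discarded genes. The matrix values and the function names `row_norms`, `l2p_norm`, and `top_genes` are hypothetical and do not come from the paper; the optimization that would actually learn W is not shown.

```python
import math


def row_norms(W):
    """Euclidean (l2) norm of each row; one row per gene."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]


def l2p_norm(W, p):
    """l2,p-norm for 0 < p <= 1: (sum_i ||w_i||_2 ** p) ** (1 / p)."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    return sum(n ** p for n in row_norms(W)) ** (1.0 / p)


def top_genes(W, k):
    """Indices of the k rows with the largest l2 norm, largest first."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)[:k]


# Hypothetical learned weights: 4 genes (rows) x 2 classes (columns).
# These numbers are illustrative only, not results from the study.
W = [
    [3.0, 4.0],  # row norm 5
    [0.0, 0.0],  # row norm 0: this gene would be discarded
    [0.6, 0.8],  # row norm about 1
    [0.0, 2.0],  # row norm 2
]

# The four values of p reported in the abstract.
for p in (0.25, 0.5, 0.75, 1.0):
    print(f"p = {p}: l2,p-norm = {l2p_norm(W, p):.4f}")

print("top 2 genes by row norm:", top_genes(W, 2))
```

In a full pipeline, the expression values of the selected genes would then be passed to the RF and SVM classifiers. Smaller values of p penalize small nonzero rows more strongly relative to large ones, which is why p < 1 is generally associated with sparser selections than p = 1.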
