M. Yazdanparast, Razieh Sheikhpour, M. Z. Soroush, F. Ghanizadeh
Iranian Journal of Pediatric Hematology & Oncology. Published 2024-04-04. DOI: 10.18502/ijpho.v14i2.15269 (https://doi.org/10.18502/ijpho.v14i2.15269)
In Silico Identification of Effective Genes for Acute Leukemia Classification Using a Spline Regression-based Framework
Background: Microarray technology makes it possible to examine the expression of thousands of genes and can be highly effective in identifying various types of cancer, including leukemia. However, many genes in microarray data are redundant and carry no useful information for cancer diagnosis. The main objective of this study is to identify relevant and effective genes for classifying leukemia microarray data using a spline regression-based method that takes the correlation between genes into account.
Materials and Methods: In this analytical study, leukemia microarray data are used to identify genes relevant to classifying leukemia as Acute Myeloid Leukemia (AML) or Acute Lymphoblastic Leukemia (ALL). Gene selection uses a spline regression-based method called SRS3FS, which is based on the ℓ2,p-norm (0 < p ≤ 1). A support vector machine (SVM) algorithm is then employed to classify the leukemia data into AML and ALL.
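As a rough illustration of the ℓ2,p-norm mentioned in the methods, the sketch below computes the standard definition (Σ_i ‖w_i‖_2^p)^(1/p) for a weight matrix with one row per gene, and ranks genes by the ℓ2 norm of their rows, a common convention in sparse feature selection. The abstract does not describe how SRS3FS builds its weight matrix or ranks genes, so the matrix `W`, its values, and the helper names here are hypothetical and are not the authors' implementation.

```python
import math


def row_l2_norms(W):
    """Euclidean (l2) norm of each row of W; one row per gene."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]


def l2p_norm(W, p):
    """l2,p-norm of W: (sum_i ||w_i||_2 ** p) ** (1 / p), for 0 < p <= 1.

    For p < 1 this is a quasi-norm rather than a true norm.
    """
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    return sum(n ** p for n in row_l2_norms(W)) ** (1.0 / p)


def top_k_genes(W, k):
    """Indices of the k rows with the largest l2 norm, largest first."""
    norms = row_l2_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)[:k]


# Hypothetical weight matrix: 4 genes x 2 classes (illustrative values only).
W = [[3.0, 4.0], [0.0, 0.0], [0.6, 0.8], [0.0, 2.0]]

print(l2p_norm(W, 1.0))   # l2,1-norm: the sum of the row norms
print(l2p_norm(W, 0.5))   # l2,1/2 quasi-norm
print(top_k_genes(W, 2))  # the two rows with the largest norm
```

Rows whose norm is driven to zero contribute nothing to the sum, which is why penalties of this form are used to discard whole genes at once; smaller values of p penalize many small nonzero rows more heavily relative to a few large ones.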
Results: SVM classification with 5, 10, 15, and 20 selected genes shows that the SRS3FS method with the ℓ2,1/4-norm, ℓ2,1/2-norm, and ℓ2,3/4-norm achieved the highest accuracy, 97.06%, when 10 genes were selected to distinguish AML from ALL. Moreover, the leukemia data were classified into AML and ALL with an accuracy of 100% using a gene identified by the SRS3FS method based on the ℓ2,3/4-norm and ℓ2,1-norm. Gene number 3252, annotated as GLUTATHIONE S-TRANSFERASE, MICROSOMAL, was identified as the most important gene.
Conclusion: The experimental results on leukemia microarray data demonstrate that the spline regression-based gene selection method can effectively identify genes relevant to the classification and prediction of leukemia.