Diagnosis of leukemia using microarray analysis based on Hidden Markov Model and Random Convolutional Kernel Transform
Authors: Sareh Baqeri Matak, Elham Askari, Sara Motamed
Journal: Computational Biology and Chemistry, volume 120, Article 108676
Publication date: 2025-09-08
DOI: 10.1016/j.compbiolchem.2025.108676
URL: https://www.sciencedirect.com/science/article/pii/S1476927125003378
Citations: 0
Abstract
Introduction
Leukemia is one of the most prevalent cancers worldwide, and early detection is critical for effective treatment. Microarray data is a key tool in this process, but the vast number of genes involved makes the analysis complex and time-consuming. Identifying the relevant genes is therefore a crucial step in disease diagnosis.
Material and methods
This study aims to improve the diagnostic accuracy for various leukemia types by combining microarray data with advanced deep learning techniques. The proposed model begins by selecting the essential features and sequences relevant to diagnosis. These data sequences are processed using a Generative Adversarial Network (GAN) with a U-Net architecture to generate synthetic data. Both the synthetic and original data are then labeled for analysis. Feature ranking is conducted using a Hidden Markov Model (HMM), followed by classification using the Random Convolutional Kernel Transform (ROCKET) approach. The pipeline ultimately assigns each sample to one of five leukemia categories.
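The classification step named above can be illustrated with a minimal sketch of the general ROCKET idea: convolve each input sequence with many random kernels, keep two summary features per kernel (the maximum and the proportion of positive values), and fit a linear ridge classifier on those features. This is not the authors' implementation. The function names, the kernel count, the NumPy-only ridge solver, and the five-class toy sine-wave data are all assumptions made for illustration, and the toy data stands in for ranked microarray feature sequences that are not available here.

```python
import numpy as np

def generate_kernels(n_kernels, series_length, rng):
    """Draw random 1-D kernels (weights, bias, dilation, padding)."""
    if series_length < 11:
        raise ValueError("this sketch assumes series of length >= 11")
    kernels = []
    for _ in range(n_kernels):
        length = int(rng.choice([7, 9, 11]))
        weights = rng.normal(0.0, 1.0, size=length)
        weights -= weights.mean()  # mean-centre the weights
        bias = rng.uniform(-1.0, 1.0)
        # Dilation is drawn on an exponential scale so the kernel still fits.
        max_exp = np.log2((series_length - 1) / (length - 1))
        dilation = int(2 ** rng.uniform(0.0, max_exp))
        # Zero padding is switched on at random for roughly half the kernels.
        padding = ((length - 1) * dilation) // 2 if rng.random() < 0.5 else 0
        kernels.append((weights, bias, dilation, padding))
    return kernels

def apply_kernel(x, weights, bias, dilation, padding):
    """Return (max, proportion of positive values) for one kernel and series."""
    if padding > 0:
        x = np.pad(x, padding)  # zero padding on both ends
    span = (len(weights) - 1) * dilation
    out_len = len(x) - span
    # Row i indexes the dilated window x[i], x[i + d], ..., x[i + span].
    idx = np.arange(out_len)[:, None] + dilation * np.arange(len(weights))[None, :]
    out = x[idx] @ weights + bias
    return float(out.max()), float((out > 0).mean())

def rocket_transform(X, kernels):
    """Map each series (row of X) to 2 * len(kernels) features."""
    feats = np.empty((X.shape[0], 2 * len(kernels)))
    for i, x in enumerate(X):
        for k, (w, b, d, p) in enumerate(kernels):
            feats[i, 2 * k], feats[i, 2 * k + 1] = apply_kernel(x, w, b, d, p)
    return feats

def fit_ridge(F, y, n_classes, alpha=1.0):
    """Ridge regression on +1/-1 class targets (a simple linear classifier)."""
    mu, sd = F.mean(axis=0), F.std(axis=0) + 1e-8
    Z = np.hstack([(F - mu) / sd, np.ones((len(F), 1))])
    Y = -np.ones((len(y), n_classes))
    Y[np.arange(len(y)), y] = 1.0
    W = np.linalg.solve(Z.T @ Z + alpha * np.eye(Z.shape[1]), Z.T @ Y)
    return mu, sd, W

def predict_ridge(F, model):
    """Predict the class with the largest linear score."""
    mu, sd, W = model
    Z = np.hstack([(F - mu) / sd, np.ones((len(F), 1))])
    return (Z @ W).argmax(axis=1)

# Toy data only: five classes of noisy sine waves, NOT microarray data.
rng = np.random.default_rng(0)
n_classes, per_class, length = 5, 20, 64
t = np.arange(length)
X = np.vstack([
    np.sin(2 * np.pi * (c + 1) * t / length)
    + 0.1 * rng.normal(size=(per_class, length))
    for c in range(n_classes)
])
y = np.repeat(np.arange(n_classes), per_class)

kernels = generate_kernels(100, length, rng)
F = rocket_transform(X, kernels)
model = fit_ridge(F, y, n_classes)
train_acc = float((predict_ridge(F, model) == y).mean())
print(f"training accuracy on toy data: {train_acc:.2f}")
```

The printed figure is a training accuracy on synthetic data and says nothing about the accuracy reported in the paper; a real evaluation would need held-out samples and the actual expression data.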
Results
The results demonstrate that the proposed model achieves a high classification accuracy of 99.26%, outperforming existing methods.
Conclusion
This research highlights the importance of leveraging DNA alterations associated with genetic mutations to improve leukemia diagnostics, emphasizing the potential for early detection and intervention. In simpler terms, identifying DNA modifications across the genome can help predict an individual's likelihood of developing leukemia. Detecting these changes can significantly aid in diagnosis.
About the journal:
Computational Biology and Chemistry publishes original research papers and review articles in all areas of computational life sciences. High quality research contributions with a major computational component in the areas of nucleic acid and protein sequence research, molecular evolution, molecular genetics (functional genomics and proteomics), theory and practice of either biology-specific or chemical-biology-specific modeling, and structural biology of nucleic acids and proteins are particularly welcome. Exceptionally high quality research work in bioinformatics, systems biology, ecology, computational pharmacology, metabolism, biomedical engineering, epidemiology, and statistical genetics will also be considered.
Given their inherent uncertainty, protein modeling and molecular docking studies should be thoroughly validated. In the absence of experimental results for validation, molecular dynamics simulations along with detailed free energy calculations, for example, should be used as complementary techniques to support the major conclusions. Submissions of premature modeling exercises without additional biological insights will not be considered.
Review articles will generally be commissioned by the editors and should not be submitted to the journal without explicit invitation. However, prospective authors are welcome to send a brief (one to three pages) synopsis, which will be evaluated by the editors.