Diagnosis of leukemia using microarray analysis based on Hidden Markov Model and Random Convolutional Kernel Transform

IF 3.1 · Zone 4 (Biology) · JCR Q2 (BIOLOGY)
Sareh Baqeri Matak, Elham Askari, Sara Motamed
Computational Biology and Chemistry, vol. 120, Article 108676. Published 2025-09-08. Journal article.
DOI: 10.1016/j.compbiolchem.2025.108676
https://www.sciencedirect.com/science/article/pii/S1476927125003378
Citations: 0

Abstract

Introduction

Leukemia is one of the most prevalent cancers worldwide, and early detection is critical for effective treatment. Microarray data is a key tool in this process, but the vast number of genes involved makes the analysis complex and time-consuming. Identifying the relevant genes is therefore a crucial step in disease diagnosis.

Material and methods

This study aims to improve the diagnostic accuracy for various leukemia types by combining microarray data with advanced deep learning techniques. The proposed model begins by selecting the essential features and sequences relevant to diagnosis. These data sequences are processed using a Generative Adversarial Network (GAN) with a U-Net architecture to generate synthetic data. Both the synthetic and original data are then labeled for analysis. Feature ranking is conducted using a Hidden Markov Model (HMM), followed by classification using the Random Convolutional Kernel Transform (ROCKET) approach. The pipeline ultimately classifies samples into five leukemia categories.
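The abstract does not give implementation details, so the following is only a minimal sketch of the general ROCKET idea used in the final classification step, not the authors' code. It assumes each sample can be treated as a 1-D sequence of ranked expression values; the function names (`make_kernels`, `rocket_transform`, `fit_ridge`, `predict_ridge`), the kernel count, and the toy sinusoidal data are all hypothetical. Random dilated kernels produce two features each (proportion of positive values and maximum), and a ridge classifier on one-hot targets assigns one of five classes.

```python
import numpy as np

def make_kernels(n_kernels, series_len, rng):
    """Draw random kernels: length, zero-mean weights, bias, dilation, padding."""
    kernels = []
    for _ in range(n_kernels):
        length = int(rng.choice([7, 9, 11]))
        weights = rng.normal(size=length)
        weights -= weights.mean()
        bias = rng.uniform(-1.0, 1.0)
        # Dilation is sampled on a log scale so the kernel still fits the series.
        max_exp = np.log2((series_len - 1) / (length - 1))
        dilation = int(2 ** rng.uniform(0, max_exp))
        padding = ((length - 1) * dilation) // 2 if rng.random() < 0.5 else 0
        kernels.append((weights, bias, dilation, padding))
    return kernels

def apply_kernel(x, weights, bias, dilation, padding):
    """Dilated convolution of one series; returns (PPV, max) features."""
    if padding:
        x = np.pad(x, padding)
    n_out = len(x) - (len(weights) - 1) * dilation
    out = np.full(n_out, bias)
    for j, w in enumerate(weights):
        out += w * x[j * dilation : j * dilation + n_out]
    return (out > 0).mean(), out.max()

def rocket_transform(X, kernels):
    """Map each row of X to 2 * len(kernels) features."""
    feats = np.empty((X.shape[0], 2 * len(kernels)))
    for i, x in enumerate(X):
        for k, kern in enumerate(kernels):
            feats[i, 2 * k], feats[i, 2 * k + 1] = apply_kernel(x, *kern)
    return feats

def fit_ridge(F, y, n_classes, alpha=1.0):
    """Closed-form ridge regression on +/-1 one-hot targets."""
    mu, sd = F.mean(axis=0), F.std(axis=0) + 1e-8
    Z = np.hstack([(F - mu) / sd, np.ones((len(F), 1))])
    Y = -np.ones((len(y), n_classes))
    Y[np.arange(len(y)), y] = 1.0
    W = np.linalg.solve(Z.T @ Z + alpha * np.eye(Z.shape[1]), Z.T @ Y)
    return mu, sd, W

def predict_ridge(F, model):
    """Predict the class with the largest ridge score."""
    mu, sd, W = model
    Z = np.hstack([(F - mu) / sd, np.ones((len(F), 1))])
    return np.argmax(Z @ W, axis=1)

# Toy data only (NOT microarray data): five classes of noisy sinusoids.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 64)
y = np.repeat(np.arange(5), 10)
X = np.sin(np.outer(y + 1, t)) + 0.3 * rng.normal(size=(50, 64))

kernels = make_kernels(100, X.shape[1], rng)
features = rocket_transform(X, kernels)
model = fit_ridge(features, y, n_classes=5)
print("training accuracy:", (predict_ridge(features, model) == y).mean())
```

The printed value is training accuracy on toy data and says nothing about the 99.26% reported in the paper; a real evaluation would need held-out samples and the authors' preprocessing.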

Results

The proposed model achieves a classification accuracy of 99.26%, outperforming existing methods.

Conclusion

This research highlights the importance of leveraging DNA alterations associated with genetic mutations to improve leukemia diagnostics, emphasizing the potential for early detection and intervention. In simpler terms, identifying DNA modifications across the genome can help predict an individual's likelihood of developing leukemia. Detecting these changes can significantly aid in diagnosis.
Source journal
Computational Biology and Chemistry (Biology; Computer Science, Interdisciplinary Applications)
CiteScore: 6.10
Self-citation rate: 3.20%
Articles published: 142
Review time: 24 days
About the journal: Computational Biology and Chemistry publishes original research papers and review articles in all areas of computational life sciences. High quality research contributions with a major computational component in the areas of nucleic acid and protein sequence research, molecular evolution, molecular genetics (functional genomics and proteomics), theory and practice of either biology-specific or chemical-biology-specific modeling, and structural biology of nucleic acids and proteins are particularly welcome. Exceptionally high quality research work in bioinformatics, systems biology, ecology, computational pharmacology, metabolism, biomedical engineering, epidemiology, and statistical genetics will also be considered. Given their inherent uncertainty, protein modeling and molecular docking studies should be thoroughly validated. In the absence of experimental results for validation, molecular dynamics simulations along with detailed free energy calculations, for example, should be used as complementary techniques to support the major conclusions. Submissions of premature modeling exercises without additional biological insights will not be considered. Review articles will generally be commissioned by the editors and should not be submitted to the journal without explicit invitation. However, prospective authors are welcome to send a brief (one to three pages) synopsis, which will be evaluated by the editors.