A Markov chain-based feature extraction method for classification and identification of cancerous DNA sequences.

IF 2.2, Zone 4 (Engineering and Technology), Q3, PHARMACOLOGY & PHARMACY
Bioimpacts Pub Date : 2021-01-01 Epub Date: 2020-03-24 DOI:10.34172/bi.2021.16
Amin Khodaei, Mohammad-Reza Feizi-Derakhshi, Behzad Mozaffari-Tazehkand
Bioimpacts, vol. 11(2), pp. 87-99. Publication type: Journal Article. PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/8c/7a/bi-11-87.PMC8022238.pdf Link: https://doi.org/10.34172/bi.2021.16
Citations: 3


A Markov chain-based feature extraction method for classification and identification of cancerous DNA sequences.


Introduction: In recent decades, the rising incidence of cancer has become a major concern for most societies. Because cancer has genetic origins, its internal genetic structure must be examined to study the disease. Methods: In this research, cancer data are analyzed on the basis of DNA sequences. The transition probabilities between pairs of nucleotides in DNA sequences have the Markov property. This property motivates reducing the feature dimension of DNA sequences to overcome the high computational overhead of gene analysis. The resulting mapping decreases the number of feature dimensions while preserving the basic properties needed to discriminate cancerous from non-cancerous genes. Results: A non-linear support vector machine (SVM) classifier with RBF and polynomial kernel functions discriminated the selected cancerous samples from the non-cancerous ones. Experimental results based on 10-fold cross-validation and accuracy metrics verified that the proposed method has low computational overhead and high accuracy. Conclusion: The proposed algorithm was successfully tested on case studies from related research. Overall, the combination of the proposed Markov-based feature reduction and a non-linear SVM classifier can be considered one of the best methods for discriminating cancerous from non-cancerous genes.
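
The idea of mapping a DNA sequence to a small set of transition probabilities can be illustrated with a minimal sketch. The abstract does not give the exact mapping, so the code below is an assumption, not the authors' method: it uses a first-order Markov chain over the four nucleotides A, C, G, T and returns the 16 estimated transition probabilities as a fixed-length feature vector. The function name `transition_features` is hypothetical.

```python
from collections import Counter
from itertools import product

NUCLEOTIDES = "ACGT"


def transition_features(sequence):
    """Return 16 first-order transition probabilities for a DNA sequence.

    Entries follow the order of product("ACGT", repeat=2):
    P(A|A), P(C|A), P(G|A), P(T|A), P(A|C), ..., P(T|T).
    This layout is an illustrative assumption, not taken from the paper.
    """
    seq = sequence.upper()
    # Count adjacent pairs, skipping pairs that contain a non-ACGT symbol (e.g. N).
    counts = Counter(
        (a, b)
        for a, b in zip(seq, seq[1:])
        if a in NUCLEOTIDES and b in NUCLEOTIDES
    )
    features = []
    for a, b in product(NUCLEOTIDES, repeat=2):
        # Total number of observed transitions leaving nucleotide a.
        row_total = sum(counts[(a, x)] for x in NUCLEOTIDES)
        # Rows with no observed transitions are left as zeros.
        features.append(counts[(a, b)] / row_total if row_total else 0.0)
    return features


features = transition_features("ACGTACGGTA")
```

Whatever the sequence length, the output has 16 entries, which is the sense in which such a mapping reduces feature dimension. A vector of this kind could then be passed to any classifier, for example an SVM with an RBF or polynomial kernel evaluated by 10-fold cross-validation as the abstract describes.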

Source journal: Bioimpacts (Pharmacology, Toxicology and Pharmaceutics-Pharmaceutical Science)
CiteScore: 4.80
Self-citation rate: 7.70%
Articles published: 36
Review time: 5 weeks
About the journal: BioImpacts (BI) is a peer-reviewed, multidisciplinary international journal covering original research articles, reviews, commentaries, hypotheses, methodologies, and visions/reflections dealing with all aspects of biological and biomedical research at the molecular, cellular, functional, and translational levels.