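Identification and segregation of genes with improved recurrent neural network trained with optimal gene level and mutation level features.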

IF 1.7, Zone 4 (Medicine), JCR Q3, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Irfan Rashid Pukhta, Ranjeet Kumar Rout
Journal: Computer Methods in Biomechanics and Biomedical Engineering, pp. 1111-1126
DOI: 10.1080/10255842.2024.2311322 (https://doi.org/10.1080/10255842.2024.2311322)
Publication date: 2025-06-01 (Epub 2024-02-29)
Publication type: Journal Article
Open access: no
Citations: 0

Identification and segregation of genes with improved recurrent neural network trained with optimal gene level and mutation level features.

Although many approaches have been employed to address the complex mutational heterogeneity of cancer, finding driver genes remains difficult because other genomic factors cannot be fully integrated for combined analyses. This paper presents a novel gene identification and segregation model with five key processes: (a) pre-processing, (b) treatment of class imbalance, (c) feature extraction, (d) feature selection, and (e) gene classification. To improve data quality, the gathered raw data are first pre-processed using data cleaning and data normalization, which turns them into a usable form. In practice, the sample is skewed against drivers because driver mutation markers appear in proportionally fewer instances than passenger ones. To address this class imbalance problem, improved K-Means + SMOTE is applied to the pre-processed data. The most important features, including gene-level and mutation-level features, are then extracted from the balanced dataset. To reduce computation time, the best of the extracted features are selected using Forensic interpretation tailored hunger food search optimization (FIHFSO). The selected features are used to train the deep learning classifier that performs the segregation: an Improved Recurrent Neural Network (I-RNN) makes the final decision about each gene. At a learning percentage of 90%, the proposed method achieves an accuracy of 0.98, compared with 0.83, 0.81, 0.65, 0.80, 0.92 and 0.63 for HGS, FBIO, AOA, AO, GOA and PRO, respectively.
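
The pre-processing and class-balancing steps described in the abstract can be illustrated with a minimal sketch. It assumes min-max scaling for the normalization step and uses plain SMOTE (interpolation between a minority sample and one of its nearest minority neighbours); it is not the authors' improved K-Means + SMOTE variant, whose details the abstract does not give. The function names, the toy feature values, and the parameters `k` and `seed` are all hypothetical.

```python
import random

def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - lo for c, lo in zip(cols, lows)]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (plain SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical toy data: two features per gene, few drivers, many passengers.
drivers = [[9.0, 0.8], [8.0, 0.9], [10.0, 0.7], [9.5, 0.85]]
passengers = [[float(i % 5), 0.1 * (i % 4)] for i in range(20)]

# Normalize both classes together so they share one feature scale.
scaled = min_max_normalize(drivers + passengers)
scaled_drivers = scaled[:len(drivers)]
scaled_passengers = scaled[len(drivers):]

# Oversample the driver class until both classes have the same size.
n_missing = len(scaled_passengers) - len(scaled_drivers)
balanced_drivers = scaled_drivers + smote(scaled_drivers, n_missing)
print(len(balanced_drivers), len(scaled_passengers))
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the original drivers occupy. A clustering step such as K-Means before oversampling would restrict interpolation to points within the same cluster, which is presumably the motivation for combining the two, though the abstract does not spell out how.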

Source journal
CiteScore: 4.10
Self-citation rate: 6.20%
Articles published: 179
Review time: 4-8 weeks
About the journal: The primary aims of Computer Methods in Biomechanics and Biomedical Engineering are to provide a means of communicating the advances being made in the areas of biomechanics and biomedical engineering and to stimulate interest in the continually emerging computer-based technologies which are being applied in these multidisciplinary subjects. Computer Methods in Biomechanics and Biomedical Engineering will also provide a focus for the importance of integrating the disciplines of engineering with medical technology and clinical expertise. Such integration will have a major impact on health care in the future.