A Prediction Approach for the Functional Effects of Non-Coding Gene Variants

2022 International Arab Conference on Information Technology (ACIT). Pub Date: 2022-11-22. DOI: 10.1109/ACIT57182.2022.9994094

Gözde Yurtdas, Kagan Aslan, S. Özyer, Tansel Özyer, Mehmet Kaya, R. Alhajj

Abstract

This study aims to develop an approach for predicting the functional effects of variants in non-coding genes, which are of great importance in human genetics. Non-coding genes have become a vital field of study because they strongly influence disease. However, far less is known about them than about coding genes, even though they are almost nine times as abundant in the body. Predicting the effects of genes that are so abundant and so poorly understood is therefore important, and this motivates the study. We first conducted an extensive literature review and examined candidate datasets. We then developed a high-accuracy prediction model in the Python programming language. After investigating how important non-coding gene variants are and in which areas they can be used, we selected a functional interaction network, used together with a deep learning model, as the most suitable method. We used STRING (Search Tool for the Retrieval of Interacting Genes/Proteins), a biological database and web resource of known and predicted protein-protein interactions. As a second step, we generated feature vectors: after checking the overlap of non-coding genes, we extracted three types of feature vectors. The protein interaction network, identified in Python, describes the interplay between the biomolecules encoded by genes; it helps in understanding the complexities of cellular functions and can even help predict potential therapeutics. As a last step, we implemented a deep learning model with three fully connected (FC) layers, also known as dense layers, of dimensions 40, 10, and 2, respectively. Experimental results demonstrate that the proposed method achieved high accuracy.
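The final step of the abstract, three fully connected layers of dimensions 40, 10, and 2, can be illustrated with a minimal forward-pass sketch in plain Python. This is not the authors' implementation. The input dimension (`INPUT_DIM`), the ReLU and softmax activations, the random initialization, and every function name below are assumptions made for illustration; the abstract specifies only the three layer widths.

```python
import math
import random

# Layer widths stated in the abstract.
LAYER_SIZES = [40, 10, 2]
# ASSUMPTION: the feature-vector length is not given in the abstract.
INPUT_DIM = 64


def init_layer(n_in, n_out, rng):
    """Create one dense layer: n_out rows of n_in weights plus zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Affine transform of a fully connected layer: W x + b."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    # ASSUMPTION: hidden activation is ReLU (not stated in the abstract).
    return [max(0.0, v) for v in x]


def softmax(x):
    # ASSUMPTION: the 2-unit output is read as class probabilities.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def build_model(input_dim=INPUT_DIM, seed=0):
    """Stack dense layers input_dim -> 40 -> 10 -> 2 with untrained weights."""
    rng = random.Random(seed)
    dims = [input_dim] + LAYER_SIZES
    return [init_layer(a, b, rng) for a, b in zip(dims, dims[1:])]


def predict_proba(model, features):
    """Forward pass: ReLU after the hidden layers, softmax on the last one."""
    h = features
    for layer in model[:-1]:
        h = relu(dense(h, layer))
    return softmax(dense(h, model[-1]))


if __name__ == "__main__":
    model = build_model()
    rng = random.Random(1)
    example = [rng.random() for _ in range(INPUT_DIM)]  # placeholder features
    print(predict_proba(model, example))
```

The weights here are random and untrained, so the output carries no biological meaning; the sketch only shows how a feature vector would flow through layers of the stated sizes to a two-way output.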
