A Comparative Review and Analysis of Computational Predictors for Identification of Enhancer and their Strength

IF 2.4, Zone 3 (Biology), JCR Q3: BIOCHEMICAL RESEARCH METHODS
Mehwish Gill, Muhammad Kabir, Saeed Ahmed, Muhammad Asif Subhani, Maqsood Hayat
Current Bioinformatics. Published 2024-06-04. DOI: 10.2174/0115748936285942240513064919 (https://doi.org/10.2174/0115748936285942240513064919)

Abstract

Enhancers are short functional regions (50–1500 bp) of the genome that play an important role in activating gene transcription in the presence of transcription factors (TFs). Many human diseases, such as cancer and inflammatory bowel disease, are associated with genetic variation in enhancers. Precise recognition of enhancers provides useful insights into the pathogenesis of human diseases and their treatment. High-throughput experiments are considered essential tools for characterizing enhancers; however, these methods are laborious, costly and time-consuming. Computational methods are considered an alternative for accurate and rapid identification of enhancers. Over the past years, numerous computational predictors have been devised for predicting enhancers and their strength. A comprehensive review and thorough assessment are indispensable for systematically comparing the performance of sequence-based bioinformatics tools for enhancer prediction. Given the increasing interest in this domain, we conducted a large-scale analysis and assessment of state-of-the-art enhancer predictors to evaluate their scalability and generalization power. Additionally, we classified the existing approaches into three main groups: conventional machine learning, ensemble, and deep learning-based approaches. Furthermore, the study explores the factors that are crucial for developing precise and reliable predictors, such as the design of trusted benchmark and independent datasets, feature representation schemes, feature selection methods, classification strategies, evaluation metrics and web servers. Finally, the insights from this review are expected to provide important guidelines to the research community and pharmaceutical companies in general, and to the development of high-throughput tools for the detection and characterization of enhancers in particular.
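To make two of the factors named above more concrete (feature representation and evaluation metrics), the following is a minimal sketch only. The abstract does not say which encodings or metrics the reviewed predictors use; k-mer composition and the Sn/Sp/Acc/MCC set are assumed here as common choices for sequence-based classifiers, and the function names `kmer_frequencies` and `binary_metrics` are hypothetical.

```python
from itertools import product
from math import sqrt


def kmer_frequencies(seq, k=2):
    """Normalized k-mer frequencies over ACGT, in lexicographic k-mer order.

    Assumption: k-mer composition is used purely as an illustrative
    sequence encoding; it is not taken from the reviewed paper.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing N or other symbols
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy and MCC (1 = enhancer, 0 = non-enhancer).

    Assumption: this metric set is illustrative, not quoted from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sn = tp / (tp + fn) if tp + fn else 0.0
    sp = tn / (tn + fp) if tn + fp else 0.0
    acc = (tp + tn) / len(pairs) if pairs else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


if __name__ == "__main__":
    # Toy inputs only; these are not data from the paper.
    print(kmer_frequencies("ACGTACGT", k=1))
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

A feature vector of this kind could be passed to any of the three predictor families the review distinguishes. MCC is included because it stays informative when the enhancer and non-enhancer classes are imbalanced.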
Source journal
Current Bioinformatics (Biology: Biochemical Research Methods)
CiteScore: 6.60
Self-citation rate: 2.50%
Number of publications: 77
Review time: >12 weeks
Journal description: Current Bioinformatics aims to publish all the latest and outstanding developments in bioinformatics. Each issue contains a series of timely, in-depth/mini-reviews, research papers and guest edited thematic issues written by leaders in the field, covering a wide range of the integration of biology with computer and information science. The journal focuses on advances in computational molecular/structural biology, encompassing areas such as computing in biomedicine and genomics, computational proteomics and systems biology, and metabolic pathway engineering. Developments in these fields have direct implications for key issues related to health care, medicine, genetic disorders, development of agricultural products, renewable energy, environmental protection, etc.