Bioinformatics: latest literature

A novel registration method for long-serial section images of EM with a serial split technique based on unsupervised optical flow network.
IF 5.8, Zone 3, Biology
Bioinformatics, Pub Date: 2023-08-01, DOI: 10.1093/bioinformatics/btad436
Tong Xin, Yanan Lv, Haoran Chen, Linlin Li, Lijun Shen, Guangcun Shan, Xi Chen, Hua Han
Motivation: The registration of serial section electron microscope images is a critical step in reconstructing biological tissue volumes, and it aims to eliminate complex nonlinear deformations from sectioning and replicate the correct neurite structure. However, due to the inherent properties of biological structures and the challenges posed by section preparation of biological tissues, achieving an accurate registration of serial sections remains a significant challenge. Conventional nonlinear registration techniques, which are effective in eliminating nonlinear deformation, can also eliminate the natural morphological variation of neurites across sections. Additionally, accumulation of registration errors alters the neurite structure.
Results: This article proposes a novel method for serial section registration that utilizes an unsupervised optical flow network to measure feature similarity rather than pixel similarity to eliminate nonlinear deformation and achieve pairwise registration between sections. The optical flow network is then employed to estimate and compensate for cumulative registration error, thereby allowing for the reconstruction of the structure of biological tissues. Based on the novel serial section registration method, a serial split technique is proposed for long-serial sections. Experimental results demonstrate that the state-of-the-art method proposed here effectively improves the spatial continuity of serial sections, leading to more accurate registration and improved reconstruction of the structure of biological tissues.
Availability and implementation: The source code and data are available at https://github.com/TongXin-CASIA/EFSR.
Bioinformatics 39(8). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10403427/pdf/
Citations: 0
otargen: GraphQL-based R package for tidy data accessing and processing from Open Targets Genetics.
IF 5.8, Zone 3, Biology
Bioinformatics, Pub Date: 2023-08-01, DOI: 10.1093/bioinformatics/btad441
Amir Feizi, Kamalika Ray
Motivation: Open Targets Genetics is a comprehensive resource portal that offers variant-centric statistical evidence, enabling the prioritization of causal variants and the identification of potential drug targets. The portal uses GraphQL technology for efficient data query and provides endpoints for programmatic access for R and Python users. However, leveraging GraphQL for data retrieval can be challenging, time-consuming, and repetitive, requiring familiarity with the GraphQL query language and processing outputs in nested JSON (JavaScript Object Notation) format into tidy data tables. Therefore, open-source tools are required to simplify data retrieval and to integrate valuable genetic information seamlessly into data-driven target discovery pipelines.
Results: otargen is an open-source R package designed to make data retrieval and analysis from the Open Targets Genetics portal as simple as possible for R users. The package offers a suite of functions covering all query types, allowing streamlined data access in a tidy table format. By executing only a single line of code, otargen users avoid the repetitive scripting of complex GraphQL queries, including the post-processing steps. In addition, otargen contains convenient plotting functions to visualize and gain insights from complex data tables returned by several key functions.
Availability and implementation: otargen is available at https://amirfeizi.github.io/otargen/.
Bioinformatics 39(8). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10394122/pdf/
Citations: 0
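The otargen abstract describes turning nested JSON returned by a GraphQL endpoint into tidy tables. The sketch below illustrates that post-processing step in Python using only the standard library. It is not otargen's code or API: the function names `flatten` and `tidy_rows`, and the response shape with its `associations`, `variant`, and `pval` fields, are hypothetical and chosen only for illustration.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into one dict whose keys are dotted paths."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def tidy_rows(response, list_path):
    """Follow list_path to a list of nested records and return one flat dict per record."""
    node = response
    for key in list_path:
        node = node[key]
    return [flatten(record) for record in node]


# Hypothetical GraphQL-style response; the field names are assumptions,
# not the real Open Targets Genetics schema.
response = {
    "data": {
        "associations": [
            {"variant": {"id": "1_1000_A_G", "rsId": "rs0001"}, "pval": 1e-8},
            {"variant": {"id": "2_2000_C_T", "rsId": "rs0002"}, "pval": 3e-6},
        ]
    }
}

rows = tidy_rows(response, ["data", "associations"])
for row in rows:
    print(row)
```

Each resulting row has the same flat set of columns (`variant.id`, `variant.rsId`, `pval`), which is the property a tidy table needs before it is handed to a data-frame library.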
pyCaverDock: Python implementation of the popular tool for analysis of ligand transport with advanced caching and batch calculation support.
IF 5.8, Zone 3, Biology
Bioinformatics, Pub Date: 2023-08-01, DOI: 10.1093/bioinformatics/btad443
Ondrej Vavra, Jakub Beranek, Jan Stourac, Martin Surkovsky, Jiri Filipovic, Jiri Damborsky, Jan Martinovic, David Bednar
Summary: Access pathways in enzymes are crucial for the passage of substrates and products of catalysed reactions. The process can be studied by computational means with variable degrees of precision. Our in-house approximative method CaverDock provides a fast and easy way to set up and run ligand binding and unbinding calculations through protein tunnels and channels. Here we introduce pyCaverDock, a Python3 API designed to improve user experience with the tool and further facilitate the ligand transport analyses. The API enables users to simplify the steps needed to use CaverDock, from automatizing setup processes to designing screening pipelines.
Availability and implementation: pyCaverDock API is implemented in Python 3 and is freely available with detailed documentation and practical examples at https://loschmidt.chemi.muni.cz/caverdock/.
Bioinformatics 39(8). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10397418/pdf/
Citations: 0
ICARUS: flexible protein structural alignment based on Protein Units.
IF 5.8, Zone 3, Biology
Bioinformatics, Pub Date: 2023-08-01, DOI: 10.1093/bioinformatics/btad459
Gabriel Cretin, Charlotte Périn, Nicolas Zimmermann, Tatiana Galochkina, Jean-Christophe Gelly
Motivation: Alignment of protein structures is a major problem in structural biology. The first approach commonly used is to consider proteins as rigid bodies. However, alignment of protein structures can be very complex due to conformational variability, or complex evolutionary relationships between proteins such as insertions, circular permutations or repetitions. In such cases, introducing flexibility becomes useful for two reasons: (i) it can help compare two protein chains which adopted two different conformational states, such as due to proteins/ligands interaction or post-translational modifications, and (ii) it aids in the identification of conserved regions in proteins that may have distant evolutionary relationships.
Results: We propose ICARUS, a new approach for flexible structural alignment based on identification of Protein Units, evolutionarily preserved structural descriptors of intermediate size, between secondary structures and domains. ICARUS significantly outperforms reference methods on a dataset of very difficult structural alignments.
Availability and implementation: Code is freely available online at https://github.com/DSIMB/ICARUS.
Bioinformatics 39(8). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10400377/pdf/
Citations: 0
TranSyT, an innovative framework for identifying transport systems.
IF 5.8, Zone 3, Biology
Bioinformatics, Pub Date: 2023-08-01, DOI: 10.1093/bioinformatics/btad466
Emanuel Cunha, Davide Lagoa, José P Faria, Filipe Liu, Christopher S Henry, Oscar Dias
Motivation: The importance and rate of development of genome-scale metabolic models have been growing for the last few years, increasing the demand for software solutions that automate several steps of this process. However, since TRIAGE's release, software development for the automatic integration of transport reactions into models has stalled.
Results: Here, we present the Transport Systems Tracker (TranSyT). Unlike other transport systems annotation software, TranSyT does not rely on manual curation to expand its internal database, which is derived from highly curated records retrieved from the Transporters Classification Database and complemented with information from other data sources. TranSyT compiles information regarding transporter families and proteins, and derives reactions into its internal database, making it available for rapid annotation of complete genomes. All transport reactions have GPR associations and can be exported with identifiers from four different metabolite databases. TranSyT is currently available as a plugin for merlin v4.0 and an app for KBase.
Availability and implementation: TranSyT web service: https://transyt.bio.di.uminho.pt/; GitHub for the tool: https://github.com/BioSystemsUM/transyt; GitHub with examples and instructions to run TranSyT: https://github.com/ecunha1996/transyt_paper.
Bioinformatics 39(8). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10444967/pdf/btad466.pdf
Citations: 0
AARDVARK: an automated reversion detector for variants affecting resistance kinetics.
IF 5.8, Zone 3, Biology
Bioinformatics, Pub Date: 2023-08-01, DOI: 10.1093/bioinformatics/btad509
Thaidy Moreno, Joaquin Magana, David A Quigley
Summary: Resistance to two classes of FDA-approved therapies that target DNA repair-deficient tumors is caused by mutations that restore the tumor cell's DNA repair function. Identifying these "reversion" mutations currently requires manual annotation of patient tumor sequence data. Here we present AARDVARK, an R package that automatically identifies reversion mutations from DNA sequence data.
Availability and implementation: AARDVARK is implemented in R (≥3.5). It is available on GitHub at https://github.com/davidquigley/aardvark. It is licensed under the MIT license.
Bioinformatics 39(8). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10457659/pdf/
Citations: 0
USNAP: fast unique dense region detection and its application to lung cancer.
IF 4.4, Zone 3, Biology
Bioinformatics, Pub Date: 2023-08-01, DOI: 10.1093/bioinformatics/btad477
Serene W H Wong, Chiara Pastrello, Max Kotlyar, Christos Faloutsos, Igor Jurisica
Motivation: Many real-world problems can be modeled as annotated graphs. Scalable graph algorithms that extract actionable information from such data are in demand since these graphs are large, varying in topology, and have diverse node/edge annotations. When these graphs change over time they create dynamic graphs, and open the possibility to find patterns across different time points. In this article, we introduce a scalable algorithm that finds unique dense regions across time points in dynamic graphs. Such algorithms have applications in many different areas, including the biological, financial, and social domains.
Results: There are three important contributions to this manuscript. First, we designed a scalable algorithm, USNAP, to effectively identify dense subgraphs that are unique to a time stamp given a dynamic graph. Importantly, USNAP provides a lower bound of the density measure in each step of the greedy algorithm. Second, insights and understanding obtained from validating USNAP on real data show its effectiveness. While USNAP is domain independent, we applied it to four non-small cell lung cancer gene expression datasets. Stages in non-small cell lung cancer were modeled as dynamic graphs, and input to USNAP. Pathway enrichment analyses and comprehensive interpretations from literature show that USNAP identified biologically relevant mechanisms for different stages of cancer progression. Third, USNAP is scalable, and has a time complexity of $O(m + m_c \log n_c + n_c \log n_c)$, where $m$ is the number of edges, and $n$ is the number of vertices in the dynamic graph; $m_c$ is the number of edges, and $n_c$ is the number of vertices in the collapsed graph.
Availability and implementation: The code of USNAP is available at https://www.cs.utoronto.ca/~juris/data/USNAP22.
Bioinformatics 39(8). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10425186/pdf/
Citations: 0
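The USNAP abstract refers to a greedy algorithm that tracks a density measure at each step. As a point of orientation only, the sketch below shows the classic greedy peeling heuristic for a single static graph: repeatedly remove the minimum-degree vertex and remember the densest intermediate subgraph, with density taken as edges divided by vertices. This is not USNAP. It does not handle time stamps, uniqueness, or collapsed graphs, and the function name `densest_subgraph_peeling` and the density definition are assumptions made for the example.

```python
import heapq


def densest_subgraph_peeling(edges):
    """Greedy peeling on an undirected graph; returns (vertex set, density = |E| / |V|)."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    m = sum(degree.values()) // 2          # edges in the current subgraph
    alive = set(adj)
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)

    best_density = m / len(alive) if alive else 0.0
    removal_order = []
    best_prefix = 0                        # how many removals lead to the best subgraph

    while alive:
        d, v = heapq.heappop(heap)
        if v not in alive or d != degree[v]:
            continue                       # stale heap entry, skip it
        alive.remove(v)
        removal_order.append(v)
        m -= degree[v]                     # degree[v] counts only neighbours still alive
        for w in adj[v]:
            if w in alive:
                degree[w] -= 1
                heapq.heappush(heap, (degree[w], w))
        if alive:
            density = m / len(alive)
            if density > best_density:
                best_density = density
                best_prefix = len(removal_order)

    best_nodes = set(adj) - set(removal_order[:best_prefix])
    return best_nodes, best_density


# A 4-clique (a, b, c, d) with one pendant vertex e attached to d.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
         ("b", "d"), ("c", "d"), ("d", "e")]
nodes, density = densest_subgraph_peeling(edges)
print(sorted(nodes), density)
```

On this toy graph the pendant vertex is peeled first, leaving the 4-clique with 6 edges over 4 vertices. The heap with lazy deletion is what introduces the logarithmic factors typical of such greedy procedures; vertex labels are assumed to be mutually comparable so that heap ties can be broken.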
BEENE: deep learning-based nonlinear embedding improves batch effect estimation.
IF 5.8, Zone 3, Biology
Bioinformatics, Pub Date: 2023-08-01, DOI: 10.1093/bioinformatics/btad479
Md Ashiqur Rahman, Abdullah Aman Tutul, Mahfuza Sharmin, Md Shamsuzzoha Bayzid
Motivation: Analyzing large-scale single-cell transcriptomic datasets generated using different technologies is challenging due to the presence of batch-specific systematic variations known as batch effects. Since biological and technological differences are often interspersed, detecting and accounting for batch effects in RNA-seq datasets are critical for effective data integration and interpretation. Low-dimensional embeddings, such as principal component analysis (PCA), are widely used in visual inspection and estimation of batch effects. Linear dimensionality reduction methods like PCA are effective in assessing the presence of batch effects, especially when batch effects exhibit linear patterns. However, batch effects are inherently complex, and existing linear dimensionality reduction methods could be inadequate and imprecise in the presence of sophisticated nonlinear batch effects.
Results: We present Batch Effect Estimation using Nonlinear Embedding (BEENE), a deep nonlinear auto-encoder network which is specially tailored to generate an alternative lower dimensional embedding suitable for both linear and nonlinear batch effects. BEENE simultaneously learns the batch and biological variables from RNA-seq data, resulting in an embedding that is more robust and sensitive than PCA embedding in terms of detecting and quantifying batch effects. BEENE was assessed on a collection of carefully controlled simulated datasets as well as biological datasets, including two technical replicates of mouse embryogenesis cells, peripheral blood mononuclear cells from three largely different experiments and five studies of pancreatic islet cells.
Availability and implementation: BEENE is freely available as an open source project at https://github.com/ashiq24/BEENE.
Bioinformatics 39(8). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10448987/pdf/
Citations: 0
Prediction of pathogenic single amino acid substitutions using molecular fragment descriptors.
IF 5.8, Zone 3, Biology
Bioinformatics, Pub Date: 2023-08-01, DOI: 10.1093/bioinformatics/btad484
A Zadorozhny, A Smirnov, D Filimonov, A Lagunin
Motivation: Next Generation Sequencing technologies make it possible to detect rare genetic variants in individual patients. Currently, more than a dozen software tools and web services have been created to predict the pathogenicity of variants that change amino acid residues. Despite considerable efforts in this area, at the moment there is no ideal method to classify pathogenic and harmless variants, and the assessment of pathogenicity is often contradictory. In this article, we propose to describe amino acid residue substitutions by the structural formulas of protein peptides rather than by a single-letter code. This allowed us to investigate the effectiveness of a chemoinformatics approach to assessing the pathogenicity of variants associated with amino acid substitutions.
Results: The structure-activity relationships analysis relying on protein-specific data and atom-centric substructural multilevel neighborhoods of atoms (MNA) descriptors of molecular fragments appeared to be suitable for predicting the pathogenic effect of single amino acid variants. An MNA-based Naïve Bayes classifier algorithm and ClinVar and humsavar data were used to create structure-activity relationship models for 10 proteins. The performance of the models was compared with 11 different prediction tools: 8 individual (SIFT 4G, Polyphen2 HDIV, MutationAssessor, PROVEAN, FATHMM, MVP, LIST-S2, MutPred) and 3 consensus (M-CAP, MetaSVM, MetaLR). The accuracy of the MNA-based method varies across the proteins (AUC: 0.631-0.993; MCC: 0.191-0.891). Its performance was comparable to that of the other individual predictors and of third-party protein-specific predictors. For several proteins (BRCA1, BRCA2, COL1A2, and RYR1), the performance of the MNA-based method was outstanding, capable of capturing the pathogenic effect of structural changes in amino acid substitutions.
Availability and implementation: The datasets are available as supplemental data at Bioinformatics online. A Python script to convert amino acid and nucleotide sequences from single-letter codes to SD files is available at https://github.com/SmirnygaTotoshka/SequenceToSDF. The authors provide trial licenses for MultiPASS software to interested readers upon request.
Bioinformatics 39(8). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10435372/pdf/
Citations: 3
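The abstract above reports classifier quality as AUC and MCC ranges. For readers less familiar with these two metrics, the sketch below computes both from first principles using the standard textbook definitions. It is not taken from the paper's code, and the function names `mcc` and `auc` and the toy scores are illustrative assumptions.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts, in [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


def auc(scores_pos, scores_neg):
    """Area under the ROC curve as the probability that a positive outranks a negative."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(scores_pos) * len(scores_neg))


# Toy data: predicted pathogenicity scores for known pathogenic and benign variants.
pathogenic_scores = [0.9, 0.8, 0.4]
benign_scores = [0.1, 0.2, 0.6]

print(auc(pathogenic_scores, benign_scores))
print(mcc(tp=2, fp=1, tn=2, fn=1))
```

AUC is threshold-free, while MCC depends on the chosen decision threshold and stays informative when pathogenic and benign classes are unbalanced, which is one reason the two are often reported together.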
dRFEtools: dynamic recursive feature elimination for omics.
IF 5.8, Zone 3, Biology
Bioinformatics, Pub Date: 2023-08-01, DOI: 10.1093/bioinformatics/btad513
Kynon J M Benjamin, Tarun Katipalli, Apuã C M Paquola
Motivation: Advances in technology have generated larger omics datasets with potential applications for machine learning. In many datasets, however, cost and limited sample availability result in far more features than observations. Moreover, biological processes are associated with networks of core and peripheral genes, while traditional feature selection approaches capture only core genes.
Results: To overcome these limitations, we present dRFEtools that implements dynamic recursive feature elimination (RFE), reducing computational time with high accuracy compared to standard RFE, expanding dynamic RFE to regression algorithms, and outputting the subsets of features that hold predictive power with and without peripheral features. dRFEtools integrates with scikit-learn (the popular Python machine learning platform) and thus provides new opportunities for dynamic RFE in large-scale omics data while enhancing its interpretability.
Availability and implementation: dRFEtools is freely available on PyPI at https://pypi.org/project/drfetools/ or on GitHub https://github.com/LieberInstitute/dRFEtools, implemented in Python 3, and supported on Linux, Windows, and Mac OS.
Bioinformatics 39(8). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10471895/pdf/
Citations: 1
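The dRFEtools abstract credits dynamic RFE with lower computational time than standard RFE but does not spell out the elimination schedule. The sketch below assumes that "dynamic" means dropping a fixed fraction of the remaining features in each round, rather than a fixed count, and shows how that changes the number of model fits. The function name `elimination_schedule` and the `rate` parameter are hypothetical and are not the dRFEtools API.

```python
def elimination_schedule(n_features, rate=0.2, min_features=1):
    """Feature counts visited when a fraction `rate` of the remaining features is dropped per round."""
    counts = [n_features]
    while counts[-1] > min_features:
        current = counts[-1]
        drop = max(1, int(current * rate))   # always remove at least one feature
        counts.append(max(min_features, current - drop))
    return counts


# Small example: many features are removed early, one at a time near the end.
print(elimination_schedule(10, rate=0.2))

# Omics-scale example: compare the number of refits with one-at-a-time elimination.
dynamic_rounds = len(elimination_schedule(20000, rate=0.2)) - 1
one_at_a_time_rounds = 20000 - 1
print(dynamic_rounds, one_at_a_time_rounds)
```

Under this assumption the number of rounds grows roughly logarithmically with the number of features, while removing one feature per round grows linearly, which is consistent with the speed-up the abstract describes.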