Bioinformatics (Oxford, England): latest articles

SSL-VQ: vector-quantized variational autoencoders for semi-supervised prediction of therapeutic targets across diverse diseases.
Bioinformatics (Oxford, England) Pub Date : 2025-02-04 DOI: 10.1093/bioinformatics/btaf039
Satoko Namba, Chen Li, Noriko Yuyama Otani, Yoshihiro Yamanishi
Motivation: Identifying effective therapeutic targets poses a challenge in drug discovery, especially for uncharacterized diseases without known therapeutic targets (e.g. rare diseases, intractable diseases).
Results: This study presents a novel machine learning approach using multimodal vector-quantized variational autoencoders (VQ-VAEs) for predicting therapeutic target molecules across diseases. To address the lack of known therapeutic target-disease associations, we incorporate the information on uncharacterized diseases without known targets or uncharacterized proteins without known indications (applicable diseases) in the semi-supervised learning (SSL) framework. The method integrates disease-specific and protein perturbation profiles with genetic perturbations (e.g. gene knockdowns and gene overexpressions) at the transcriptome level. Cross-cell representation learning, facilitated by VQ-VAEs, was performed to extract informative features from protein perturbation profiles across diverse human cell types. Concurrently, cross-disease representation learning was performed, leveraging VQ-VAE, to extract informative features reflecting disease states from disease-specific profiles. The model's applicability to uncharacterized diseases or proteins is enhanced by considering the consistency between disease-specific and patient-specific signatures. The efficacy of the method is demonstrated across three practical scenarios for 79 diseases: target repositioning for target-disease pairs, new target prediction for uncharacterized diseases, and new indication prediction for uncharacterized proteins. This method is expected to be valuable for identifying therapeutic targets across various diseases.
Availability and implementation: Code: github.com/YamanishiLab/SSL-VQ and Data: 10.5281/zenodo.14644837.
Citations: 0
DNAdesign: feature-aware in silico design of synthetic DNA through mutation.
Bioinformatics (Oxford, England) Pub Date : 2025-02-04 DOI: 10.1093/bioinformatics/btaf052
Yingfei Wang, Jinsen Li, Tsu-Pei Chiu, Nicolas Gompel, Remo Rohs
Motivation: DNA sequence and shape readout represent different modes of protein-DNA recognition. Current tools lack the functionality to simultaneously consider alterations in different readout modes caused by sequence mutations. DNAdesign is a web-based tool to compare and design mutations based on both DNA sequence and shape characteristics. Users input a wild-type sequence, select sites to introduce mutations and choose a set of DNA shape parameters for mutation design.
Results: DNAdesign utilizes Deep DNAshape to provide ultra-fast predictions of DNA shape based on extended k-mers and offers multiple encoding methods for nucleotide sequences, including the physicochemical encoding of DNA through their functional groups in the major and minor groove. DNAdesign provides all mutation candidates along the sequence and shape dimensions, with interactive visualization comparing each candidate with the wild-type DNA molecule. DNAdesign provides an approach to studying gene regulation and applications in synthetic biology, such as the design of synthetic enhancers and transcription factor binding sites.
Availability and implementation: The DNAdesign webserver and documentation are freely accessible at https://dnadesign.usc.edu.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11825384/pdf/
Citations: 0
Functional profiling of the sequence stockpile: a protein pair-based assessment of in silico prediction tools.
Bioinformatics (Oxford, England) Pub Date : 2025-02-04 DOI: 10.1093/bioinformatics/btaf035
R Prabakaran, Yana Bromberg
Motivation: In silico functional annotation of proteins is crucial to narrowing the sequencing-accelerated gap in our understanding of protein activities. Numerous function annotation methods exist, and their ranks have been growing, particularly so with the recent deep learning-based developments. However, it is unclear if these tools are truly predictive. As we are not aware of any methods that can identify new terms in functional ontologies, we ask if they can, at least, identify molecular functions of proteins that are non-homologous to or far-removed from known protein families.
Results: Here, we explore the potential and limitations of the existing methods in predicting the molecular functions of thousands of such proteins. Lacking the "ground truth" functional annotations, we transformed the assessment of function prediction into evaluation of functional similarity of protein pairs that likely share function but are unlike any of the currently functionally annotated sequences. Notably, our approach transcends the limitations of functional annotation vocabularies, providing a means to assess different-ontology annotation methods. We find that most existing methods are limited to identifying functional similarity of homologous sequences and fail to predict the function of proteins lacking reference. Curiously, despite their seemingly unlimited by-homology scope, deep learning methods also have trouble capturing the functional signal encoded in protein sequence. We believe that our work will inspire the development of a new generation of methods that push boundaries and promote exploration and discovery in the molecular function domain.
Availability and implementation: The data underlying this article are available at https://doi.org/10.6084/m9.figshare.c.6737127.v3. The code used to compute siblings is available openly at https://bitbucket.org/bromberglab/siblings-detector/.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11821270/pdf/
Citations: 0
Robustly interrogating machine learning-based scoring functions: what are they learning?
Bioinformatics (Oxford, England) Pub Date : 2025-02-04 DOI: 10.1093/bioinformatics/btaf040
Guy Durant, Fergus Boyles, Kristian Birchall, Brian Marsden, Charlotte M Deane
Motivation: Machine learning-based scoring functions (MLBSFs) have been found to exhibit inconsistent performance on different benchmarks and be prone to learning dataset bias. For the field to develop MLBSFs that learn a generalizable understanding of physics, a more rigorous understanding of how they perform is required.
Results: In this work, we compared the performance of a diverse set of popular MLBSFs (RFScore, SIGN, OnionNet-2, Pafnucy, and PointVS) to our proposed baseline models that can only learn dataset biases on a range of benchmarks. We found that these baseline models were competitive in accuracy to these MLBSFs in almost all proposed benchmarks, indicating these models only learn dataset biases. Our tests and provided platform, ToolBoxSF, will enable researchers to robustly interrogate MLBSF performance and determine the effect of dataset biases on their predictions.
Availability and implementation: https://github.com/guydurant/toolboxsf.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11821266/pdf/
Citations: 0
tttrlib: modular software for integrating fluorescence spectroscopy, imaging, and molecular modeling.
Bioinformatics (Oxford, England) Pub Date : 2025-02-04 DOI: 10.1093/bioinformatics/btaf025
Thomas-Otavio Peulen, Katherina Hemmen, Annemarie Greife, Benjamin M Webb, Suren Felekyan, Andrej Sali, Claus A M Seidel, Hugo Sanabria, Katrin G Heinze
Summary: We introduce software for reading, writing and processing fluorescence single-molecule and image spectroscopy data and developing analysis pipelines to unify various spectroscopic analysis tools. Our software can be used for processing multiple experiment types, e.g. for time-resolved single-molecule spectroscopy, laser scanning microscopy, fluorescence correlation spectroscopy and image correlation spectroscopy. The software is file format agnostic and processes multiple time-resolved data formats and outputs. Our software eliminates the need for data conversion and mitigates data archiving issues.
Availability and implementation: tttrlib is available via pip (https://pypi.org/project/tttrlib/) and bioconda while the open-source code is available via GitHub (https://github.com/fluorescence-tools/tttrlib). Presented examples and additional documentation demonstrating how to implement in vitro and live-cell image spectroscopy analysis are available at https://docs.peulen.xyz/tttrlib and https://zenodo.org/records/14002224.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11796090/pdf/
Citations: 0
scatterbar: an R package for visualizing proportional data across spatially resolved coordinates.
Bioinformatics (Oxford, England) Pub Date : 2025-02-04 DOI: 10.1093/bioinformatics/btaf047
Dee Velazquez, Jean Fan
Motivation: Displaying proportional data across many spatially resolved coordinates is a challenging but important data visualization task, particularly for spatially resolved transcriptomics data. Scatter pie plots are one type of commonly used data visualization for such data but present perceptual challenges that may lead to difficulties in interpretation. Increasing the visual saliency of such data visualizations can help viewers more accurately identify proportional trends and compare proportional differences across spatial locations.
Results: We developed scatterbar, an open-source R package that extends ggplot2, to visualize proportional data across many spatially resolved coordinates using scatter stacked bar plots. We apply scatterbar to visualize deconvolved cell-type proportions from a spatial transcriptomics dataset of the adult mouse brain to demonstrate how scatter stacked bar plots can enhance the distinguishability of proportional distributions compared to scatter pie plots.
Availability and implementation: scatterbar is available on CRAN https://cran.r-project.org/package=scatterbar with additional documentation and tutorials at https://jef.works/scatterbar/.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11829801/pdf/
Citations: 0
COBREXA 2: tidy and scalable construction of complex metabolic models.
Bioinformatics (Oxford, England) Pub Date : 2025-02-04 DOI: 10.1093/bioinformatics/btaf056
Miroslav Kratochvíl, St Elmo Wilken, Oliver Ebenhöh, Reinhard Schneider, Venkata P Satagopam
Summary: Constraint-based metabolic models offer a scalable framework to investigate biological systems using optimality principles. Construction and simulation of detailed models that utilize multiple kinds of constraint systems pose a significant coding overhead, complicating implementation of new types of analyses. We present an improved version of the constraint-based metabolic modeling package COBREXA, which utilizes a hierarchical model construction framework that decouples the implemented analysis algorithms into independent, yet re-combinable, building blocks. By removing the need to re-implement modeling components, assembly of complex metabolic models is simplified, which we demonstrate on use-cases of resource-balanced models, and enzyme-constrained flux balance models of interacting bacterial communities. Notably, these models show improved predictive capabilities in both monoculture and community settings. In perspective, the re-usable model-building components in COBREXA 2 provide a sustainable way to handle increasingly complex models in constraint-based modeling.
Availability and implementation: COBREXA 2 is available from https://github.com/COBREXA/COBREXA.jl, and from Julia package repositories. COBREXA 2 works on all major operating systems and computer architectures. Documentation is available at https://cobrexa.github.io/COBREXA.jl/.
Citations: 0
CAMIL: channel attention-based multiple instance learning for whole slide image classification.
Bioinformatics (Oxford, England) Pub Date : 2025-02-04 DOI: 10.1093/bioinformatics/btaf024
Jinyang Mao, Junlin Xu, Xianfang Tang, Yongjin Liu, Heaven Zhao, Geng Tian, Jialiang Yang
Motivation: The classification task based on whole-slide images (WSIs) is a classic problem in computational pathology. Multiple instance learning (MIL) provides a robust framework for analyzing whole slide images with slide-level labels at gigapixel resolution. However, existing MIL models typically focus on modeling the relationships between instances while neglecting the variability across the channel dimensions of instances, which prevents the model from fully capturing critical information in the channel dimension.
Results: To address this issue, we propose a plug-and-play module called Multi-scale Channel Attention Block (MCAB), which models the interdependencies between channels by leveraging local features with different receptive fields. By alternately stacking four layers of Transformer and MCAB, we designed a channel attention-based MIL model (CAMIL) capable of simultaneously modeling both inter-instance relationships and intra-channel dependencies. To verify the performance of the proposed CAMIL in classification tasks, several comprehensive experiments were conducted across three datasets: Camelyon16, TCGA-NSCLC, and TCGA-RCC. Empirical results demonstrate that, whether the feature extractor is pretrained on natural images or on WSIs, our CAMIL surpasses current state-of-the-art MIL models across multiple evaluation metrics.
Availability and implementation: All implementation code is available at https://github.com/maojy0914/CAMIL.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11802473/pdf/
Citations: 0
HiCForecast: dynamic network optical flow estimation algorithm for spatiotemporal Hi-C data forecasting.
Bioinformatics (Oxford, England) Pub Date : 2025-02-04 DOI: 10.1093/bioinformatics/btaf030
Dmitry Pinchuk, H M A Mohit Chowdhury, Abhishek Pandeya, Oluwatosin Oluwadare
Motivation: The exploration of the 3D organization of DNA within the nucleus in relation to various stages of cellular development has led to experiments generating spatiotemporal Hi-C data. However, there is limited spatiotemporal Hi-C data for many organisms, impeding the study of 3D genome dynamics. To overcome this limitation and advance our understanding of genome organization, it is crucial to develop methods for forecasting Hi-C data at future time points from existing timeseries Hi-C data.
Result: In this work, we designed a novel framework named HiCForecast, adopting a dynamic voxel flow algorithm to forecast future spatiotemporal Hi-C data. We evaluated how well our method generalizes forecasting data across different species and systems, ensuring performance in homogeneous, heterogeneous, and general contexts. Using both computational and biological evaluation metrics, our results show that HiCForecast outperforms the current state-of-the-art algorithm, emerging as an efficient and powerful tool for forecasting future spatiotemporal Hi-C datasets.
Availability and implementation: HiCForecast is publicly available at https://github.com/OluwadareLab/HiCForecast.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11793695/pdf/
Citations: 0
ESCARGOT: an AI agent leveraging large language models, dynamic graph of thoughts, and biomedical knowledge graphs for enhanced reasoning.
Bioinformatics (Oxford, England) Pub Date : 2025-02-04 DOI: 10.1093/bioinformatics/btaf031
Nicholas Matsumoto, Hyunjun Choi, Jay Moran, Miguel E Hernandez, Mythreye Venkatesan, Xi Li, Jui-Hsuan Chang, Paul Wang, Jason H Moore
Motivation: LLMs like GPT-4, despite their advancements, often produce hallucinations and struggle with integrating external knowledge effectively. While Retrieval-Augmented Generation (RAG) attempts to address this by incorporating external information, it faces significant challenges such as context length limitations and imprecise vector similarity search. ESCARGOT aims to overcome these issues by combining LLMs with a dynamic Graph of Thoughts and biomedical knowledge graphs, improving output reliability, and reducing hallucinations.
Result: ESCARGOT significantly outperforms industry-standard RAG methods, particularly in open-ended questions that demand high precision. ESCARGOT also offers greater transparency in its reasoning process, allowing for the vetting of both code and knowledge requests, in contrast to the black-box nature of LLM-only or RAG-based approaches.
Availability and implementation: ESCARGOT is available as a pip package and on GitHub at: https://github.com/EpistasisLab/ESCARGOT.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11796095/pdf/
Citations: 0