Bioinformatics Advances: latest articles

Pandora: a tool to estimate dimensionality reduction stability of genotype data.
IF 2.4
Bioinformatics advances Pub Date : 2025-03-03 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf040
Julia Haag, Alexander I Jordan, Alexandros Stamatakis
Motivation: Genotype datasets typically contain a large number of single-nucleotide polymorphisms for a comparatively small number of individuals. To identify similarities between individuals and to infer an individual's origin or membership to a population, dimensionality reduction techniques are routinely deployed. However, inherent (technical) difficulties such as missing or noisy data need to be accounted for when analyzing a lower dimensional representation of genotype data, and the intrinsic uncertainty of such analyses should be reported in all studies. However, to date, there exists no stability assessment technique for genotype data that can estimate this uncertainty.
Results: Here, we present Pandora, a stability estimation framework for genotype data based on bootstrapping. Pandora computes an overall score to quantify the stability of the entire embedding, infers per-individual support values, and also deploys a k-means clustering approach to assess the uncertainty of assignments to potential cultural groups. Using published empirical and simulated datasets, we demonstrate the usage and utility of Pandora for studies that rely on dimensionality reduction techniques.
Availability and implementation: Pandora is available on GitHub: https://github.com/tschuelia/Pandora.
Volume 5, issue 1, article vbaf040. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11955236/pdf/
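The bootstrapping idea in the Pandora abstract can be illustrated with a small sketch. This is not Pandora's implementation or API: the function names are hypothetical, and the per-individual score below (how often an individual keeps the same nearest neighbor across bootstrap replicates) is a simplified stand-in chosen for illustration, whereas the abstract describes support values derived from dimensionality-reduction embeddings.

```python
import random

def bootstrap_snps(genotypes, rng):
    """Resample SNP columns with replacement; individuals (rows) are kept."""
    n_snps = len(genotypes[0])
    cols = [rng.randrange(n_snps) for _ in range(n_snps)]
    return [[row[c] for c in cols] for row in genotypes]

def nearest_neighbor(genotypes, i):
    """Index of the individual closest to individual i (Manhattan distance)."""
    def dist(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))
    others = [j for j in range(len(genotypes)) if j != i]
    return min(others, key=lambda j: dist(genotypes[i], genotypes[j]))

def neighbor_support(genotypes, n_replicates=100, seed=0):
    """Toy support value (an assumption, not Pandora's definition): the
    fraction of bootstrap replicates in which an individual's nearest
    neighbor matches its nearest neighbor in the original data."""
    rng = random.Random(seed)
    n = len(genotypes)
    reference = [nearest_neighbor(genotypes, i) for i in range(n)]
    hits = [0] * n
    for _ in range(n_replicates):
        replicate = bootstrap_snps(genotypes, rng)
        for i, ref in enumerate(reference):
            if nearest_neighbor(replicate, i) == ref:
                hits[i] += 1
    return [h / n_replicates for h in hits]
```

Values near 1 indicate that an individual's neighborhood is insensitive to which SNPs happen to be sampled; lower values flag individuals whose placement depends on a few markers.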
Citations: 0
scLTNN: an innovative tool for automatically visualizing single-cell trajectories.
IF 2.4
Bioinformatics advances Pub Date : 2025-02-26 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf033
Cencan Xing, Zehua Zeng, Lei Hu, Jianing Kang, Shah Roshan, Yuanyan Xiong, Hongwu Du, Tongbiao Zhao
Motivation: Cellular state identification and trajectory inference enable the computational simulation of cell fate dynamics using single-cell RNA sequencing data. However, existing methods for constructing cell fate trajectories demand substantial computational resources or prior knowledge of the developmental process.
Results: Here, based on the discovery of the consistent expression distribution of highly variable genes, we create a new tool named scRNA-seq latent time neural network (scLTNN) by combining an artificial neural network with a distribution model. This innovative tool is pre-trained and capable of automatically inferring the origin and terminal state of cells, and accurately illustrating the developmental trajectory of cells with minimal use of computational resources and time. We implement scLTNN on human bone marrow cells, mouse pancreatic endocrine lineage, and axial mesoderm lineage of zebrafish embryo, accurately reconstructing their cell fate trajectories, respectively. Our scLTNN tool provides a straightforward and efficient method for illustrating cell fate trajectories, applicable across various species without the need for prior knowledge of the biological process.
Availability and implementation: https://github.com/Starlitnightly/scLTNN.
Volume 5, issue 1, article vbaf033. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11889453/pdf/
Citations: 0
Optimizing gene selection and module identification via ontology-based scoring and deep learning.
IF 2.4
Bioinformatics advances Pub Date : 2025-02-26 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf034
Boutaina Ettetuani, Rajaa Chahboune, Ahmed Moussa
Motivation: Understanding gene interactions and their biological significance is a key challenge in computational biology. The complexity of biological systems, coupled with high-dimensional omics data, necessitates robust methods for gene selection and interaction analysis. Traditional statistical techniques often struggle with the hierarchical nature of gene ontology (GO) terms, leading to redundancy and limited interpretability. Meanwhile, deep learning models require biologically meaningful input to enhance their predictive power.
Results: We present an integrated framework that enhances gene selection and uncovers gene interactions by combining a novel statistical algorithm with a deep neural network model. The statistical algorithm ranks differentially expressed genes by correlating their expression scores with the semantic similarity of their biological context, utilizing GO information to align genes with known pathways. The deep neural network then identifies interaction modules by integrating genes from different clusters based on regulatory pathway data. This model effectively navigates the hierarchical complexity of GO terms structured as directed acyclic graphs, employing a feed-forward architecture optimized via back-propagation. Our results demonstrate improved gene selection accuracy and enhanced discovery of biologically relevant interactions, providing valuable insights into complex disease mechanisms.
Volume 5, issue 1, article vbaf034. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12073971/pdf/
Citations: 0
A new ensemble learning method stratified sampling blending optimizes conventional blending and improves prediction performance.
IF 2.4
Bioinformatics advances Pub Date : 2025-02-22 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf002
Na Miao, Mengke Yang, Pingping Han, Jiakun Qiao, Zhaoxuan Che, Fangjun Xu, Xiangyu Dai, Mengjin Zhu
Motivation: Ensemble learning, as a powerful machine learning method, improves overall prediction performance by combining the prediction results of multiple base models. Blending, a popular ensemble learning method, trains multiple base models, feeds their predictions into a meta model for further training, and obtains the final prediction from that meta model. However, conventional blending divides the training set by simple random sampling, which causes bias and large variance, thus affecting the stability and accuracy of prediction performance. In this study, we propose a new algorithm, stratified sampling blending (ssBlending), which addresses the algorithm instability of conventional blending caused by the random partition of the training set, further improving the prediction accuracy.
Results: We used multiple genotype data sets from different species including animal (pig), plant (loblolly pine), and microorganism (yeast) to test the prediction performance of ssBlending. The across-species multi-dataset verification results reveal that ssBlending is superior to conventional blending in terms of prediction accuracy and stability. In addition, we optimized the training set sampling rate (BestH) to facilitate the practical application of the ssBlending algorithm. In summary, this study proposes a completely new algorithm combining a stratification strategy with conventional blending, which provides more options for ensemble learning in various fields.
Availability and implementation: https://figshare.com/s/23122a18dc8a35f12ff6.
Volume 5, issue 1, article vbaf002. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11908643/pdf/
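The difference between a simple random split and a stratified split of the training set can be sketched as follows. This is only an illustration under stated assumptions, not the ssBlending code: the abstract does not say how strata are defined, so the sketch assumes equal-sized strata formed by sorting on the phenotype value, and `stratified_holdout`, `n_strata`, and `holdout_rate` are hypothetical names.

```python
import random

def stratified_holdout(values, n_strata=4, holdout_rate=0.2, seed=0):
    """Split sample indices into (train, holdout), drawing the holdout
    separately within each stratum of the sorted phenotype values."""
    rng = random.Random(seed)
    # Indices ordered by phenotype value (assumed stratification variable).
    order = sorted(range(len(values)), key=lambda i: values[i])
    train, holdout = [], []
    for s in range(n_strata):
        lo = s * len(order) // n_strata
        hi = (s + 1) * len(order) // n_strata
        stratum = order[lo:hi]
        rng.shuffle(stratum)
        k = round(len(stratum) * holdout_rate)
        holdout.extend(stratum[:k])   # would feed the meta model
        train.extend(stratum[k:])     # would train the base models
    return sorted(train), sorted(holdout)
```

Because every stratum contributes the same share to the holdout set, the phenotype range covered by the base-model and meta-model subsets no longer depends on the luck of a single random draw.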
Citations: 0
XeroGraph: enhancing data integrity in the presence of missing values with statistical and predictive analysis.
IF 2.4
Bioinformatics advances Pub Date : 2025-02-21 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf035
Laila Mousafi Alasal, Emma U Hammarlund, Kenneth J Pienta, Lars Rönnstrand, Julhash U Kazi
Motivation: Missing data present a pervasive challenge in data analysis, potentially biasing outcomes and undermining conclusions if not addressed properly. Missing data are commonly classified into Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (MNAR). While MCAR poses a minimal risk of data distortion, both MAR and MNAR can seriously affect the results of subsequent analyses. Therefore, it is important to know the type of missing data and appropriately handle them.
Results: To facilitate efficient handling of missing data, we introduce a Python package named XeroGraph that is designed to evaluate data quality, categorize the nature of missingness, and guide imputation decisions. By comparing how various imputation methods influence underlying distributions, XeroGraph provides a systematic framework that supports more accurate and transparent analyses. Through its comprehensive preliminary assessments and user-friendly interface, this package facilitates the selection of optimal strategies tailored to the specific missing data mechanisms present in a dataset. In doing so, XeroGraph may significantly improve the validity and reproducibility of research findings, making it a valuable tool for professionals in data-intensive fields.
Availability and implementation: XeroGraph is compatible with all operating systems and requires Python version 3.9 or higher. It can be freely downloaded from PyPI (https://pypi.org/project/XeroGraph). The source code is accessible on GitHub (https://github.com/kazilab/XeroGraph), and comprehensive documentation is available at Read the Docs (https://xerograph.readthedocs.io). This software is distributed under the Apache License 2.0.
Volume 5, issue 1, article vbaf035. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11889451/pdf/
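The three missingness mechanisms named in the abstract can be made concrete with a short simulation. The sketch below does not use or reproduce the XeroGraph API; `mask_values` is a hypothetical helper, and the specific probabilities (missingness only above the median) are assumptions chosen to make the mechanisms easy to tell apart.

```python
import random

def mask_values(x, y, mechanism, rng, rate=0.3):
    """Return a copy of y with some entries replaced by None.

    MCAR: missingness is independent of all data.
    MAR:  missingness depends only on the fully observed variable x.
    MNAR: missingness depends on the value of y itself.
    """
    x_median = sorted(x)[len(x) // 2]
    y_median = sorted(y)[len(y) // 2]
    masked = []
    for xi, yi in zip(x, y):
        if mechanism == "MCAR":
            p = rate
        elif mechanism == "MAR":
            p = 2 * rate if xi > x_median else 0.0
        elif mechanism == "MNAR":
            p = 2 * rate if yi > y_median else 0.0
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        masked.append(None if rng.random() < p else yi)
    return masked
```

Under MNAR the observed mean of `y` is biased downward because only large values go missing, which is why the abstract stresses identifying the mechanism before choosing an imputation method.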
Citations: 0
A comprehensive bioinformatic evaluation of the NTRK family's potential as prognostic biomarkers in breast cancer.
IF 2.4
Bioinformatics advances Pub Date : 2025-02-21 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf030
Ramtin Mohammadi, Mohsen Ghiasi, Saber Mehdizadeh, Javad Mohammadi, Shahla Mohammad Ganji
Motivation: Breast cancer (BC), with its rising prevalence and mortality rate, is one of the most significant human health issues. The family of transmembrane tyrosine kinases that promote neuronal growth includes the neurotrophic tyrosine kinase receptors (NTRKs). NTRK1-3 genes encode the members of this family. Alterations of NTRK genes can induce carcinogenesis both in neurogenic and non-neurogenic cells. The prevalence of NTRK gene fusion is under 1% in solid tumours but is highly encountered in rare tumours. Since the prognostic values of NTRK families' expression in various types of cancer are becoming increasingly evident, we aimed to conduct a comprehensive bioinformatics study evaluating the prognostic significance of the NTRK family in BC. Online bioinformatic databases including TCGA, UALCAN, Kaplan-Meier plotter, bc-GenExMiner, cBioPortal, STRING, Enrichr, and TIMER were utilized for analysis.
Results: High levels of NTRK2 and 3 demonstrated better associations with overall survival (OS) and recurrence-free survival (RFS) in BC patients (P < .05), while high levels of NTRK1 showed an applicable correlation with RFS in BC patients (P < .001). Our findings provide a new outlook that might aid in the field of personalized medicine and therapeutic use of NTRK as a prognostic biomarker in BC.
Availability and implementation: All data generated or analysed during this study are included in this published article.
Volume 5, issue 1, article vbaf030. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11886811/pdf/
Citations: 0
BioModTool: from biomass composition data to structured biomass objective functions for genome-scale metabolic models.
IF 2.4
Bioinformatics advances Pub Date : 2025-02-21 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf036
Clémence Dupont Thibert, Sylvaine Roy, Gilles Curien, Maxime Durot
Summary: BioModTool is a Python program allowing easy generation of biomass objective functions for genome-scale metabolic models from user data. BioModTool loads biomass composition data in the form of a structured Excel file completed by the user, normalizes these data into model-compatible units (mmol·gDW⁻¹), and creates a structured biomass objective function to update a metabolic model. Aimed at a wide range of users, BioModTool can be run as a Python module compatible with COBRApy but also comes with an interface allowing its use by non-modelers. By providing an easy definition of new biomass objective functions, BioModTool can accelerate new genome-scale metabolic reconstructions, improve existing ones, and facilitate biomass-specific experimental datasets analyses with genome-scale models.
Availability and implementation: BioModTool is publicly available on PyPI (https://pypi.org/project/BioModTool/) under a GNU Lesser General Public License (LGPL). Installation instructions and source code are available on GitHub (https://github.com/Total-RD/BioModTool). BioModTool is compatible with Windows, Linux, and MacOS operating systems.
Volume 5, issue 1, article vbaf036. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11891441/pdf/
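The unit normalization mentioned in the summary comes down to a simple conversion: a mass fraction in g per g dry weight, divided by a molar mass in g/mol and multiplied by 1000, gives mmol per g dry weight. The sketch below shows only that arithmetic; it is not BioModTool's code or API, the function name is hypothetical, and rescaling the fractions so that they sum to 1 g per gDW is an assumption about what "normalizes" means here.

```python
def to_mmol_per_gdw(mass_fractions, molar_masses):
    """Convert biomass mass fractions to coefficients in mmol per gDW.

    mass_fractions: component -> relative mass (any consistent unit)
    molar_masses:   component -> molar mass in g/mol
    """
    total = sum(mass_fractions.values())
    coefficients = {}
    for name, fraction in mass_fractions.items():
        grams_per_gdw = fraction / total               # rescale to 1 g biomass
        mol_per_gdw = grams_per_gdw / molar_masses[name]
        coefficients[name] = mol_per_gdw * 1000.0      # mol -> mmol
    return coefficients
```

For example, a component that makes up half of the dry weight and has a molar mass of 100 g/mol gets a coefficient of 0.5 / 100 × 1000 = 5 mmol·gDW⁻¹.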
Citations: 0
Masked language modeling pretraining dynamics for downstream peptide: T-cell receptor binding prediction.
IF 2.4
Bioinformatics advances Pub Date : 2025-02-20 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf028
Brock Landry, Jian Zhang
Motivation: Predicting antigen peptide and T-cell receptor (TCR) binding is difficult due to the combinatoric nature of peptides and the scarcity of labeled peptide-binding pairs. The masked language modeling method of pretraining is reliably used to increase the downstream performance of peptide:TCR binding prediction models by leveraging unlabeled data. In the literature, binding prediction models are commonly trained until the validation loss converges. To evaluate this method, cited transformer model architectures pretrained with masked language modeling are investigated to assess the benefits of achieving lower loss metrics during pretraining. The downstream performance metrics for these works are recorded after each subsequent interval of masked language modeling pretraining.
Results: The results demonstrate that the downstream performance benefit achieved from masked language modeling peaks substantially before the pretraining loss converges. Using the pretraining loss metric is largely ineffective for precisely identifying the best downstream performing pretrained model checkpoints (or saved states). However, the pretraining loss metric in these scenarios can be used to mark a threshold in which the downstream performance benefits from pretraining have fully diminished. Further pretraining beyond this threshold does not negatively impact downstream performance but results in unpredictable bilateral deviations from the post-threshold average downstream performance benefit.
Availability and implementation: The datasets used in this article for model training are publicly available from each original model's authors at https://github.com/SFGLab/bertrand, https://github.com/wukevin/tcr-bert, https://github.com/NKI-AI/STAPLER, and https://github.com/barthelemymp/TULIP-TCR.
Volume 5, issue 1, article vbaf028. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11908642/pdf/
Citations: 0
CoDNet: controlled diffusion network for structure-based drug design.
IF 2.4
Bioinformatics advances Pub Date : 2025-02-19 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf031
Fahmi Kazi Md, Shahil Yasar Haque, Eashrat Jahan, Latin Chakma, Tamanna Shermin, Asif Uddin Ahmed, Salekul Islam, Swakkhar Shatabda, Riasat Azim
Motivation: Structure-based drug design (SBDD) holds promising potential to design ligands with high-binding affinity and rationalize their interaction with targets. By utilizing geometric knowledge of the three-dimensional (3D) structures of target binding sites, SBDD enhances the efficacy and selectivity of therapeutic agents by optimizing binding interactions at the molecular level. Here, we present CoDNet, a novel approach that combines the conditioning capabilities of ControlNet with the potency of the diffusion model to create generative frameworks for molecular compound design. This proposed method pioneers the application of ControlNet in diffusion model-based drug development. Its ability to generate drug-like compounds from 3D conformations is prominent due to its capability to bypass Open Babel post-processing and integrate bond details and molecular information.
Results: For the gold standard QM9 dataset, CoDNet outperforms existing state-of-the-art methods with a validity rate of 99.02%. This competitive performance underscores the precision and efficacy of CoDNet's drug design, establishing it as a significant advancement with great potential for enhancing drug development initiatives.
Availability and implementation: https://github.com/CoDNet1/EDM_Custom.
Volume 5, issue 1, article vbaf031. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11886848/pdf/
Citations: 0
BADGER: biologically-aware interpretable differential gene expression ranking model.
IF 2.4
Bioinformatics advances Pub Date : 2025-02-18 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf029
Hajung Kim, Mogan Gim, Seungheun Baek, Soyon Park, Sunkyu Kim, Jaewoo Kang
Motivation: Understanding which genes are significantly affected by drugs is crucial for drug repurposing, as drugs targeting specific pathways in one disease might be effective in another with similar genetic profiles. By analyzing gene expression changes in cells before and after drug treatment, we can identify the genes most impacted by drugs.
Results: The Biologically-Aware Interpretable Differential Gene Expression Ranking (BADGER) model is an interpretable model designed to predict gene expression changes resulting from interactions between cancer cell lines and chemical compounds. The model enhances explainability through integration of prior knowledge about drug targets via pathway information, handles novel cancer cell lines through similarity-based embedding, and employs three attention blocks that mimic the cascading effects of chemical compounds. This model overcomes previous limitations of cell line range and explainability constraints in drug-cell response studies. The model demonstrates superior performance over baselines in both unseen cell and unseen pair split evaluations, showing robust prediction capabilities for untested drug-cell line combinations.
Availability and implementation: This makes it particularly valuable for drug repurposing scenarios, especially in developing therapeutic plans for new or resistant diseases by leveraging similarities with other diseases. All code and data used in this study are available at https://github.com/dmis-lab/BADGER.git.
Volume 5, issue 1, article vbaf029. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11978390/pdf/
Citations: 0