Briefings in Bioinformatics: latest articles

ADCNet: a unified framework for predicting the activity of antibody-drug conjugates.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics, vol. 26, issue 3. Pub Date: 2025-05-03. DOI: 10.1093/bib/bbaf228
Liye Chen, Biaoshun Li, Yihao Chen, Mujie Lin, Shipeng Zhang, Chenxin Li, Yu Pang, Ling Wang
Abstract: Antibody-drug conjugates (ADCs) have revolutionized the field of cancer treatment in the era of precision medicine due to their ability to precisely target cancer cells and release highly effective drugs. Nevertheless, the rational design and discovery of ADCs remain challenging because the relationship between their quintuple structures and activities is difficult to explore and understand. To address this issue, we first introduce a unified deep learning framework called ADCNet to explore such relationship and help design potential ADCs. The ADCNet highly integrates the protein representation learning language model ESM-2 and small-molecule representation learning language model functional group-based bidirectional encoder representations from transformers to achieve activity prediction through learning meaningful features from antigen and antibody protein sequences of ADC, SMILES strings of linker and payload, and drug-antibody ratio (DAR) value. Based on a carefully designed and manually tailored ADC data set, extensive evaluation results reveal that ADCNet performs best on the test set compared to baseline machine learning models across all evaluation metrics. For example, it achieves an average prediction accuracy of 87.12%, a balanced accuracy of 0.8689, and an area under receiver operating characteristic curve of 0.9293 on the test set. In addition, cross-validation, ablation experiments, and external independent testing results further prove the stability, advancement, and robustness of the ADCNet architecture. For the convenience of the community, we develop the first online platform (https://ADCNet.idruglab.cn) for the prediction of ADCs activity based on the optimal ADCNet model, and the source code is publicly available at https://github.com/idrugLab/ADCNet.
Citations: 0
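The ADCNet abstract reports both accuracy and balanced accuracy, which differ when the two activity classes are unequal in size. The sketch below is illustrative only: the function name and the toy labels are assumptions and are not taken from the ADCNet code base. It shows how the two metrics are computed from binary predictions.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, balanced_accuracy) for 0/1 labels.

    Hypothetical helper for illustration; not part of ADCNet.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)   # recall on the positive (active) class
    specificity = tn / (tn + fp)   # recall on the negative (inactive) class
    return accuracy, (sensitivity + specificity) / 2


# Toy data: four actives, two inactives (made-up values).
acc, bal_acc = binary_metrics([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1])
print(round(acc, 4), bal_acc)  # 0.6667 0.625
```

Balanced accuracy averages the per-class recalls, so a model that favours the majority class scores lower on it than on plain accuracy, as in the toy data above.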
PhosF3C: a feature fusion architecture with fine-tuned protein language model and conformer for prediction of general phosphorylation site.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics, vol. 26, issue 3. Pub Date: 2025-05-03. DOI: 10.1093/bib/bbaf242
Yuhuan Liu, Xueying Wang, Haitian Zhong, Jixiu Zhai, Xiaojuan Gong, Tianchi Lu
Abstract: Protein phosphorylation, a key post-translational modification, provides essential insight into protein properties, making its prediction highly significant. Using the emerging capabilities of large language models (LLMs), we apply Low-Rank Adaptation (LoRA) fine-tuning to ESM2, a powerful protein large language model, to efficiently extract features with minimal computational resources, optimizing task-specific text alignment. Additionally, we integrate the conformer architecture with the feature coupling unit to enhance local and global feature exchange, further improving prediction accuracy. Our model achieves state-of-the-art performance, obtaining area under the curve scores of 79.5%, 76.3%, and 71.4% at the S, T, and Y sites of the general data sets. Based on the powerful feature extraction capabilities of LLMs, we conduct a series of analyses on protein representations, including studies on their structure, sequence, and various chemical properties [such as hydrophobicity (GRAVY), surface charge, and isoelectric point]. We propose a test method called linear regression tomography which is a top-down method using representation to explore the model's feature extraction capabilities. Our resources, including data and code, are publicly accessible at https://github.com/SkywalkerLuke/PhosF3C.
Citations: 0
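The PhosF3C abstract relies on Low-Rank Adaptation (LoRA), which freezes a weight matrix W and trains only a low-rank update, giving W' = W + (alpha / r) · B · A. The following is a minimal pure-Python sketch of that general idea. The helper names, the toy matrices, and the 1280 × 1280 layer size are assumptions for illustration and do not come from the PhosF3C repository.

```python
def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def lora_merge(weight, lora_b, lora_a, alpha):
    """Merge a LoRA update into a frozen weight: W + (alpha / r) * B @ A.

    lora_b has shape d_out x r and lora_a has shape r x d_in.
    """
    rank = len(lora_a)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


def trainable_params(d_out, d_in, rank):
    """Return (full fine-tuning, LoRA) parameter counts for one matrix."""
    return d_out * d_in, rank * (d_out + d_in)


merged = lora_merge([[1.0, 0.0], [0.0, 1.0]], [[1.0], [2.0]], [[0.5, -0.5]], alpha=2.0)
print(merged)                          # [[2.0, -1.0], [2.0, -1.0]]
print(trainable_params(1280, 1280, 8))  # (1638400, 20480)
```

The second call shows why the approach is cheap: for this hypothetical layer, a rank-8 update trains 20,480 parameters instead of 1,638,400.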
DECEPTICON: a correlation-based strategy for RNA-seq deconvolution inspired by a variation of the Anna Karenina principle.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics, vol. 26, issue 3. Pub Date: 2025-05-03. DOI: 10.1093/bib/bbaf234
Fulan Deng, Jiawei Zou, Miaochen Wang, Yida Gu, Jiale Wu, Lianchong Gao, Yuan Ji, Henry H Y Tong, Jie Chen, Wantao Chen, Lianjiang Tan, Yaoqing Chu, Xin Zou, Jie Hao
Abstract: Accurately deconvoluting cellular composition from bulk RNA-seq data is pivotal for understanding the tumor microenvironment and advancing precision medicine. Existing methods often struggle to consistently and accurately quantify cell types across heterogeneous RNA-seq datasets, particularly when ground truths are unavailable. In this study, we introduce DECEPTICON, a deconvolution strategy inspired by the Anna Karenina principle, which postulates that successful outcomes share common traits, while failures are more varied. DECEPTICON selects top-performing methods by leveraging correlations between different strategies and combines them dynamically to enhance performance. Our approach demonstrates superior accuracy in predicting cell-type proportions across multiple tumor datasets, improving correlation by 23.9% and reducing root mean square error by 73.5% compared to the best of 50 analyzed strategies. Applied to The Cancer Genome Atlas (TCGA) datasets for breast carcinoma, cervical squamous cell carcinoma, and lung adenocarcinoma, DECEPTICON-based predictions showed improved differentiation between patient prognoses. This correlation-based strategy offers a reliable, flexible tool for deconvoluting complex transcriptomic data and highlights its potential in refining prognostic assessments in oncology and advancing cancer biology.
Citations: 0
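The DECEPTICON abstract describes selecting methods by how well they correlate with one another and then combining them. The sketch below illustrates that general idea only and is not the published DECEPTICON algorithm: the agreement score (mean pairwise Pearson correlation), the `top_k` cut-off, the simple averaging step, and all method names and proportions are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def consensus(estimates, top_k=2):
    """Keep the top_k methods that agree most with the others, then average.

    estimates maps a (hypothetical) method name to its estimated
    cell-type proportions for one sample.
    """
    names = list(estimates)
    agreement = {}
    for name in names:
        scores = [pearson(estimates[name], estimates[other])
                  for other in names if other != name]
        agreement[name] = sum(scores) / len(scores)
    selected = sorted(names, key=agreement.get, reverse=True)[:top_k]
    n_types = len(estimates[selected[0]])
    averaged = [sum(estimates[m][i] for m in selected) / len(selected)
                for i in range(n_types)]
    return selected, averaged


# Made-up proportions for three cell types from three hypothetical methods.
toy = {
    "method_a": [0.50, 0.30, 0.20],
    "method_b": [0.52, 0.28, 0.20],
    "method_c": [0.10, 0.20, 0.70],
}
selected, averaged = consensus(toy)
print(sorted(selected))  # ['method_a', 'method_b']
```

No ground truth is needed here: the outlier `method_c` is dropped purely because it disagrees with the other two, which mirrors the intuition that good solutions resemble each other.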
MambaPhase: deep learning for liquid-liquid phase separation protein classification.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics, vol. 26, issue 3. Pub Date: 2025-05-03. DOI: 10.1093/bib/bbaf230
Jianwei Huang, Youli Zhang, Shulin Ren, Ziyang Wang, Xiaocheng Jin, Xiaoli Lu, Yu Zhang, Xiaoping Min, Shengxiang Ge, Jun Zhang, Ningshao Xia
Abstract: Liquid-liquid phase separation plays a critical role in cellular processes, including protein aggregation and RNA metabolism, by forming membraneless subcellular structures. Accurate identification of phase-separated proteins is essential for understanding and controlling these processes. Traditional identification methods are effective but often costly and time-consuming. The recent machine learning methods have reduced these costs, but most models are restricted to classifying scaffold and client proteins with limited experimental conditions. To address this limitation, we developed a Mamba-based encoder using contrastive learning that incorporates separation probability, protein type, and experimental conditions. Our model achieved 95.2% accuracy in predicting phase-separated proteins and an ROCAUC score of 0.87 in classifying scaffold and client proteins. Further validation in the DgHBP-2 drug delivery system demonstrated its potential for condition modulation in drug development. This study provides an effective framework for the accurate identification and control of phase separation, facilitating advancements in biomedical research and therapeutic applications.
Citations: 0
DeepRNA-Twist: language-model-guided RNA torsion angle prediction with attention-inception network.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics, vol. 26, issue 3. Pub Date: 2025-05-01. DOI: 10.1093/bib/bbaf199
Abrar Rahman Abir, Md Toki Tahmid, Rafiqul Islam Rayan, M Saifur Rahman
Abstract: RNA torsion and pseudo-torsion angles are critical in determining the three-dimensional conformation of RNA molecules, which in turn governs their biological functions. However, current methods are limited by RNA's structural complexity as well as flexibility, with experimental techniques being costly and computational approaches struggling to capture the intricate sequence dependencies needed for accurate predictions. To address these challenges, we introduce DeepRNA-Twist, a novel deep learning framework designed to predict RNA torsion and pseudo-torsion angles directly from sequence. DeepRNA-Twist utilizes RNA language model embeddings, which provides rich, context-aware feature representations of RNA sequences. Additionally, it introduces 2A3IDC module (Attention Augmented Inception Inside Inception with Dilated CNN), combining inception networks with dilated convolutions and multi-head attention mechanism. The dilated convolutions capture long-range dependencies in the sequence without requiring a large number of parameters, while the multi-head attention mechanism enhances the model's ability to focus on both local and global structural features simultaneously. DeepRNA-Twist was rigorously evaluated on benchmark datasets, including RNA-Puzzles, CASP-RNA, and SPOT-RNA-1D, and demonstrated significant improvements over existing methods, achieving state-of-the-art accuracy. Source code is available at https://github.com/abrarrahmanabir/DeepRNA-Twist.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12047705/pdf/
Citations: 0
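The DeepRNA-Twist abstract notes that dilated convolutions capture long-range dependencies without many extra parameters. A short sketch can make this concrete using the standard receptive-field formula for stacked stride-1 1D convolutions, RF = 1 + Σ (k − 1) · d. The function name, kernel size, and dilation schedule below are assumptions chosen for illustration; the abstract does not state the values used in the 2A3IDC module.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in sequence positions) of stacked stride-1 1D convolutions.

    Each layer with kernel size k and dilation d widens the field by (k - 1) * d.
    """
    field = 1
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


# Four layers of kernel size 3: same parameter count, very different reach.
print(receptive_field(3, [1, 1, 1, 1]))  # 9
print(receptive_field(3, [1, 2, 4, 8]))  # 31
```

Both stacks have the same number of weights per layer, but doubling the dilation at each layer lets the last layer see 31 nucleotides instead of 9.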
De-motif sampling: an approach to decompose hierarchical motifs with applications in T cell recognition.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics, vol. 26, issue 3. Pub Date: 2025-05-01. DOI: 10.1093/bib/bbaf221
Xinyi Tang, Ran Liu
Abstract: T cell immune recognition requires the interactions among antigen peptides, Major Histocompatibility Complex (MHC) molecules, and T cell receptors (TCRs). While research into the interactions between MHC and peptides is well established, the specific preferences of TCRs for peptides remain less understood. This gap largely stems from the requirement that antigen peptides must be bound to MHC and presented on the cell surface prior to recognition by TCRs. Typically, motifs related to TCR recognition are influenced by MHC characteristics, limiting the direct identification of TCR-specific motifs. To address this challenge, this study introduces a Bayesian method designed to decompose hierarchical motifs independently of MHC constraints. This model, rigorously tested through comprehensive simulation experiments and applied to real data, establishes a clear hierarchical structure for motifs related to T cell recognition.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12082833/pdf/
Citations: 0
Multi-objective computational optimization of human 5' UTR sequences.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics, vol. 26, issue 3. Pub Date: 2025-05-01. DOI: 10.1093/bib/bbaf225
Keisuke Yamada, Kanta Suga, Naoko Abe, Koji Hashimoto, Susumu Tsutsumi, Masahito Inagaki, Fumitaka Hashiya, Hiroshi Abe, Michiaki Hamada
Abstract: The computational design of messenger RNA (mRNA) sequences is a critical technology for both scientific research and industrial applications. Recent advances in prediction and optimization models have enabled the automatic scoring and optimization of 5' UTR sequences, key upstream elements of mRNA. However, fully automated design of 5' UTR sequences with more than two objective scores has not yet been explored. In this study, we present a computational pipeline that optimizes human 5' UTR sequences in a multi-objective framework, addressing up to four distinct and conflicting objectives. Our work represents an important advancement in the multi-objective computational design of mRNA sequences, paving the way for more sophisticated mRNA engineering.
Citations: 0
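When objectives conflict, as in the 5' UTR abstract, no single sequence is best on every score, so multi-objective optimizers typically report the Pareto front: the candidates that no other candidate beats on all objectives at once. The sketch below shows that filtering step only. It is not the authors' pipeline; the candidate names and the four score columns are hypothetical, and every objective is assumed to be maximized.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b everywhere and better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Return the candidates whose score vectors are not dominated by any other."""
    return {
        name: scores
        for name, scores in candidates.items()
        if not any(dominates(other_scores, scores)
                   for other, other_scores in candidates.items()
                   if other != name)
    }


# Hypothetical candidates, each with four made-up objective scores.
toy = {
    "utr_a": (0.9, 0.2, 0.5, 0.4),
    "utr_b": (0.6, 0.8, 0.5, 0.4),
    "utr_c": (0.5, 0.1, 0.4, 0.3),
}
print(sorted(pareto_front(toy)))  # ['utr_a', 'utr_b']
```

Here `utr_c` is dropped because `utr_a` matches or beats it on every objective, while `utr_a` and `utr_b` both survive because each wins on a different objective.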
Advances in multi-trait genomic prediction approaches: classification, comparative analysis, and perspectives.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics, vol. 26, issue 3. Pub Date: 2025-05-01. DOI: 10.1093/bib/bbaf211
Alain J Mbebi, Facundo Mercado, David Hobby, Hao Tong, Zoran Nikoloski
Abstract: Traits in any organism are not independent, but show considerable integration, observed in a form of couplings and trade-offs. Therefore, improvement in one trait may affect other traits, often in undesired direction. To account for this problem, crop breeding increasingly relies on multi-trait genomic prediction (MT-GP) approaches that leverage the availability of genetic markers from different populations along with advances in high-throughput precision phenotyping. While significant progress has been made to jointly model multiple traits using a variety of statistical and machine learning approaches, there is no systematic comparison of advantages and shortcomings of the existing classes of MT-GP models. Here, we fill this knowledge gap by first classifying the existing MT-GP models and briefly summarizing their general principles, modeling assumptions, and potential limitations. We then perform an extensive comparative analysis with 10 traits measured in an Oryza sativa diversity panel using cross-validation scenarios relevant in breeding practice. Finally, we discuss directions that can enable the building of next generation MT-GP models in addressing pressing challenges in crop breeding.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12070487/pdf/
Citations: 0
MMsurv: a multimodal multi-instance multi-cancer survival prediction model integrating pathological images, clinical information, and sequencing data.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics, vol. 26, issue 3. Pub Date: 2025-05-01. DOI: 10.1093/bib/bbaf209
Hailong Yang, Jia Wang, Wenyan Wang, Shufang Shi, Lijing Liu, Yuhua Yao, Geng Tian, Peizhen Wang, Jialiang Yang
Abstract: Accurate prediction of patient survival rates in cancer treatment is essential for effective therapeutic planning. Unfortunately, current models often underutilize the extensive multimodal data available, affecting confidence in predictions. This study presents MMSurv, an interpretable multimodal deep learning model to predict survival in different types of cancer. MMSurv integrates clinical information, sequencing data, and hematoxylin and eosin-stained whole-slide images (WSIs) to forecast patient survival. Specifically, we segment tumor regions from WSIs into image tiles and employ neural networks to encode each tile into one-dimensional feature vectors. We then optimize clinical features by applying word embedding techniques, inspired by natural language processing, to the clinical data. To better utilize the complementarity of multimodal data, this study proposes a novel fusion method, multimodal fusion method based on compact bilinear pooling and transformer, which integrates bilinear pooling with Transformer architecture. The fused features are then processed through a dual-layer multi-instance learning model to remove prognosis-irrelevant image patches and predict each patient's survival risk. Furthermore, we employ cell segmentation to investigate the cellular composition within the tiles that received high attention from the model, thereby enhancing its interpretive capacity. We evaluate our approach on six cancer types from The Cancer Genome Atlas. The results demonstrate that utilizing multimodal data leads to higher predictive accuracy compared to using single-modal image data, with an average C-index increase from 0.6750 to 0.7283. Additionally, we compare our proposed baseline model with state-of-the-art methods using the C-index and five-fold cross-validation approach, revealing a significant average improvement of nearly 10% in our model's performance.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12077396/pdf/
Citations: 0
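The MMsurv abstract reports its results as a C-index (concordance index), which measures how often a model ranks two patients' risks in the same order as their observed survival. A minimal sketch of Harrell's C-index with right-censoring follows. The function name and toy data are assumptions, tied event times are not given special treatment, and this is not the evaluation code used in the paper.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up time per patient
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only when patient i had an observed
            # event strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up cohort: the third patient is censored at time 6.
print(concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.4, 0.5, 0.1]))  # 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported rise from 0.6750 to 0.7283 in context.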
Multicancer analyses of short tandem repeat variations reveal shared gene regulatory mechanisms.
IF 6.8, Zone 2, Biology
Briefings in bioinformatics, vol. 26, issue 3. Pub Date: 2025-05-01. DOI: 10.1093/bib/bbaf219
Feifei Xia, Max Adriaan Verbiest, Oxana Lundström, Tugce Bilgin Sonay, Michael Baudis, Maria Anisimova
Abstract: Short tandem repeats (STRs) have been reported to influence gene expression across various human tissues. While STR variations are enriched in colorectal, stomach, and endometrial cancers, particularly in microsatellite instable tumors, their functional effects and regulatory mechanisms on gene expression remain poorly understood across these cancer types. Here, we leverage whole-exome sequencing and gene expression data to identify STRs for which repeat lengths are associated with the expression of nearby genes (eSTRs) in colorectal, stomach, and endometrial tumors. While most eSTRs are cancer-specific, shared eSTRs across multiple cancers exhibit consistent effects on gene expression. Notably, coding-region eSTRs identified in all three cancer types show positive correlations with nearby gene expression. We further validate the functional effects of eSTRs by demonstrating associations between somatic eSTR mutations and gene expression changes during the transition from normal to tumor tissues, suggesting their potential roles in tumorigenesis. Combined with DNA methylation data, we perform the first quantitative analysis of the interplay between STR variations and DNA methylation in tumors. We identify eSTRs where repeat lengths are associated with methylation levels of nearby CpG sites (meSTRs) and show that >70% of eSTRs are significantly linked to local DNA methylation. Importantly, the effects of meSTRs on DNA methylation remain consistent across cancer types. Overall, our findings enhance the understanding of how functional STR variations influence gene expression and DNA methylation. Our study highlights shared regulatory mechanisms of STRs across multiple cancers, offering a foundation for future research into their broader implications in tumor biology.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12096010/pdf/
Citations: 0