Genomics, proteomics & bioinformatics: Latest Articles

Centromere Landscapes Resolved from Hundreds of Human Genomes.
Genomics, proteomics & bioinformatics Pub Date : 2024-12-03 DOI: 10.1093/gpbjnl/qzae071
Shenghan Gao, Yimeng Zhang, Stephen J Bush, Bo Wang, Xiaofei Yang, Kai Ye
Abstract: High-fidelity (HiFi) sequencing has facilitated the assembly and analysis of the most repetitive region of the genome, the centromere. Nevertheless, our current understanding of human centromeres is based on a relatively small number of telomere-to-telomere assemblies, which have not yet captured their full diversity. In this study, we investigated the genomic diversity of human centromere higher order repeats (HORs) via both HiFi reads and haplotype-resolved assemblies from hundreds of samples drawn from ongoing pangenome-sequencing projects and reprocessed them via a novel HOR annotation pipeline, HiCAT-human. We used this wealth of data to provide a global survey of the centromeric HOR landscape; in particular, we found that 23 HORs presented significant copy number variability between populations. We detected three centromere genotypes with unbalanced population frequencies on chromosomes 5, 8, and 17. An inter-assembly comparison of HOR loci further revealed that while HOR array structures are diverse, they nevertheless tend to form a number of specific landscapes, each exhibiting different levels of HOR subunit expansion and possibly reflecting a cyclical evolutionary transition from homogeneous to nested structures and back.
Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11652271/pdf/
Citations: 0
DeOri 10.0: An Updated Database of Experimentally Identified Eukaryotic Replication Origins.
Genomics, proteomics & bioinformatics Pub Date : 2024-12-03 DOI: 10.1093/gpbjnl/qzae076
Yu-Hao Zeng, Zhen-Ning Yin, Hao Luo, Feng Gao
Abstract: DNA replication is a complex and crucial biological process in eukaryotes. To facilitate the study of eukaryotic replication events, we present a database of eukaryotic DNA replication origins (DeOri), which collects the genome-wide data on eukaryotic DNA replication origins currently available. With the rapid development of high-throughput experimental technology in recent years, the number of datasets in the new release, DeOri 10.0, increased from 10 to 151 and the number of sequences increased from 16,145 to 9,742,396. Besides nucleotide sequences and browser extensible data (BED) files, corresponding annotation files, such as coding sequences (CDSs), mRNAs, and other biological elements within replication origins, are also provided. The experimental techniques used for each dataset, as well as related statistical data, are also presented on the web page. Differences in experimental methods, cell lines, and sequencing technologies have resulted in distinct replication origins, making it challenging to differentiate between cell-specific and non-specific replication origins. Based on multiple replication origin datasets at the species level, we scored and screened replication origins in Homo sapiens, Gallus gallus, Mus musculus, Drosophila melanogaster, and Caenorhabditis elegans. The screened regions with high scores were considered species-conservative origins, which are integrated and presented as reference replication origins (rORIs). Additionally, we analyzed the distribution of relevant genomic elements associated with replication origins at the genome level, such as CpG islands (CGIs), transcription start sites (TSSs), and G-quadruplexes (G4s). These analysis results can be browsed and downloaded as needed at http://tubic.tju.edu.cn/deori/.
Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11652270/pdf/
Citations: 0
Multiome-wide Association Studies: Novel Approaches for Understanding Diseases.
Genomics, proteomics & bioinformatics Pub Date : 2024-12-03 DOI: 10.1093/gpbjnl/qzae077
Mengting Shao, Kaiyang Chen, Shuting Zhang, Min Tian, Yan Shen, Chen Cao, Ning Gu
Abstract: The rapid development of multiome (transcriptome, proteome, cistrome, imaging, and regulome)-wide association study methods has opened new avenues for biologists to understand the susceptibility genes underlying complex diseases. Thorough comparisons of these methods are essential for selecting the most appropriate tool for a given research objective. This review provides a detailed categorization and summary of the statistical models, use cases, and advantages of recent multiome-wide association studies. In addition, to illustrate gene-disease association studies based on transcriptome-wide association study (TWAS), we collected 478 disease entries across 22 categories from 235 manually reviewed publications. Our analysis reveals that mental disorders are the diseases most frequently studied by TWAS, indicating its potential to deepen our understanding of the genetic architecture of complex diseases. In summary, this review underscores the importance of multiome-wide association studies in elucidating complex diseases and highlights the significance of selecting the appropriate method for each study.
Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11630051/pdf/
Citations: 0
Identify Non-mutational p53 Functional Deficiency in Human Cancers.
Genomics, proteomics & bioinformatics Pub Date : 2024-12-03 DOI: 10.1093/gpbjnl/qzae064
Qianpeng Li, Yang Zhang, Sicheng Luo, Zhang Zhang, Ann L Oberg, David E Kozono, Hua Lu, Jann N Sarkaria, Lina Ma, Liguo Wang
Abstract: An accurate assessment of p53's functional status is critical for cancer genomic medicine. However, there is a significant challenge in identifying tumors with non-mutational p53 inactivation, which is not detectable through DNA sequencing. These undetected cases are often misclassified as p53-normal, leading to inaccurate prognosis and downstream association analyses. To address this issue, we built support vector machine (SVM) models to systematically reassess p53's functional status in TP53 wild-type (TP53WT) tumors from multiple The Cancer Genome Atlas (TCGA) cohorts. Cross-validation demonstrated the good performance of the SVM models, with a mean area under the receiver operating characteristic curve (AUROC) of 0.9822, precision of 0.9747, and recall of 0.9784. Our study revealed that a significant proportion (87%-99%) of TP53WT tumors actually had compromised p53 function. Additional analyses uncovered that these genetically intact but functionally impaired (termed predictively reduced function of p53, or TP53WT-pRF) tumors exhibited genomic and pathophysiologic features akin to TP53-mutant tumors: heightened genomic instability and elevated levels of hypoxia. Clinically, patients with TP53WT-pRF tumors experienced significantly shortened overall survival or progression-free survival compared to those with predictively normal function of p53 (TP53WT-pN) tumors, and these patients also displayed increased sensitivity to platinum-based chemotherapy and radiation therapy.
Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11702981/pdf/
Citations: 0
Evolution of Plant Genome Size and Composition.
Genomics, proteomics & bioinformatics Pub Date : 2024-12-03 DOI: 10.1093/gpbjnl/qzae078
Bing He, Wanfei Liu, Jianyang Li, Siwei Xiong, Jing Jia, Qiang Lin, Hailin Liu, Peng Cui
Abstract: The rapid development of sequencing technology has led to an explosion of plant genome data, opening up more opportunities for research in the field of comparative evolutionary analysis of plant genomes. In this review, we focus on changes in plant genome size and composition, examining the effects of polyploidy, whole-genome duplication, and alterations in transposable elements on plant genome architecture and evolution, respectively. In addition, to address gaps in the available information, we also collected and analyzed data from 234 representative plant genomes as a supplement. We aim to provide a comprehensive, up-to-date summary of information on plant genome architecture and evolution in this review.
Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11630846/pdf/
Citations: 0
DRED: A Comprehensive Database of Genes Related to Repeat Expansion Diseases.
Genomics, proteomics & bioinformatics Pub Date : 2024-12-03 DOI: 10.1093/gpbjnl/qzae068
Qingqing Shi, Min Dai, Yingke Ma, Jun Liu, Xiuying Liu, Xiu-Jie Wang
Abstract: Expansion of tandem repeats in genes often causes severe diseases, such as fragile X syndrome, Huntington's disease, and spinocerebellar ataxia. However, information on genes associated with repeat expansion diseases is scattered throughout the literature, and systematic prediction of potential genes that may cause diseases via repeat expansion is also lacking. Here, we develop DRED, a Database of genes related to Repeat Expansion Diseases, as a manually curated database that covers all 61 known genes related to repeat expansion diseases reported in PubMed and OMIM, along with the detailed repeat information for each gene. DRED also includes 516 genes with the potential to cause diseases via repeat expansion, which were predicted based on their repeat composition, genetic variations, genomic features, and disease associations. Various types of information on repeat expansion diseases and their corresponding genes/repeats are presented in DRED, together with links to external resources, such as NCBI and ClinVar. DRED provides user-friendly interfaces with comprehensive functions, and can serve as a central data resource for basic research and repeat expansion disease-related medical diagnosis. DRED is freely accessible at http://omicslab.genetics.ac.cn/dred, and will be frequently updated to include newly reported genes related to repeat expansion diseases.
Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11696699/pdf/
Citations: 0
Enhancing Variant Calling in Whole-exome Sequencing Data Using Population-matched Reference Genomes.
Genomics, proteomics & bioinformatics Pub Date : 2024-12-03 DOI: 10.1093/gpbjnl/qzae070
Shuming Guo, Zhuo Huang, Yanming Zhang, Yukun He, Xiangju Chen, Wenjuan Wang, Lansheng Li, Yu Kang, Zhancheng Gao, Jun Yu, Zhenglin Du, Yanan Chu
Abstract: Whole-exome sequencing (WES) data are frequently used for cancer diagnosis and genome-wide association studies (GWAS), based on high-coverage read mapping, informative variant calling, and high-quality reference genomes. The central position of the currently used genome assembly, GRCh38, is now challenged by two newly published telomere-to-telomere (T2T) genomes, T2T-CHM13 and T2T-YAO, and a comparative study is urgently needed to test population specificity using the three reference genomes on real-case WES data. Here, we report our analysis along this line for 19 tumor samples collected from Chinese patients. The primary comparison of the exon regions among the three references reveals that the sequences in up to ~1% of target regions in T2T-YAO are widely diversified from GRCh38 and may lead to off-target sequence capture. However, T2T-YAO still outperforms GRCh38 by obtaining 7.41% more mapped reads. Owing to more reliable read mapping and a closer phylogenetic relationship with the samples than GRCh38, T2T-YAO reduces by half the variant calls of clinical significance, which are mostly benign, while maintaining sensitivity in identifying pathogenic variants. T2T-YAO also outperforms T2T-CHM13 in reducing calls of Chinese-specific variants. Our findings highlight the critical need for employing population-specific reference genomes in genomic analysis to ensure accurate variant analysis and the significant benefits of tailoring these approaches to the unique genetic background of each ethnic group.
Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11687947/pdf/
Citations: 0
RAG-seq: NSR-primed and Transposase Tagmentation-mediated Strand-specific Total RNA Sequencing in Single Cells.
Genomics, proteomics & bioinformatics Pub Date : 2024-12-03 DOI: 10.1093/gpbjnl/qzae072
Ping Xu, Zhiheng Yuan, Xiaohua Lu, Peng Zhou, Ding Qiu, Zhenghao Qiao, Zhongcheng Zhou, Li Guan, Yongkang Jia, Xuan He, Ling Sun, Youzhong Wan, Ming Wang, Yang Yu
Abstract: Single-cell RNA sequencing (scRNA-seq) has transformed our understanding of cellular diversity with unprecedented resolution. However, many current methods are limited in capturing full-length transcripts and discerning strand orientation. Here, we present RAG-seq, an innovative strand-specific total RNA sequencing technique that combines not-so-random (NSR) primers with Tn5 transposase-mediated tagmentation. RAG-seq overcomes previous limitations by delivering comprehensive transcript coverage and maintaining strand orientation, which are essential for accurate quantification of overlapping genes and detection of antisense transcripts. Through optimized reverse transcription with oligo-dT primers, rRNA depletion via Depletion of Abundant Sequences by Hybridization (DASH), and linear amplification, RAG-seq enhances sensitivity and reproducibility, especially for low-input samples and single cells. Application to mouse oocytes and early embryos highlights RAG-seq's superior performance in identifying stage-specific antisense transcripts, shedding light on their regulatory roles during early development. This advancement represents a significant leap in transcriptome analysis within complex biological contexts.
Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11658833/pdf/
Citations: 0
RNA 5-Methylcytosine Modification: Regulatory Molecules, Biological Functions, and Human Diseases.
Genomics, proteomics & bioinformatics Pub Date : 2024-12-03 DOI: 10.1093/gpbjnl/qzae063
Yanfang Lu, Liu Yang, Qi Feng, Yong Liu, Xiaohui Sun, Dongwei Liu, Long Qiao, Zhangsuo Liu
Abstract: RNA methylation modifications influence gene expression, and disruptions of these processes are often associated with various human diseases. The common RNA methylation modification 5-methylcytosine (m5C), which is dynamically regulated by writers, erasers, and readers, widely occurs in transfer RNAs (tRNAs), messenger RNAs (mRNAs), ribosomal RNAs (rRNAs), enhancer RNAs (eRNAs), and other non-coding RNAs (ncRNAs). RNA m5C modification regulates metabolism, stability, nuclear export, and translation of RNA molecules. An increasing number of studies have revealed the critical roles of the m5C RNA modification and its regulators in the development, diagnosis, prognosis, and treatment of various human diseases. In this review, we summarized the recent studies on RNA m5C modification and discussed the advances in its detection methodologies, distribution, and regulators. Furthermore, we addressed the significance of RNAs modified with m5C marks in essential biological processes as well as in the development of various human disorders, from neurological diseases to cancers. This review provides a new perspective on the diagnosis, treatment, and monitoring of human diseases by elucidating the complex regulatory network of the epigenetic m5C modification.
Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11634542/pdf/
Citations: 0
iMFP-LG: Identification of Novel Multi-Functional Peptides by Using Protein Language Models and Graph-Based Deep Learning.
Genomics, proteomics & bioinformatics Pub Date : 2024-11-25 DOI: 10.1093/gpbjnl/qzae084
Jiawei Luo, Kejuan Zhao, Junjie Chen, Caihua Yang, Fuchuan Qu, Yumeng Liu, Xiaopeng Jin, Ke Yan, Yang Zhang, Bin Liu
Abstract: Functional peptides are short amino acid fragments that have a wide range of beneficial functions for living organisms. The majority of previous research focused on mono-functional peptides, but a growing number of multi-functional peptides have been discovered. Although there have been enormous experimental efforts to assay multi-functional peptides, only a small fraction of millions of known peptides have been explored. Effective and precise techniques for identifying multi-functional peptides can facilitate their discovery and mechanistic understanding. In this article, we present iMFP-LG, a method for identifying multi-functional peptides based on protein language models (pLMs) and graph attention networks (GATs). Comparison results showed that iMFP-LG outperforms state-of-the-art methods on both multi-functional bioactive peptide and multi-functional therapeutic peptide datasets. The interpretability of iMFP-LG was also illustrated by visualizing attention patterns in pLMs and GATs. Given the outstanding performance of iMFP-LG in identifying multi-functional peptides, we employed iMFP-LG to screen novel candidate peptides with both anticancer peptide (ACP) and antimicrobial peptide (AMP) functions from millions of known peptides in UniRef90. As a result, 8 candidate peptides were identified, and 1 candidate that exhibits both antibacterial and anticancer effects was confirmed through molecular structure alignment and biological experiments. We anticipate that iMFP-LG can assist in the discovery of multi-functional peptides and contribute to the advancement of peptide drug design.
Publication type: Journal Article.
Citations: 0