Journal of Molecular Biology: Latest Articles

BeetleAtlas: An Ontogenetic and Tissue-specific Transcriptomic Atlas of the Red Flour Beetle Tribolium castaneum
IF 4.7, Zone 2, Biology
Journal of Molecular Biology Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168520
Abstract: The red flour beetle Tribolium castaneum has emerged as a powerful model in insect functional genomics. However, a major limitation in the field is the lack of a detailed spatio-temporal view of the genetic signatures underpinning the function of distinct tissues and life stages. Here, we present an ontogenetic and tissue-specific web-based resource for Tribolium transcriptomics: BeetleAtlas (https://www.beetleatlas.org). This web application provides access to a database populated with quantitative expression data for nine adult and seven larval tissues, as well as for four embryonic stages of Tribolium. BeetleAtlas allows one to search for individual Tribolium genes to obtain values of both total gene expression and enrichment in different tissues, together with data for individual isoforms. To facilitate cross-species studies, one can also use Drosophila melanogaster gene identifiers to search for related Tribolium genes. For retrieved genes there are options to identify and display the tissue expression of related Tribolium genes or homologous Drosophila genes. Five additional search modes are available to find genes conforming to any of the following criteria: exhibiting high expression in a particular tissue; showing significant differences in expression between larva and adult; having a peak of expression at a specific stage of embryonic development; belonging to a particular functional category; and displaying a pattern of tissue expression similar to that of a query gene. We illustrate how the different features of BeetleAtlas can be used to illuminate our understanding of the genetic mechanisms underpinning the biology of what is the largest animal group on earth.
PDF: https://www.sciencedirect.com/science/article/pii/S0022283624001074/pdfft?md5=d43ca113651c6eb642cd449e3aad0011&pid=1-s2.0-S0022283624001074-main.pdf
Citations: 0
NNDB: An Expanded Database of Nearest Neighbor Parameters for Predicting Stability of Nucleic Acid Secondary Structures
IF 4.7, Zone 2, Biology
Journal of Molecular Biology Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168549
Abstract: Nearest neighbor thermodynamic parameters are widely used for RNA and DNA secondary structure prediction and to model thermodynamic ensembles of secondary structures. The Nearest Neighbor Database (NNDB) is a freely available web resource (https://rna.urmc.rochester.edu/NNDB) that provides the functional forms, parameter values, and example calculations. The NNDB provides the 1999 and 2004 sets of RNA folding nearest neighbor parameters. We expanded the database to include a set of DNA parameters and a set of RNA parameters that includes m6A in addition to the canonical RNA nucleobases. The site was redesigned using the Quarto open-source publishing system. A downloadable PDF version of the complete resource and downloadable sets of nearest neighbor parameters are available.
PDF: https://www.sciencedirect.com/science/article/pii/S002228362400144X/pdfft?md5=0e570dd624af08cd501a916c5bb24b55&pid=1-s2.0-S002228362400144X-main.pdf
Citations: 0
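The nearest neighbor idea behind the NNDB entry above can be illustrated with a minimal Python sketch. It assumes a simplified, helix-only form in which the folding free energy is an initiation term plus one stacking term for each pair of adjacent base pairs. The function name helix_free_energy and every numeric value below are hypothetical placeholders chosen for illustration, not NNDB parameters; the functional forms documented by NNDB contain further terms that this sketch leaves out.

```python
from typing import Dict, List, Tuple

# A base pair is written as a two-letter string, e.g. "GC" for G paired with C.
Stack = Tuple[str, str]


def helix_free_energy(
    pairs: List[str],
    stack_params: Dict[Stack, float],
    initiation: float,
) -> float:
    """Sum an initiation term and one stacking term per adjacent base-pair step.

    Simplified assumption: only helix stacking and initiation are modeled.
    """
    total = initiation
    # Each consecutive pair of base pairs contributes one nearest neighbor term.
    for first, second in zip(pairs, pairs[1:]):
        total += stack_params[(first, second)]
    return total


# Placeholder values only (NOT real nearest neighbor parameters).
example_params: Dict[Stack, float] = {
    ("GC", "AU"): -2.0,
    ("AU", "CG"): -1.0,
}
example_helix = ["GC", "AU", "CG"]

if __name__ == "__main__":
    print(helix_free_energy(example_helix, example_params, initiation=4.0))
```

Real calculations should take both the functional form and the parameter values from the downloadable NNDB parameter sets rather than from this sketch.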
RNA3DB: A structurally-dissimilar dataset split for training and benchmarking deep learning models for RNA structure prediction
IF 4.7, Zone 2, Biology
Journal of Molecular Biology Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168552
Abstract: With advances in protein structure prediction thanks to deep learning models like AlphaFold, RNA structure prediction has recently received increased attention from deep learning researchers. RNAs introduce substantial challenges due to the sparser availability and lower structural diversity of the experimentally resolved RNA structures in comparison to protein structures. These challenges are often poorly addressed by the existing literature, much of which reports inflated performance due to using training and testing sets with significant structural overlap. Further, the most recent Critical Assessment of Structure Prediction (CASP15) has shown that deep learning models for RNA structure are currently outperformed by traditional methods.
In this paper we present RNA3DB, a dataset of structured RNAs, derived from the Protein Data Bank (PDB), that is designed for training and benchmarking deep learning models. The RNA3DB method arranges the RNA 3D chains into distinct groups (Components) that are non-redundant with regard to both sequence and structure, providing a robust way of dividing training, validation, and testing sets. Any split of these structurally-dissimilar Components is guaranteed to produce test and validation sets that are distinct by sequence and structure from those in the training set. We provide the RNA3DB dataset, a particular train/test split of the RNA3DB Components (in an approximate 70/30 ratio) that will be updated periodically. We also provide the RNA3DB methodology along with the source-code, with the goal of creating a reproducible and customizable tool for producing structurally-dissimilar dataset splits for structural RNAs.
PDF: https://www.sciencedirect.com/science/article/pii/S0022283624001475/pdfft?md5=5530a074f00756a90477518772fa34fc&pid=1-s2.0-S0022283624001475-main.pdf
Citations: 0
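The component-wise splitting described in the RNA3DB entry above can be sketched in Python as follows. This is an illustration under stated assumptions, not the RNA3DB source code: the function split_by_component, its greedy assignment rule, and the toy component and chain names are all hypothetical. The property it is meant to show is that a component is never divided between the training and test sets, so the test fraction is only approximate.

```python
import random
from typing import Dict, List, Tuple


def split_by_component(
    components: Dict[str, List[str]],
    test_fraction: float = 0.3,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Assign whole components to train or test; never split a component.

    Hypothetical greedy rule: visit components in shuffled order and place a
    component in the test set only if it still fits under the target size.
    """
    rng = random.Random(seed)
    names = sorted(components)  # sort first so the shuffle is reproducible
    rng.shuffle(names)

    total_chains = sum(len(chains) for chains in components.values())
    target = test_fraction * total_chains

    train: List[str] = []
    test: List[str] = []
    for name in names:
        chains = components[name]
        if len(test) + len(chains) <= target:
            test.extend(chains)
        else:
            train.extend(chains)
    return train, test


# Toy input: component and chain identifiers are made up for illustration.
toy_components: Dict[str, List[str]] = {
    "component_1": ["chain_a", "chain_b", "chain_c"],
    "component_2": ["chain_d", "chain_e"],
    "component_3": ["chain_f"],
    "component_4": ["chain_g", "chain_h", "chain_i", "chain_j"],
}

if __name__ == "__main__":
    train_chains, test_chains = split_by_component(toy_components, seed=1)
    print("train:", train_chains)
    print("test:", test_chains)
```

Because whole components move together, the realized ratio depends on component sizes; that is consistent with the abstract describing its 70/30 split as approximate.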
E-pRSA: Embeddings Improve the Prediction of Residue Relative Solvent Accessibility in Protein Sequence
IF 4.7, Zone 2, Biology
Journal of Molecular Biology Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168494
Abstract: Knowledge of the solvent accessibility of residues in a protein is essential for different applications, including the identification of interacting surfaces in protein–protein interactions and the characterization of variations. We describe E-pRSA, a novel web server to estimate Relative Solvent Accessibility values (RSAs) of residues directly from a protein sequence. The method exploits two complementary Protein Language Models to provide fast and accurate predictions. When benchmarked on different blind test sets, E-pRSA performs at the state of the art and outperforms a previous method we developed, DeepREx, which was based on sequence profiles after Multiple Sequence Alignments. The E-pRSA web server is freely available at https://e-prsa.biocomp.unibo.it/main/ where users can submit single-sequence and batch jobs.
PDF: https://www.sciencedirect.com/science/article/pii/S0022283624000664/pdfft?md5=5479e98c4394e85085ec9ab992a70ec7&pid=1-s2.0-S0022283624000664-main.pdf
Citations: 0
CHARMM-GUI PDB Reader and Manipulator: Covalent Ligand Modeling and Simulation
IF 4.7, Zone 2, Biology
Journal of Molecular Biology Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168554
Abstract: Molecular modeling and simulation serve an important role in exploring biological functions of proteins at the molecular level, which is complementary to experiments. CHARMM-GUI (https://www.charmm-gui.org) is a web-based graphical user interface that generates complex molecular simulation systems and input files, and we have been continuously developing and expanding its functionalities to facilitate various complex molecular modeling and make molecular dynamics simulations more accessible to the scientific community. Covalent drug discovery is currently emerging as a popular and important field. A covalent drug forms a chemical bond with specific residues on the target protein, and it has advantages in potency because of its prolonged inhibition effects. Although demand is growing for modeling PDB protein structures with various covalent ligand types, proper modeling of covalent ligands remains challenging. This work presents a new functionality in CHARMM-GUI PDB Reader & Manipulator that can handle a diversity of ligand-amino acid linkage types, which is validated by a careful benchmark study using over 1,000 covalent ligand structures in RCSB PDB. We hope that this new functionality can boost the modeling and simulation study of covalent ligands.
PDF: https://www.sciencedirect.com/science/article/pii/S0022283624001499/pdfft?md5=c22bc6a24892229f4d80acb3c293965e&pid=1-s2.0-S0022283624001499-main.pdf
Citations: 0
mACPpred 2.0: Stacked Deep Learning for Anticancer Peptide Prediction with Integrated Spatial and Probabilistic Feature Representations
IF 4.7, Zone 2, Biology
Journal of Molecular Biology Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168687
Abstract: Anticancer peptides (ACPs) are naturally occurring molecules with remarkable potential to target and kill cancer cells. However, identifying ACPs based solely on their primary amino acid sequences remains a major hurdle in immunoinformatics. In the past, several web-based machine learning (ML) tools have been proposed to assist researchers in identifying potential ACPs for further testing. Notably, our meta-approach method, mACPpred, introduced in 2019, has significantly advanced the field of ACP research. Given the exponential growth in the number of characterized ACPs, there is now a pressing need to create an updated version of mACPpred. To develop mACPpred 2.0, we constructed an up-to-date benchmarking dataset by integrating all publicly available ACP datasets. We employed a large set of feature descriptors, encompassing both conventional feature descriptors and advanced pre-trained natural language processing (NLP)-based embeddings. We evaluated their ability to discriminate between ACPs and non-ACPs using eleven different classifiers. Subsequently, we employed a stacked deep learning (SDL) approach, incorporating 1D convolutional neural network (1D CNN) blocks and hybrid features. These features included the top seven performing NLP-based features and 90 probabilistic features, allowing us to identify hidden patterns within these diverse features and improve the accuracy of our ACP prediction model. This is the first study to integrate spatial and probabilistic feature representations for predicting ACPs. Rigorous cross-validation and independent tests conclusively demonstrated that mACPpred 2.0 not only surpassed its predecessor (mACPpred) but also outperformed the existing state-of-the-art predictors, highlighting the importance of advanced feature representation capabilities attained through SDL. To facilitate widespread use and accessibility, we have developed a user-friendly web server for mACPpred 2.0, available at https://balalab-skku.org/mACPpred2/.
PDF: https://www.sciencedirect.com/science/article/pii/S0022283624002894/pdfft?md5=ecdf80bb684910ec5433145962a8f247&pid=1-s2.0-S0022283624002894-main.pdf
Citations: 0
Enricherator: A Bayesian Method for Inferring Regularized Genome-wide Enrichments from Sequencing Count Data
IF 4.7, Zone 2, Biology
Journal of Molecular Biology Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168567
Abstract: A pervasive question in biological research studying gene regulation, chromatin structure, or genomics is where, and to what extent, does a signal of interest arise genome-wide? This question is addressed using a variety of methods relying on high-throughput sequencing data as their final output, including ChIP-seq for protein-DNA interactions [1], GapR-seq for measuring supercoiling [2], and HBD-seq or DRIP-seq for R-loop positioning [3, 4]. Current computational methods to calculate genome-wide enrichment of the signal of interest usually do not properly handle the count-based nature of sequencing data, they often do not make use of the local correlation structure of sequencing data, and they do not apply any regularization of enrichment estimates. This can result in unrealistic estimates of the true underlying biological enrichment of interest, unrealistically low estimates of confidence in point estimates of enrichment (or no estimates of confidence at all), unrealistic gyrations in enrichment estimates at very close (less than 10 bp) genomic loci due to noise inherent in sequencing data, and in a multiple-hypothesis testing problem during interpretation of genome-wide enrichment estimates. We developed a tool called Enricherator to infer genome-wide enrichments from sequencing count data. Enricherator uses the variational Bayes algorithm to fit a generalized linear model to sequencing count data and to sample from the approximate posterior distribution of enrichment estimates (https://github.com/jwschroeder3/enricherator). Enrichments inferred by Enricherator more precisely identify known binding sites in cases where low coverage between binding sites leads to false-positive peak calls in these noisy regions of the genome; these benefits extend to published datasets.
PDF: https://www.sciencedirect.com/science/article/pii/S0022283624001621/pdfft?md5=12eadc9303ecf2b7325490d62b957d44&pid=1-s2.0-S0022283624001621-main.pdf
Citations: 0
NucMap 2.0: An Updated Database of Genome-wide Nucleosome Positioning Maps Across Species
IF 4.7, Zone 2, Biology
Journal of Molecular Biology Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168655
Abstract: Nucleosome dynamics plays important roles in many biological processes, such as DNA replication and gene expression. NucMap (https://ngdc.cncb.ac.cn/nucmap) is the first database of genome-wide nucleosome positioning maps across species. Here, we present an updated version, NucMap 2.0, by incorporating more species and MNase-seq samples. In addition, we integrate other related omics data for each MNase-seq sample to provide a comprehensive view of nucleosome positioning, such as gene expression, transcription factor binding sites, histone modifications and DNA methylation. In particular, NucMap 2.0 integrates and pre-analyzes RNA-seq data and ChIP-seq data of human-related samples, which facilitates the interpretation of nucleosome positioning in humans. All processed data are integrated into an in-built genome browser, and users can make comprehensive side-by-side analyses. In addition, more online analytical functions have been developed, which allow researchers to identify differential nucleosome regions and explore potential gene regulatory regions. All resources are open access with a user-friendly web interface.
PDF: https://www.sciencedirect.com/science/article/pii/S002228362400250X/pdfft?md5=05c0ca9f6c37361600fa1c82182f3970&pid=1-s2.0-S002228362400250X-main.pdf
Citations: 0
PPInterface: A Comprehensive Dataset of 3D Protein-Protein Interface Structures
IF 4.7, Zone 2, Biology
Journal of Molecular Biology Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168686
Abstract: The PPInterface dataset contains 815,082 interface structures, providing the most comprehensive structural information on protein–protein interfaces. This resource is extracted from over 215,000 three-dimensional protein structures stored in the Protein Data Bank (PDB). The dataset contains a wide range of protein complexes, providing a wealth of information for researchers investigating the structural properties of protein–protein interactions. The accompanying web server has a user-friendly interface that allows for efficient search and download functions. Researchers can access detailed information on protein interface structures, visualize them, and explore a variety of features, increasing the dataset’s utility and accessibility.
The dataset and web server can be found at https://3dpath.ku.edu.tr/PPInt/.
PDF: https://www.sciencedirect.com/science/article/pii/S0022283624002882/pdfft?md5=8c06cb4d0f228da90e95d1e5dc422504&pid=1-s2.0-S0022283624002882-main.pdf
Citations: 0
CATH 2024: CATH-AlphaFlow Doubles the Number of Structures in CATH and Reveals Nearly 200 New Folds
IF 4.7, Zone 2, Biology
Journal of Molecular Biology Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168551
Abstract: CATH (https://www.cathdb.info) classifies domain structures from experimental protein structures in the PDB and predicted structures in the AlphaFold Database (AFDB). To cope with the scale of the predicted data, a new NextFlow workflow (CATH-AlphaFlow) has been developed to classify high-quality domains into CATH superfamilies and identify novel fold groups and superfamilies. CATH-AlphaFlow uses a novel state-of-the-art structure-based domain boundary prediction method (ChainSaw) for identifying domains in multi-domain proteins. We applied CATH-AlphaFlow to process PDB structures not classified in CATH and AFDB structures from 21 model organisms, expanding CATH by over 100%.
Domains not classified in existing CATH superfamilies or fold groups were used to seed novel folds, giving 253 new folds from PDB structures (September 2023 release) and 96 from AFDB structures of proteomes of 21 model organisms. Where possible, functional annotations were obtained using (i) predictions from publicly available methods and (ii) annotations from structural relatives in AFDB/UniProt50. We also predicted functional sites and highly conserved residues. Some folds are associated with important functions such as photosynthetic acclimation (in flowering plants), iron permease activity (in fungi) and post-natal spermatogenesis (in mice).
CATH-AlphaFlow will allow us to identify many more CATH relatives in the AFDB, further characterising the protein structure landscape.
PDF: https://www.sciencedirect.com/science/article/pii/S0022283624001463/pdfft?md5=7f042c9d519839cc743c6f8330403192&pid=1-s2.0-S0022283624001463-main.pdf
Citations: 0