Daisy: An integrated repeat protein curation service
Manuel Bezerra-Brandao, Ronaldo Romario Tunque Cahui, Layla Hirsh
Journal of Structural Biology, volume 215, issue 4, Article 108033 (published 2023-10-03)
DOI: 10.1016/j.jsb.2023.108033
https://www.sciencedirect.com/science/article/pii/S1047847723000965
Abstract
The identification, classification and curation of tandem repeats in proteins is a complex process that requires manual work by experts, computing power and time. Recent advances in machine learning for protein structure prediction and repeat classification are useful for this process. However, no existing service brings together the databases and software needed to support research on repeat proteins. In this publication we present Daisy, an integrated repeat protein curation web service. The service can process Protein Data Bank (PDB) and AlphaFold Database entries to identify tandem repeats. In addition, it searches a sequence against a library of Pfam hidden Markov models (HMMs). Repeat classifications are associated with the identified families through RepeatsDB. The resulting prediction is used to guide the execution of the ReUPred algorithm and to speed up the identification of repeat units. The service can also process every PDB and AlphaFold structure associated with a UniProt proteome record.
Availability: The Daisy web service is freely accessible at daisy.bioinformatica.org.
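The flow described in the abstract (Pfam HMM search, then a family-to-class lookup through RepeatsDB, then a guided ReUPred run) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `FAMILY_TO_CLASS` table uses placeholder identifiers rather than real RepeatsDB entries, `PfamHit`, `predict_repeat_class` and `plan_unit_search` are hypothetical helpers, and the abstract does not say how Daisy passes the prediction to ReUPred, so "try the predicted class first" is only one plausible reading.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

# Hypothetical Pfam-family -> repeat-class table. The identifiers and class
# labels are placeholders for illustration, not values taken from RepeatsDB.
FAMILY_TO_CLASS = {
    "PF_EXAMPLE_1": "class-A",
    "PF_EXAMPLE_2": "class-B",
}


@dataclass(frozen=True)
class PfamHit:
    family: str     # family identifier reported by the HMM search
    e_value: float  # lower means a more significant match


def predict_repeat_class(hits: Sequence[PfamHit],
                         max_e_value: float = 1e-3) -> Optional[str]:
    """Return the class of the most significant hit that has a known mapping."""
    candidates = [
        h for h in hits
        if h.e_value <= max_e_value and h.family in FAMILY_TO_CLASS
    ]
    if not candidates:
        return None
    best = min(candidates, key=lambda h: h.e_value)
    return FAMILY_TO_CLASS[best.family]


def plan_unit_search(predicted: Optional[str],
                     all_classes: Sequence[str]) -> List[str]:
    """Order the classes to try: the predicted one first, the rest unchanged."""
    if predicted is None or predicted not in all_classes:
        return list(all_classes)
    return [predicted] + [c for c in all_classes if c != predicted]


if __name__ == "__main__":
    hits = [PfamHit("PF_EXAMPLE_1", 1e-5), PfamHit("PF_EXAMPLE_2", 1e-20)]
    predicted = predict_repeat_class(hits)
    print(plan_unit_search(predicted, ["class-A", "class-B", "class-C"]))
```

The point of the sketch is the design choice rather than the details: a cheap sequence-level prediction narrows or reorders the structure-level search, so the expensive step is attempted on the most likely class before the others.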
About the journal:
Journal of Structural Biology (JSB) has an open access mirror journal, the Journal of Structural Biology: X (JSBX), sharing the same aims and scope, editorial team, submission system and rigorous peer review. Since both journals share the same editorial system, you may submit your manuscript via either journal homepage. You will be prompted during submission (and revision) to choose in which to publish your article. The editors and reviewers are not aware of the choice you made until the article has been published online. JSB and JSBX publish papers dealing with the structural analysis of living material at every level of organization by all methods that lead to an understanding of biological function in terms of molecular and supermolecular structure.
Techniques covered include:
• Light microscopy including confocal microscopy
• All types of electron microscopy
• X-ray diffraction
• Nuclear magnetic resonance
• Scanning force microscopy, scanning probe microscopy, and tunneling microscopy
• Digital image processing
• Computational insights into structure