LigExtract: Large-scale Automated Identification of Ligands from Protein Structures in the Protein Data Bank.

IF 7.9

Genomics, Proteomics & Bioinformatics. Pub Date: 2025-02-28. DOI: 10.1093/gpbjnl/qzaf018

Natália Aniceto, Nuno Martinho, Ismael Rufino, Rita C Guedes


Citations: 0

Abstract



The Protein Data Bank (PDB) is an ever-growing database of 3D macromolecular structures that has become a crucial resource for drug discovery. Exploring protein complexes and accessing their ligands is paramount to helping researchers understand biological processes and design new compounds of pharmaceutical interest. However, currently available tools for large-scale ligand identification do not address many of the more complex ways in which ligands are stored and represented in PDB structures. LigExtract was therefore developed specifically for the large-scale processing of PDB structures and the identification of their ligands. It is a fully open-source tool that provides end-to-end processing: the user supplies a list of UniProt IDs, and LigExtract returns a list of ligands, their individual PDB files, a PDB file of the protein chains engaged with the ligand, and a series of log files. These logs record the decisions made during ligand extraction and flag additional scenarios that may need to be considered in any follow-up use of the processed files (e.g., ligands covalently bound to the protein). LigExtract is available on GitHub (https://github.com/comp-medchem/LigExtract).
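For orientation, the simplest form of ligand identification can be sketched as below. This is a minimal Python illustration, not LigExtract's implementation: it assumes that ligands appear only as HETATM records in legacy fixed-column PDB text, and the function name `group_hetatm_residues`, the `SOLVENT` set, and the example records are all hypothetical. Ligands stored in other ways would be missed by a filter like this, which is the kind of gap the abstract describes.

```python
from collections import defaultdict

# Residue names treated as solvent; an illustrative choice, not LigExtract's list.
SOLVENT = {"HOH", "DOD"}


def group_hetatm_residues(pdb_text):
    """Group HETATM atom names by (residue name, chain ID, residue number).

    Relies on the fixed-column layout of legacy PDB files:
    atom name in columns 13-16, residue name in 18-20,
    chain ID in 22, residue sequence number in 23-26.
    """
    groups = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue  # polymer ATOM records and all other record types are skipped
        res_name = line[17:20].strip()
        if res_name in SOLVENT:
            continue  # drop water molecules
        chain_id = line[21]
        res_seq = int(line[22:26])
        atom_name = line[12:16].strip()
        groups[(res_name, chain_id, res_seq)].append(atom_name)
    return dict(groups)


# Made-up records for illustration only; "LIG" is a placeholder residue name.
EXAMPLE = "\n".join([
    "ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 20.00           N",
    "HETATM  901  C1  LIG A 301      10.000  10.000  10.000  1.00 15.00           C",
    "HETATM  902  O1  LIG A 301      11.200  10.000  10.000  1.00 15.00           O",
    "HETATM  903  O   HOH A 401       5.000   5.000   5.000  1.00 30.00           O",
])

candidates = group_hetatm_residues(EXAMPLE)
print(candidates)
```

Each key in the returned dictionary is one candidate ligand instance, and the value lists its atom names. A filter this naive cannot tell drug-like ligands from ions, buffer components, or modified residues, so it should be read only as a baseline.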


Source journal

Genomics, proteomics & bioinformatics

Self-citation rate

0.00%

Number of articles published