Automated Retrosynthesis Planning of Macromolecules Using Large Language Models and Knowledge Graphs

IF 4.3 · Zone 3 (Chemistry) · JCR Q2 POLYMER SCIENCE
Qinyu Ma, Yuhao Zhou, Jianfeng Li
Macromolecular Rapid Communications, e2500065 · Journal Article · Published 2025-02-27 · DOI: 10.1002/marc.202500065 (https://doi.org/10.1002/marc.202500065)
Citations: 0

Abstract


Identifying reliable synthesis pathways in materials chemistry is a complex task, particularly in polymer science, due to the intricate and often nonunique nomenclature of macromolecules. To address this challenge, an agent system that integrates large language models (LLMs) and knowledge graphs is proposed. By leveraging LLMs' powerful capabilities for extracting and recognizing chemical substance names, and storing the extracted data in a structured knowledge graph, the system fully automates the retrieval of relevant literature, extraction of reaction data, database querying, construction of retrosynthetic pathway trees, further expansion through the retrieval of additional literature and recommendation of optimal reaction pathways. By considering the complex interdependencies among chemical reactants, a novel Multi-branched Reaction Pathway Search Algorithm (MBRPS) is proposed to help identify all valid multi-branched reaction pathways, which arise when a single product decomposes into multiple reaction intermediates. In contrast, previous studies are limited to cases where a product decomposes into at most one reaction intermediate. This work represents the first attempt to develop a fully automated retrosynthesis planning agent tailored specially for macromolecules powered by LLMs. Applied to polyimide synthesis, the new approach constructs a retrosynthetic pathway tree with hundreds of pathways and recommends optimized routes, including both known and novel pathways.
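
The idea of a multi-branched pathway search can be illustrated with a minimal sketch. This is not the authors' MBRPS implementation, which the abstract does not detail. It is a plain recursive enumeration over a toy reaction graph, under these assumptions: the knowledge graph is reduced to a dictionary mapping each product to its alternative reactions, every substance name and the `enumerate_pathways` and `leaves` helpers are hypothetical, and a pathway counts as valid only when every branch ends in a starting material.

```python
from itertools import product

# Toy knowledge graph (hypothetical data, not extracted from any paper):
# product -> list of alternative reactions, each given as a tuple of reactants.
REACTIONS = {
    "polymer": [("monomer_A", "monomer_B")],
    "monomer_A": [("precursor_A1",), ("precursor_A2", "reagent_X")],
    "monomer_B": [("precursor_B1",), ("precursor_B2",)],
    "precursor_A2": [("monomer_A",)],  # deliberate cycle back to monomer_A
    "precursor_B2": [("feedstock_C",)],
}
# Substances assumed to be available without further synthesis.
STARTING_MATERIALS = {"precursor_A1", "precursor_B1", "reagent_X", "feedstock_C"}


def enumerate_pathways(target, reactions, starting, max_depth=5, _visiting=frozenset()):
    """Return every valid pathway tree for `target`.

    A pathway is a nested tuple: (substance, None) for a starting material,
    or (substance, (sub_pathway, ...)) for a substance made by one reaction.
    """
    if target in starting:
        return [(target, None)]
    if max_depth == 0 or target in _visiting:
        return []  # too deep, or the route loops back on itself
    pathways = []
    for reactants in reactions.get(target, []):
        # Every reactant of the chosen reaction must itself be reachable.
        options = [
            enumerate_pathways(r, reactions, starting, max_depth - 1, _visiting | {target})
            for r in reactants
        ]
        # Multi-branch step: combine the alternatives of all reactants.
        # If any reactant has no valid sub-pathway, product() yields nothing.
        for combo in product(*options):
            pathways.append((target, combo))
    return pathways


def leaves(pathway):
    """Collect the starting materials at the leaves of one pathway tree."""
    substance, children = pathway
    if children is None:
        return {substance}
    return set().union(*(leaves(child) for child in children))


routes = enumerate_pathways("polymer", REACTIONS, STARTING_MATERIALS)
for i, route in enumerate(routes, start=1):
    print(f"route {i}: starts from {sorted(leaves(route))}")
```

In this toy graph the target decomposes into two intermediates, so each complete route has to resolve both branches. The second reaction for `monomer_A` is discarded because it loops back to `monomer_A`, while `monomer_B` keeps both of its alternatives, giving two complete routes. A real system would additionally rank the routes, which this sketch does not attempt.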

Source journal
Macromolecular Rapid Communications (Engineering & Technology: Polymer Science)
CiteScore: 7.70
Self-citation rate: 6.50%
Articles published: 477
Review time: 1.4 months
About the journal: Macromolecular Rapid Communications publishes original research in polymer science, ranging from chemistry and physics of polymers to polymers in materials science and life sciences.