Empowering scientific discovery with explainable small domain-specific and large language models

IF 13.9 · CAS Zone 2, Computer Science · JCR Q1, Computer Science, Artificial Intelligence
Hengjie Yu, Yizhi Wang, Tao Cheng, Yan Yan, Kenneth A. Dawson, Sam F. Y. Li, Yefeng Zheng, Yaochu Jin
{"title":"Empowering scientific discovery with explainable small domain-specific and large language models","authors":"Hengjie Yu,&nbsp;Yizhi Wang,&nbsp;Tao Cheng,&nbsp;Yan Yan,&nbsp;Kenneth A. Dawson,&nbsp;Sam F. Y. Li,&nbsp;Yefeng Zheng,&nbsp;Yaochu Jin","doi":"10.1007/s10462-025-11365-w","DOIUrl":null,"url":null,"abstract":"<div><p>As artificial intelligence (AI) increasingly integrates into scientific research, explainability has become a cornerstone for ensuring reliability and innovation in discovery processes. This review offers a forward-looking integration of explainable AI (XAI)-based research paradigms, encompassing small domain-specific models, large language models (LLMs), and agent-based large-small model collaboration. For domain-specific models, we introduce a knowledge-oriented taxonomy categorizing methods into knowledge-agnostic, knowledge-based, knowledge-infused, and knowledge-verified approaches, emphasizing the balance between domain knowledge and innovative insights. For LLMs, we examine three strategies for integrating domain knowledge—prompt engineering, retrieval-augmented generation, and supervised fine-tuning—along with advances in explainability, including local, global, and conversation-based explanations. We also envision future agent-based model collaborations within automated laboratories, stressing the need for context-aware explanations tailored to research goals. Additionally, we discuss the unique characteristics and limitations of both explainable small domain-specific models and LLMs in the realm of scientific discovery. Finally, we highlight methodological challenges, potential pitfalls, and the necessity of rigorous validation to ensure XAI’s transformative role in accelerating scientific discovery and reshaping research paradigms.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 12","pages":""},"PeriodicalIF":13.9000,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-025-11365-w.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence Review","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10462-025-11365-w","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

As artificial intelligence (AI) increasingly integrates into scientific research, explainability has become a cornerstone for ensuring reliability and innovation in discovery processes. This review offers a forward-looking integration of explainable AI (XAI)-based research paradigms, encompassing small domain-specific models, large language models (LLMs), and agent-based large-small model collaboration. For domain-specific models, we introduce a knowledge-oriented taxonomy categorizing methods into knowledge-agnostic, knowledge-based, knowledge-infused, and knowledge-verified approaches, emphasizing the balance between domain knowledge and innovative insights. For LLMs, we examine three strategies for integrating domain knowledge—prompt engineering, retrieval-augmented generation, and supervised fine-tuning—along with advances in explainability, including local, global, and conversation-based explanations. We also envision future agent-based model collaborations within automated laboratories, stressing the need for context-aware explanations tailored to research goals. Additionally, we discuss the unique characteristics and limitations of both explainable small domain-specific models and LLMs in the realm of scientific discovery. Finally, we highlight methodological challenges, potential pitfalls, and the necessity of rigorous validation to ensure XAI’s transformative role in accelerating scientific discovery and reshaping research paradigms.
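
The distinction the abstract draws between local and global explanations for small domain-specific models can be made concrete with a short sketch. The following Python example is our illustration, not code from the paper: it fits a random-forest surrogate on synthetic data (the "solubility-style" feature names are hypothetical) and derives a local explanation for a single prediction and a global feature ranking with SHAP.

```python
# Illustrative sketch only (not from the paper): local vs. global
# post-hoc explanation of a small domain-specific model with SHAP.
# The synthetic dataset and feature names are hypothetical.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
feature_names = ["logP", "mol_weight", "h_donors", "h_acceptors"]  # hypothetical
X = rng.normal(size=(200, 4))
y = 0.8 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # shape (200, 4) for regression

# Local explanation: per-feature attributions for a single prediction.
print(dict(zip(feature_names, np.round(shap_values[0], 3))))

# Global explanation: mean |SHAP| ranks features across the dataset.
global_importance = np.abs(shap_values).mean(axis=0)
for name, imp in sorted(zip(feature_names, global_importance),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```

TreeSHAP is exact for tree ensembles, which is one reason SHAP is a common default for tabular domain models; conversation-based explanations, which the abstract lists as an LLM-side advance, would layer natural-language dialogue on top of attributions like these.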

Source journal

Artificial Intelligence Review (Engineering & Technology - Computer Science: Artificial Intelligence)
CiteScore: 22.00
Self-citation rate: 3.30%
Annual publications: 194
Review time: 5.3 months

Journal introduction: Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.