Empowering scientific discovery with explainable small domain-specific and large language models

IF 13.9 · CAS Zone 2, Computer Science · JCR Q1, Computer Science, Artificial Intelligence
Hengjie Yu, Yizhi Wang, Tao Cheng, Yan Yan, Kenneth A. Dawson, Sam F. Y. Li, Yefeng Zheng, Yaochu Jin
{"title":"Empowering scientific discovery with explainable small domain-specific and large language models","authors":"Hengjie Yu,&nbsp;Yizhi Wang,&nbsp;Tao Cheng,&nbsp;Yan Yan,&nbsp;Kenneth A. Dawson,&nbsp;Sam F. Y. Li,&nbsp;Yefeng Zheng,&nbsp;Yaochu Jin","doi":"10.1007/s10462-025-11365-w","DOIUrl":null,"url":null,"abstract":"<div><p>As artificial intelligence (AI) increasingly integrates into scientific research, explainability has become a cornerstone for ensuring reliability and innovation in discovery processes. This review offers a forward-looking integration of explainable AI (XAI)-based research paradigms, encompassing small domain-specific models, large language models (LLMs), and agent-based large-small model collaboration. For domain-specific models, we introduce a knowledge-oriented taxonomy categorizing methods into knowledge-agnostic, knowledge-based, knowledge-infused, and knowledge-verified approaches, emphasizing the balance between domain knowledge and innovative insights. For LLMs, we examine three strategies for integrating domain knowledge—prompt engineering, retrieval-augmented generation, and supervised fine-tuning—along with advances in explainability, including local, global, and conversation-based explanations. We also envision future agent-based model collaborations within automated laboratories, stressing the need for context-aware explanations tailored to research goals. Additionally, we discuss the unique characteristics and limitations of both explainable small domain-specific models and LLMs in the realm of scientific discovery. Finally, we highlight methodological challenges, potential pitfalls, and the necessity of rigorous validation to ensure XAI’s transformative role in accelerating scientific discovery and reshaping research paradigms.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 12","pages":""},"PeriodicalIF":13.9000,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-025-11365-w.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence Review","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10462-025-11365-w","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

As artificial intelligence (AI) increasingly integrates into scientific research, explainability has become a cornerstone for ensuring reliability and innovation in discovery processes. This review offers a forward-looking integration of explainable AI (XAI)-based research paradigms, encompassing small domain-specific models, large language models (LLMs), and agent-based large-small model collaboration. For domain-specific models, we introduce a knowledge-oriented taxonomy categorizing methods into knowledge-agnostic, knowledge-based, knowledge-infused, and knowledge-verified approaches, emphasizing the balance between domain knowledge and innovative insights. For LLMs, we examine three strategies for integrating domain knowledge—prompt engineering, retrieval-augmented generation, and supervised fine-tuning—along with advances in explainability, including local, global, and conversation-based explanations. We also envision future agent-based model collaborations within automated laboratories, stressing the need for context-aware explanations tailored to research goals. Additionally, we discuss the unique characteristics and limitations of both explainable small domain-specific models and LLMs in the realm of scientific discovery. Finally, we highlight methodological challenges, potential pitfalls, and the necessity of rigorous validation to ensure XAI’s transformative role in accelerating scientific discovery and reshaping research paradigms.
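
The distinction the abstract draws between local and global explanations for small domain-specific models can be made concrete with a short sketch. The following Python example is our illustration, not code from the paper: it fits a random-forest surrogate on synthetic data (the "solubility-style" feature names are hypothetical) and derives a local explanation for a single prediction and a global feature ranking with SHAP.

```python
# Illustrative sketch only (not from the paper): local vs. global
# post-hoc explanation of a small domain-specific model with SHAP.
# The synthetic dataset and feature names are hypothetical.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
feature_names = ["logP", "mol_weight", "h_donors", "h_acceptors"]  # hypothetical
X = rng.normal(size=(200, 4))
y = 0.8 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # shape (200, 4) for regression

# Local explanation: per-feature attributions for a single prediction.
print(dict(zip(feature_names, np.round(shap_values[0], 3))))

# Global explanation: mean |SHAP| ranks features across the dataset.
global_importance = np.abs(shap_values).mean(axis=0)
for name, imp in sorted(zip(feature_names, global_importance),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```

TreeSHAP is exact for tree ensembles, which is one reason SHAP is a common default for tabular domain models; conversation-based explanations, which the abstract lists as an LLM-side advance, would layer natural-language dialogue on top of attributions like these.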

Source journal

Artificial Intelligence Review (Engineering & Technology - Computer Science: Artificial Intelligence)
CiteScore: 22.00
Self-citation rate: 3.30%
Annual publications: 194
Review time: 5.3 months

Journal introduction: Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.