Agricultural large language model for standardized production of distinctive agricultural products

IF 8.9 · CAS Zone 1 (Agricultural and Forestry Sciences) · JCR Q1 AGRICULTURE, MULTIDISCIPLINARY

Computers and Electronics in Agriculture · Pub Date: 2025-03-12 · DOI: 10.1016/j.compag.2025.110218

Wenlong Yi, Li Zhang, Sergey Kuzmin, Igor Gerasimov, Muhua Liu

Computers and Electronics in Agriculture, Volume 234, Article 110218 (Journal Article). https://www.sciencedirect.com/science/article/pii/S0168169925003242

Citations: 0

Abstract

Standardization of specialty agricultural products is diverse in scope, complex and cumbersome to develop, and slow to draft. In addition, standardization documents become outdated, and general large language models hallucinate because their access to agricultural domain information lags. To address these problems, this study constructs a multi-stage cascaded large language model based on a hybrid retrieval-augmented mechanism. The model comprises three core modules: (1) a multi-source retrieval augmentation module that acquires external knowledge comprehensively through vector retrieval, keyword retrieval, and knowledge graph retrieval branches; (2) a knowledge fusion module that filters redundant information using inverse ranking fusion and graph structure pruning, so that only high-quality knowledge is injected; (3) a domain adaptation module that improves the model’s understanding of agricultural terminology through vertical domain fine-tuning. Experimental results show that, on the standardization document summarization task, the model achieves chrF, BERTscore, and Gscore values of 34.85, 74.88, and 39.85, respectively, improvements of 59.52%, 35.28%, and 72.84% over the BART baseline model and of 58.54%, 24.25%, and 59.54% over the T5 model. This study enriches the theoretical foundation of large language models in agriculture and provides intelligent technical support for the development of specialty agricultural product standards.
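The knowledge fusion step can be illustrated with a minimal sketch. It assumes that the "inverse ranking fusion" named in the abstract corresponds to the widely used reciprocal rank fusion (RRF) technique; the abstract does not confirm this. The function name, the document IDs, and the constant k = 60 (the conventional RRF default) are illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1. k = 60 is the conventional default, an assumption
    here rather than a value reported by the authors.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from the three retrieval branches (IDs are made up).
vector_hits = ["std_tea_03", "std_citrus_11", "std_rice_07"]
keyword_hits = ["std_citrus_11", "std_tea_03", "std_honey_02"]
graph_hits = ["std_citrus_11", "std_rice_07"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits, graph_hits])
print(fused)
```

A document that ranks well in several branches rises to the top, while one found by a single branch sinks, which is one plausible way to suppress redundant or weakly supported passages before they are injected into the prompt.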

Source journal

Computers and Electronics in Agriculture (Engineering and Technology: Computer Science, Interdisciplinary Applications)

CiteScore

15.30

Self-citation rate

14.50%

Articles published

800

Review time

62 days

Journal description: Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and application notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics like agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.