Dairy GPT: Empowering dairy farmers to interact with numerical databases through natural language conversations

IF 5.7 Q1 AGRICULTURAL ENGINEERING

Smart agricultural technology. Pub Date: 2025-06-16. DOI: 10.1016/j.atech.2025.101097

Danillo Gontijo, Douglas Rolins Santana, Gustavo de Assis Costa, Victor E. Cabrera, Eduardo Noronha de Andrade Freitas

Smart agricultural technology, volume 12, Article 101097. https://www.sciencedirect.com/science/article/pii/S2772375525003302

Citations: 0

Abstract

Large language models (LLMs) such as GPT-4 have revolutionized artificial intelligence by enabling intuitive text and voice interactions, simplifying complex tasks, and democratizing access to AI-driven tools. However, one of their primary limitations lies in their limited ability to handle interactions with strictly numerical data. This limitation has led to innovative solutions such as Retrieval-Augmented Generation (RAG) and Natural Language to SQL (NL2SQL), which enhance their applicability in data-intensive domains. This study investigated the feasibility of using LLMs to let dairy farmers interact with purely numerical databases in natural language. To support the study, we constructed a dataset of 25,925 daily milk production records from 85 cows, derived from real data collected at the University of Wisconsin-Madison Agricultural Research Station. Three analysis pipelines were proposed to assess how effectively LLMs handle numerical databases: Prompt Engineering (zero-shot), Retrieval-Augmented Generation (RAG), and NL2SQL with Decomposition. They were evaluated using a set of 10 questions, 5 quantitative and 5 qualitative. NL2SQL with Decomposition achieved 80% accuracy on the quantitative questions, and the zero-shot pipeline achieved 100% on the qualitative questions. These results demonstrate the potential of LLMs to enhance data utilization in dairy farming. Future work will focus on refining the proposed methods and extending them to other livestock applications.
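
To make the NL2SQL with Decomposition idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical table `milk_production` with columns `cow_id`, `record_date`, and `milk_kg`, filled with toy values invented for illustration (they are not taken from the study's dataset). In an actual pipeline an LLM would generate the SQL for each sub-question; here the statements are written by hand so the example runs without any model.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical schema and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE milk_production ("
        "cow_id INTEGER, record_date TEXT, milk_kg REAL)"
    )
    # Toy values for illustration only; not data from the paper.
    rows = [
        (1, "2024-01-01", 30.0),
        (1, "2024-01-02", 32.0),
        (2, "2024-01-01", 25.0),
        (2, "2024-01-02", 27.0),
    ]
    conn.executemany("INSERT INTO milk_production VALUES (?, ?, ?)", rows)
    return conn


# Question: "Which cow has the highest average daily milk yield?"
# Decomposed into two simpler sub-questions, each mapped to one SQL statement.
# These strings stand in for what an LLM would be prompted to produce.
DECOMPOSED_STEPS = [
    (
        "What is the average daily yield of each cow?",
        "CREATE TEMP VIEW IF NOT EXISTS avg_per_cow AS "
        "SELECT cow_id, AVG(milk_kg) AS avg_kg "
        "FROM milk_production GROUP BY cow_id",
    ),
    (
        "Which cow has the largest of those averages?",
        "SELECT cow_id, avg_kg FROM avg_per_cow "
        "ORDER BY avg_kg DESC LIMIT 1",
    ),
]


def answer_question(conn):
    """Run the sub-queries in order and return the final (cow_id, avg_kg) row."""
    rows = []
    for _sub_question, sql in DECOMPOSED_STEPS:
        rows = conn.execute(sql).fetchall()
    return rows[0]


if __name__ == "__main__":
    cow_id, avg_kg = answer_question(build_demo_db())
    print(f"Cow {cow_id} has the highest average yield: {avg_kg:.1f} kg/day")
```

The point of the design is that the numerical work (grouping, averaging, ranking) is done by the database engine, so the language model only has to translate each sub-question into SQL rather than compute over raw numbers itself.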

Source journal

Smart agricultural technology

CiteScore

4.20

Self-citation rate

0.00%

Number of publications