Dairy GPT: Empowering dairy farmers to interact with numerical databases through natural language conversations

Impact Factor 6.3 | JCR Q1, Agricultural Engineering
Danillo Gontijo, Douglas Rolins Santana, Gustavo de Assis Costa, Victor E. Cabrera, Eduardo Noronha de Andrade Freitas
Smart Agricultural Technology, vol. 12, Article 101097. Journal article, published 2025-06-16.
DOI: 10.1016/j.atech.2025.101097
https://www.sciencedirect.com/science/article/pii/S2772375525003302
Citations: 0

Abstract

Large language models (LLMs), such as GPT-4, have revolutionized artificial intelligence by enabling intuitive text and voice interactions, simplifying complex tasks, and democratizing access to AI-driven tools. However, one of their primary limitations is their restricted ability to handle interactions with strictly numerical data. This limitation has led to solutions such as Retrieval-Augmented Generation (RAG) and Natural Language to SQL (NL2SQL), which extend the applicability of LLMs to data-intensive domains. This study investigated the feasibility of using LLMs to let dairy farmers interact with purely numerical databases in natural language. To support the study, we constructed a dataset of 25,925 daily milk production records from 85 cows, derived from real data collected at the University of Wisconsin-Madison Agricultural Research Station. Three analysis pipelines were proposed to assess how effectively LLMs handle numerical databases: Prompt Engineering (zero-shot), Retrieval-Augmented Generation (RAG), and NL2SQL with Decomposition. Each was evaluated on a set of five quantitative and five qualitative questions. On these 10 questions, NL2SQL with Decomposition achieved 80% accuracy on the quantitative questions, and the zero-shot pipeline achieved 100% on the qualitative questions. These results demonstrate the potential of LLMs to enhance data utilization in dairy farming. Future work will focus on refining the proposed methods and extending them to other livestock applications.
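To make the NL2SQL with Decomposition idea concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes a hypothetical table `milk_records(cow_id, record_date, milk_kg)` filled with a few invented rows (not taken from the study's dataset), and it replaces the LLM with a hard-coded list of sub-questions and SQL statements, the kind of output a decomposition step might produce for the question "Which cow has the highest average daily milk yield?".

```python
import sqlite3


def build_db():
    """Create an in-memory database with a hypothetical schema and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE milk_records (cow_id INTEGER, record_date TEXT, milk_kg REAL)"
    )
    # Invented illustration data, NOT records from the paper's dataset.
    toy_rows = [
        (1, "2024-01-01", 30.0),
        (1, "2024-01-02", 32.0),
        (2, "2024-01-01", 25.0),
        (2, "2024-01-02", 27.0),
    ]
    conn.executemany("INSERT INTO milk_records VALUES (?, ?, ?)", toy_rows)
    return conn


# Stand-in for the LLM: each sub-question is paired with the SQL that a
# decomposition step might generate. In a real pipeline these pairs would
# come from a model call, which is omitted here.
STEPS = [
    (
        "What is each cow's average daily yield?",
        "CREATE TEMP VIEW cow_avg AS "
        "SELECT cow_id, AVG(milk_kg) AS avg_kg "
        "FROM milk_records GROUP BY cow_id",
    ),
    (
        "Which cow has the highest average?",
        "SELECT cow_id, avg_kg FROM cow_avg ORDER BY avg_kg DESC LIMIT 1",
    ),
]


def run_steps(conn, steps):
    """Execute the sub-queries in order and return the rows of the last one."""
    result = []
    for _sub_question, sql in steps:
        result = conn.execute(sql).fetchall()
    return result


answer = run_steps(build_db(), STEPS)
print(answer)
```

Splitting the question into an aggregation step and a selection step keeps each generated query simple, which is the intuition behind decomposition: the database does the arithmetic, and the language model only has to produce small, checkable SQL statements.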
Source journal metrics: CiteScore 4.20; self-citation rate 0.00%; articles published: 0