AgAsk: an agent to help answer farmer’s questions from scientific documents

IF 1.6 Q2 INFORMATION SCIENCE & LIBRARY SCIENCE
Bevan Koopman, Ahmed Mourad, Hang Li, Anton van der Vegt, Shengyao Zhuang, Simon Gibson, Yash Dang, David Lawrence, Guido Zuccon
{"title":"AgAsk: an agent to help answer farmer’s questions from scientific documents","authors":"Bevan Koopman, Ahmed Mourad, Hang Li, Anton van der Vegt, Shengyao Zhuang, Simon Gibson, Yash Dang, David Lawrence, Guido Zuccon","doi":"10.1007/s00799-023-00369-y","DOIUrl":null,"url":null,"abstract":"Abstract Decisions in agriculture are increasingly data-driven. However, valuable agricultural knowledge is often locked away in free-text reports, manuals and journal articles. Specialised search systems are needed that can mine agricultural information to provide relevant answers to users’ questions. This paper presents AgAsk—an agent able to answer natural language agriculture questions by mining scientific documents. We carefully survey and analyse farmers’ information needs. On the basis of these needs, we release an information retrieval test collection comprising real questions, a large collection of scientific documents split in passages, and ground truth relevance assessments indicating which passages are relevant to each question. We implement and evaluate a number of information retrieval models to answer farmers questions, including two state-of-the-art neural ranking models. We show that neural rankers are highly effective at matching passages to questions in this context. Finally, we propose a deployment architecture for AgAsk that includes a client based on the Telegram messaging platform and retrieval model deployed on commodity hardware. The test collection we provide is intended to stimulate more research in methods to match natural language to answers in scientific documents. While the retrieval models were evaluated in the agriculture domain, they are generalisable and of interest to others working on similar problems. The test collection is available at: https://github.com/ielab/agvaluate .","PeriodicalId":44974,"journal":{"name":"International Journal on Digital Libraries","volume":"127 1","pages":"0"},"PeriodicalIF":1.6000,"publicationDate":"2023-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal on Digital Libraries","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s00799-023-00369-y","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"INFORMATION SCIENCE & LIBRARY SCIENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Abstract Decisions in agriculture are increasingly data-driven. However, valuable agricultural knowledge is often locked away in free-text reports, manuals and journal articles. Specialised search systems are needed that can mine agricultural information to provide relevant answers to users’ questions. This paper presents AgAsk—an agent able to answer natural language agriculture questions by mining scientific documents. We carefully survey and analyse farmers’ information needs. On the basis of these needs, we release an information retrieval test collection comprising real questions, a large collection of scientific documents split in passages, and ground truth relevance assessments indicating which passages are relevant to each question. We implement and evaluate a number of information retrieval models to answer farmers questions, including two state-of-the-art neural ranking models. We show that neural rankers are highly effective at matching passages to questions in this context. Finally, we propose a deployment architecture for AgAsk that includes a client based on the Telegram messaging platform and retrieval model deployed on commodity hardware. The test collection we provide is intended to stimulate more research in methods to match natural language to answers in scientific documents. While the retrieval models were evaluated in the agriculture domain, they are generalisable and of interest to others working on similar problems. The test collection is available at: https://github.com/ielab/agvaluate .

Abstract Image

AgAsk:帮助农民从科学文献中回答问题的代理
农业决策越来越多地由数据驱动。然而,宝贵的农业知识往往被锁在自由文本报告、手册和期刊文章中。需要专门的搜索系统来挖掘农业信息,为用户的问题提供相关的答案。本文提出了一种能够通过挖掘科学文献来回答自然语言农业问题的智能体asask。我们认真调查和分析农民的信息需求。在这些需求的基础上,我们发布了一个信息检索测试集,包括真实问题,大量科学文献的片段,以及表明哪些段落与每个问题相关的基础真相相关性评估。我们实施和评估了一些信息检索模型来回答农民的问题,包括两个最先进的神经排序模型。我们表明,在这种情况下,神经排序器在匹配段落和问题方面非常有效。最后,我们提出了一个AgAsk的部署体系结构,其中包括基于Telegram消息平台的客户端和部署在商用硬件上的检索模型。我们提供的测试集旨在激发更多的研究方法,将自然语言与科学文献中的答案相匹配。虽然这些检索模型是在农业领域进行评估的,但它们是可推广的,并且对研究类似问题的其他人很感兴趣。测试集可在:https://github.com/ielab/agvaluate上获得。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
4.30
自引率
6.70%
发文量
20
期刊介绍: The International Journal on Digital Libraries (IJDL) examines the theory and practice of acquisition definition organization management preservation and dissemination of digital information via global networking. It covers all aspects of digital libraries (DLs) from large-scale heterogeneous data and information management & access to linking and connectivity to security privacy and policies to its application use and evaluation.The scope of IJDL includes but is not limited to: The FAIR principle and the digital libraries infrastructure Findable: Information access and retrieval; semantic search; data and information exploration; information navigation; smart indexing and searching; resource discovery Accessible: visualization and digital collections; user interfaces; interfaces for handicapped users; HCI and UX in DLs; Security and privacy in DLs; multimodal access Interoperable: metadata (definition management curation integration); syntactic and semantic interoperability; linked data Reusable: reproducibility; Open Science; sustainability profitability repeatability of research results; confidentiality and privacy issues in DLs Digital Library Architectures including heterogeneous and dynamic data management; data and repositories Acquisition of digital information: authoring environments for digital objects; digitization of traditional content Digital Archiving and Preservation Digital Preservation and curation Digital archiving Web Archiving Archiving and preservation Strategies AI for Digital Libraries Machine Learning for DLs Data Mining in DLs NLP for DLs Applications of Digital Libraries Digital Humanities Open Data and their reuse Scholarly DLs (incl. bibliometrics altmetrics) Epigraphy and Paleography Digital Museums Future trends in Digital Libraries Definition of DLs in a ubiquitous digital library world Datafication of digital collections Interaction and user experience (UX) in DLs Information visualization Collection understanding Privacy and security Multimodal user interfaces Accessibility (or "Access for users with disabilities") UX studies
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信