Relation Extraction for Knowledge Graph Generation in the Agriculture Domain: A Case Study on Soybean Pests and Disease

IF 0.9 | Zone 4, Agricultural and Forestry Sciences | Q4 AGRICULTURAL ENGINEERING

Applied Engineering in Agriculture Pub Date : 2023-01-01 DOI:10.13031/aea.15124

Pengxiang Wang, Cong-Xuan Zhang, Dingqian Wang, Shaohua Zhang, Jun Wang, Xianzhi Wang, Lan Huang


Citations: 0

Abstract

Highlights

To reduce the burden of acquiring expert knowledge and to strengthen the connection between written knowledge and the field, this article investigated the problem of automatically extracting and organizing knowledge about soybean pests and disease from text.

Entities and relations were extracted using multiple models with deep neural network structures. The performance of these models was compared and evaluated in detail.

A knowledge graph was automatically constructed from the extracted information and made publicly available.

ABSTRACT. Precision agriculture is an emerging type of agriculture that makes intensive use of information technology to automate agricultural production. Soybean (Glycine max (L.) Merr.) is an important crop in China, with an annual demand of approximately 110 million tons. However, soybean production in China is threatened by more than 30 kinds of disease and 100 kinds of pests. With specialized information in the literature increasing rapidly, it is difficult for farmers to keep up. Relation extraction automatically identifies and extracts structured knowledge from natural language text and can therefore help to alleviate the problem. In this study, we propose to use relation extraction to systematically extract information from expert-written text and to generate a knowledge graph from the extracted information. This case study was planned in China, so we mainly used Chinese texts. First, we carefully chose expert-written text on soybean pests and disease, labeled the entities, and classified their thematic relations into five categories. Then, we built and trained three relation extraction models using state-of-the-art deep learning architectures and evaluated their performance on our task. Finally, we constructed an example knowledge graph from the extracted information and demonstrated its potential use for automatic reasoning and solution recommendation in pest and disease prevention. In total, this study sampled 1038 entities and 1569 relation instances. Experimental results showed that our best model achieved an F1 score of 98.49% in identifying relations from text. Experimental results also showed the effectiveness of the example knowledge graph.

Keywords: Bidirectional encoder representation from transformers, Knowledge graph, Relation extraction, Soybean pests and disease.
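The final step described in the abstract, turning extracted relations into a knowledge graph that can be queried for recommendations, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not name the five relation categories, so the relation labels (`has_symptom`, `caused_by`, `damages`, `controlled_by`), the placeholder entity names, and the helper functions `build_graph`, `recommend`, and `find_heads` are hypothetical and are not the authors' implementation or data.

```python
from collections import defaultdict

# Hypothetical (head, relation, tail) triples, standing in for the output of a
# relation extraction model. Entity and relation names are placeholders only.
TRIPLES = [
    ("disease_A", "has_symptom", "symptom_1"),
    ("disease_A", "caused_by", "pathogen_P"),
    ("disease_A", "controlled_by", "treatment_X"),
    ("pest_B", "damages", "soybean_leaf"),
    ("pest_B", "controlled_by", "treatment_Y"),
    ("pest_B", "controlled_by", "treatment_X"),
]


def build_graph(triples):
    """Index triples so that graph[head][relation] is the set of tail entities."""
    graph = defaultdict(lambda: defaultdict(set))
    for head, relation, tail in triples:
        graph[head][relation].add(tail)
    return graph


def recommend(graph, entity, relation="controlled_by"):
    """Return the sorted tails linked to `entity` by `relation` (empty if none)."""
    # .get() is used so that a lookup never adds new keys to the graph.
    return sorted(graph.get(entity, {}).get(relation, ()))


def find_heads(graph, relation, tail):
    """Reverse lookup: sorted heads that are linked to `tail` by `relation`."""
    return sorted(
        head for head, relations in graph.items()
        if tail in relations.get(relation, ())
    )


graph = build_graph(TRIPLES)
print(recommend(graph, "pest_B"))                       # treatments linked to pest_B
print(find_heads(graph, "controlled_by", "treatment_X"))  # problems sharing a treatment
```

A nested dictionary keeps the sketch dependency-free; a graph database or a dedicated graph library would be a more likely choice for a published knowledge graph of the size reported here.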

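Source journal: Applied Engineering in Agriculture (Agricultural and Forestry Sciences, Agricultural Engineering)

CiteScore: 1.80

Self-citation rate: 11.10%

Articles published: not listed

Review time: 6 months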

Journal description: This peer-reviewed journal publishes applications of engineering and technology research that address agricultural, food, and biological systems problems. Submissions must include results of practical experiences, tests, or trials presented in a manner and style that will allow easy adaptation by others; results of reviews or studies of installations or applications with substantially new or significant information not readily available in other refereed publications; or a description of successful methods or techniques of education, outreach, or technology transfer.