基于结构搜索和基于前缀的文本生成的低资源定量信息提取

Tongliang Li, Zixiang Wang, Zhoujun Li
{"title":"基于结构搜索和基于前缀的文本生成的低资源定量信息提取","authors":"Tongliang Li, Zixiang Wang, Zhoujun Li","doi":"10.1609/aaai.v37i11.26540","DOIUrl":null,"url":null,"abstract":"Quantitative information plays an important part in the financial and data analysis areas. Prior work relied on pattern-matching methods and complex hand-crafted rules to extract quantitative information due to the lack of labeled data. Such methods can be unstable and difficult to scale to the open domain. In this paper, we study quantitative information extraction in the low-resource setting. We propose a search-based approach by searching from the syntactic structures to acquire basic training data. The search process is simple yet effective. Then, a prefix-based text-to-text generation method is employed to extract the quantitative information. The prefix design can fully leverage pre-trained language models for text generation to serve the information extraction purpose. Experimental results show that our approaches achieves high performance with a limited amount of labeled data. The extraction result could further boost the performance of other tasks such as quantitative reasoning.","PeriodicalId":74506,"journal":{"name":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","volume":"6 1","pages":"13112-13120"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Low Resource Quantitative Information Extraction via Structure Searching and Prefix-Based Text Generation\",\"authors\":\"Tongliang Li, Zixiang Wang, Zhoujun Li\",\"doi\":\"10.1609/aaai.v37i11.26540\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Quantitative information plays an important part in the financial and data analysis areas. Prior work relied on pattern-matching methods and complex hand-crafted rules to extract quantitative information due to the lack of labeled data. Such methods can be unstable and difficult to scale to the open domain. In this paper, we study quantitative information extraction in the low-resource setting. We propose a search-based approach by searching from the syntactic structures to acquire basic training data. The search process is simple yet effective. Then, a prefix-based text-to-text generation method is employed to extract the quantitative information. The prefix design can fully leverage pre-trained language models for text generation to serve the information extraction purpose. Experimental results show that our approaches achieves high performance with a limited amount of labeled data. The extraction result could further boost the performance of other tasks such as quantitative reasoning.\",\"PeriodicalId\":74506,\"journal\":{\"name\":\"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence\",\"volume\":\"6 1\",\"pages\":\"13112-13120\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-06-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1609/aaai.v37i11.26540\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1609/aaai.v37i11.26540","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

定量信息在财务和数据分析领域发挥着重要作用。由于缺乏标记数据,先前的工作依赖于模式匹配方法和复杂的手工规则来提取定量信息。这种方法可能不稳定,并且难以扩展到开放域。本文研究了低资源环境下的定量信息提取问题。我们提出了一种基于搜索的方法,从语法结构中搜索来获取基本训练数据。搜索过程简单而有效。然后,采用基于前缀的文本到文本生成方法提取定量信息。前缀设计可以充分利用预先训练好的语言模型进行文本生成,达到信息提取的目的。实验结果表明,我们的方法在有限数量的标记数据下取得了很高的性能。提取结果可以进一步提高其他任务的性能,如定量推理。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Low Resource Quantitative Information Extraction via Structure Searching and Prefix-Based Text Generation
Quantitative information plays an important part in the financial and data analysis areas. Prior work relied on pattern-matching methods and complex hand-crafted rules to extract quantitative information due to the lack of labeled data. Such methods can be unstable and difficult to scale to the open domain. In this paper, we study quantitative information extraction in the low-resource setting. We propose a search-based approach by searching from the syntactic structures to acquire basic training data. The search process is simple yet effective. Then, a prefix-based text-to-text generation method is employed to extract the quantitative information. The prefix design can fully leverage pre-trained language models for text generation to serve the information extraction purpose. Experimental results show that our approaches achieves high performance with a limited amount of labeled data. The extraction result could further boost the performance of other tasks such as quantitative reasoning.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信