A Composable Generative Framework Based on Prompt Learning for Various Information Extraction Tasks
Zhigang Kan; Linhui Feng; Zhangyue Yin; Linbo Qiao; Xipeng Qiu; Dongsheng Li
IEEE Transactions on Big Data, vol. 9, no. 4, pp. 1238-1251. Published 2023-03-22. DOI: 10.1109/TBDATA.2023.3278977
https://ieeexplore.ieee.org/document/10130644/
Journal impact factor: 7.5; JCR: Q1 (Computer Science, Information Systems)
Citations: 0
Abstract
Prompt learning is an effective paradigm that bridges the gap between pre-training tasks and the corresponding downstream applications. Approaches based on this paradigm have achieved strong results in various applications. However, how to design a general-purpose framework based on the prompt learning paradigm for various information extraction tasks remains an open question. In this article, we propose a novel composable prompt-based generative framework that can be applied to a wide range of information extraction tasks. Specifically, we reformulate information extraction tasks as filling slots in pre-designed type-specific prompts, each of which consists of one or more sub-prompts. A strategy for constructing composable prompts is proposed to enhance generalization in data-scarce scenarios. Furthermore, to fit this framework, we transform relation extraction into the task of determining semantic consistency in prompts. The experimental results demonstrate that our approach outperforms the compared baselines on real-world datasets in both data-abundant and data-scarce scenarios. We also present further analysis of the proposed framework, along with numerical experiments that investigate the factors affecting performance on various tasks.
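The abstract does not give the prompt wording or the model interface, so the following is only a minimal sketch of the slot-filling idea under stated assumptions. The sub-prompt templates, the `<slot>` marker, the event types, and the names `SubPrompt`, `compose_prompt`, and `parse_filled` are all hypothetical and not taken from the paper. The sketch shows how a type-specific prompt could be assembled from reusable sub-prompts, and how slot values could be read back from a filled prompt. In practice a generative model would produce the filled prompt; here a hand-written string stands in for it.

```python
import re
from dataclasses import dataclass

MASK = "<slot>"  # hypothetical slot marker, not the paper's notation


@dataclass(frozen=True)
class SubPrompt:
    role: str      # slot name; must be a valid Python identifier
    template: str  # text containing exactly one "{slot}" placeholder


# Hypothetical library of reusable sub-prompts.
SUB_PROMPTS = {
    "attacker": SubPrompt("attacker", "The attacker is {slot}."),
    "target": SubPrompt("target", "The target is {slot}."),
    "participant": SubPrompt("participant", "The participant is {slot}."),
    "place": SubPrompt("place", "The event took place in {slot}."),
}

# Type-specific prompts are compositions of sub-prompts; "place" is shared,
# which is the kind of reuse that could help when a type has little data.
EVENT_TYPES = {
    "Attack": [SUB_PROMPTS["attacker"], SUB_PROMPTS["target"], SUB_PROMPTS["place"]],
    "Meet": [SUB_PROMPTS["participant"], SUB_PROMPTS["place"]],
}


def compose_prompt(sub_prompts):
    """Join sub-prompts into one prompt with every slot left empty."""
    return " ".join(sp.template.format(slot=MASK) for sp in sub_prompts)


def parse_filled(sub_prompts, filled):
    """Recover {role: value} from a filled prompt, or None if it does not match."""
    parts = []
    for sp in sub_prompts:
        before, after = sp.template.split("{slot}")
        parts.append(re.escape(before) + f"(?P<{sp.role}>.+?)" + re.escape(after))
    match = re.fullmatch(r"\s+".join(parts), filled.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(compose_prompt(EVENT_TYPES["Attack"]))
    # Stand-in for a model's generated output.
    generated = "The attacker is rebels. The target is a convoy. The event took place in Kabul."
    print(parse_filled(EVENT_TYPES["Attack"], generated))
```

Keeping each sub-prompt as a separate object means a new type can be described by listing existing sub-prompts rather than writing a new template from scratch, which is one plausible reading of "composable" in the abstract.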
About the journal:
The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.