Ensemble learning with soft-prompted pretrained language models for fact checking
Shaoqin Huang, Yue Wang, Eugene Y.C. Wong, Lei Yu
Natural Language Processing Journal, Volume 7, Article 100067. Published 2024-03-21.
DOI: 10.1016/j.nlp.2024.100067
https://www.sciencedirect.com/science/article/pii/S2949719124000153
Citations: 0
Abstract
Infectious disease outbreaks, such as the COVID-19 pandemic, have led to a surge of information on the internet, including misinformation, which makes fact-checking tools necessary. However, fact-checking claims related to infectious diseases poses challenges because informal claims must be matched against formal evidence and a single claim can contain multiple aspects. To address these issues, we propose a soft prompt-based ensemble learning framework for COVID-19 fact checking. To understand complex assertions in informal social media texts, we explore various soft prompt structures that take advantage of the T5 language model, and ensemble these prompt structures together. Soft prompts offer more flexibility and better generalization than hard prompts. The ensemble model captures linguistic cues and contextual information in COVID-19-related data, and thus enhances generalization to new claims. Experimental results demonstrate that prompt-based ensemble learning improves fact-checking accuracy and provides a promising approach to combating misinformation during the pandemic. In addition, the method shows strong zero-shot learning capability and thus can be applied to a variety of fact-checking problems.
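The abstract does not say how the outputs of the differently prompted models are combined. As a minimal sketch, assuming soft voting (averaging each member's label probabilities and taking the top label), the ensembling step could look like the following. The label set, function names, and probability values are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical label set; the abstract does not name the labels used.
LABELS = ["SUPPORTED", "REFUTED", "NOT_ENOUGH_INFO"]


def soft_vote(member_probs: List[Dict[str, float]]) -> Dict[str, float]:
    """Average the label distributions produced by the ensemble members."""
    if not member_probs:
        raise ValueError("need at least one ensemble member")
    n = len(member_probs)
    return {
        label: sum(p.get(label, 0.0) for p in member_probs) / n
        for label in LABELS
    }


def predict(member_probs: List[Dict[str, float]]) -> str:
    """Return the label with the highest averaged probability."""
    averaged = soft_vote(member_probs)
    return max(LABELS, key=lambda label: averaged[label])


# Made-up outputs standing in for three models, each assumed to be tuned
# with a different soft prompt structure on the same claim/evidence pair.
example = [
    {"SUPPORTED": 0.6, "REFUTED": 0.3, "NOT_ENOUGH_INFO": 0.1},
    {"SUPPORTED": 0.2, "REFUTED": 0.7, "NOT_ENOUGH_INFO": 0.1},
    {"SUPPORTED": 0.3, "REFUTED": 0.6, "NOT_ENOUGH_INFO": 0.1},
]
print(predict(example))
```

In this toy input the first member disagrees with the other two, and averaging lets the majority view prevail while still weighing each member's confidence. Other combination rules, such as majority voting or learned weights, would fit the same interface.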