{"title":"Multi-head attention based candidate segment selection in QA over hybrid data","authors":"Qian Chen, Xiaoying Gao, Xin Guo, Suge Wang","doi":"10.3233/ida-227032","DOIUrl":null,"url":null,"abstract":"Question Answering based on Tabular and Textual data is a novel task proposed in recent years in the field of QA. At present, most QA systems return answers from a single data form, such as knowledge graphs, tables, texts. However, hybrid data including structured and unstructured data is quite pervasive in real life instead of a single form. Recent research on TAT-QA mainly suffers from the higher error of extracting supporting evidences from both tabular and textual content. This paper aimed to address the problem of failure evidence extraction from more complex and realistic hybrid data. We first proposed two types of metrics to evaluate the performance of evidence extraction on hybrid data, i.e. wrong evidence ratio (WER) and missing evidence ratio (MER). Then we utilize a candidate extractor to obtain supporting evidence related to the question. Third, an origin selector is designed to determine from where the question’s answer comes. Finally, the loss of origin selector is fused to the final loss function, which can improve the evidence extraction performance. Experimental results on the TAT-QA dataset showed that our proposed model outperforms the best baseline in terms of F1, WER and MER, which proves the effectiveness of our model.","PeriodicalId":50355,"journal":{"name":"Intelligent Data Analysis","volume":"42 1","pages":""},"PeriodicalIF":0.9000,"publicationDate":"2023-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligent Data Analysis","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.3233/ida-227032","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Question Answering based on Tabular and Textual data is a novel task proposed in recent years in the field of QA. At present, most QA systems return answers from a single data form, such as knowledge graphs, tables, texts. However, hybrid data including structured and unstructured data is quite pervasive in real life instead of a single form. Recent research on TAT-QA mainly suffers from the higher error of extracting supporting evidences from both tabular and textual content. This paper aimed to address the problem of failure evidence extraction from more complex and realistic hybrid data. We first proposed two types of metrics to evaluate the performance of evidence extraction on hybrid data, i.e. wrong evidence ratio (WER) and missing evidence ratio (MER). Then we utilize a candidate extractor to obtain supporting evidence related to the question. Third, an origin selector is designed to determine from where the question’s answer comes. Finally, the loss of origin selector is fused to the final loss function, which can improve the evidence extraction performance. Experimental results on the TAT-QA dataset showed that our proposed model outperforms the best baseline in terms of F1, WER and MER, which proves the effectiveness of our model.
期刊介绍:
Intelligent Data Analysis provides a forum for the examination of issues related to the research and applications of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization, data pre-processing (fusion, editing, transformation, filtering, sampling), data engineering, database mining techniques, tools and applications, use of domain knowledge in data analysis, big data applications, evolutionary algorithms, machine learning, neural nets, fuzzy logic, statistical pattern recognition, knowledge filtering, and post-processing. In particular, papers are preferred that discuss development of new AI related data analysis architectures, methodologies, and techniques and their applications to various domains.