{"title":"Improving Dense FAQ Retrieval with Synthetic Training","authors":"Lu Liu, Qifei Wu, Guang Chen","doi":"10.1109/IC-NIDC54101.2021.9660603","DOIUrl":null,"url":null,"abstract":"Frequently Asked Question (FAQ) retrieval is a valuable task which aims to find the most relevant question-answer pair from a FAQ dataset given a user query. Currently, most works implement FAQ retrieval considering the similarity between the query and the question as well as the relevance between the query and the answer. However, the query-answer relevance is difficult to model effectively due to the heterogeneity of query-answer pairs in terms of syntax and semantics. To alleviate this issue and improve retrieval performance, we propose a novel approach to consider answer information into FAQ retrieval by question generation, which provides high-quality synthetic positive training examples for dense retriever. Experiment results indicate that our method outperforms term-based BM25 and pretrained dense retriever significantly on two recently published COVID-19 FAQ datasets.","PeriodicalId":264468,"journal":{"name":"2021 7th IEEE International Conference on Network Intelligence and Digital Content (IC-NIDC)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 7th IEEE International Conference on Network Intelligence and Digital Content (IC-NIDC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IC-NIDC54101.2021.9660603","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3
Abstract
Frequently Asked Question (FAQ) retrieval is a valuable task that aims to find the most relevant question-answer pair in an FAQ dataset for a given user query. Most current work implements FAQ retrieval by considering both the similarity between the query and the question and the relevance between the query and the answer. However, query-answer relevance is difficult to model effectively because query-answer pairs are heterogeneous in both syntax and semantics. To alleviate this issue and improve retrieval performance, we propose a novel approach that incorporates answer information into FAQ retrieval through question generation, which provides high-quality synthetic positive training examples for a dense retriever. Experimental results indicate that our method significantly outperforms both term-based BM25 and a pretrained dense retriever on two recently published COVID-19 FAQ datasets.
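As a rough illustration of the idea in the abstract, the sketch below shows one way questions generated from FAQ answers could be turned into synthetic positive training pairs for a retriever. It is not the authors' implementation: `FAQPair`, `TrainingExample`, `build_synthetic_positives`, and `toy_generator` are hypothetical names, the generator is a placeholder for a trained question-generation model, and the choice to label each synthetic query with the index of its source FAQ entry is an assumption.

```python
from typing import Callable, List, NamedTuple


class FAQPair(NamedTuple):
    question: str
    answer: str


class TrainingExample(NamedTuple):
    query: str        # synthetic query generated from an FAQ answer
    positive_id: int  # index of the FAQ entry the query should retrieve


def build_synthetic_positives(
    faqs: List[FAQPair],
    generate_questions: Callable[[str], List[str]],
) -> List[TrainingExample]:
    """Pair each generated question with the FAQ entry whose answer produced it."""
    examples: List[TrainingExample] = []
    for idx, faq in enumerate(faqs):
        for synthetic in generate_questions(faq.answer):
            synthetic = synthetic.strip()
            # Skip empty generations and copies of the original question,
            # since they add no new training signal.
            if synthetic and synthetic.lower() != faq.question.lower():
                examples.append(TrainingExample(synthetic, idx))
    return examples


def toy_generator(answer: str) -> List[str]:
    """Placeholder for a question-generation model (assumption, not from the paper)."""
    first_sentence = answer.split(".")[0].strip()
    return [f"What should I know about this: {first_sentence}?"]
```

The resulting pairs could then be fed to whatever contrastive training setup the dense retriever uses; that training step is outside the scope of this sketch.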