Title: Improving conversational search with query reformulation using selective contextual history
Authors: Haya Al-Thani, Tamer Elsayed, Bernard J. Jansen
DOI: 10.1016/j.dim.2022.100025
Journal: Data and Information Management, 7(2), Article 100025
Publication date: 2023-06-01
URL: https://www.sciencedirect.com/science/article/pii/S2543925122001231
Citations: 2
Abstract
Automatically retrieving responses to questions for conversational agents, known as conversational passage retrieval, is challenging due to omissions and implied context in user queries. To help address this challenge, queries can be rewritten using pre-trained sequence-to-sequence models that draw on contextual clues from the conversation history to resolve ambiguities. In this research, we use the TREC Conversational Assistance Track (CAsT) 2020 dataset and select relevant single sentences from the conversation history for query reformulation, improving system effectiveness and efficiency by avoiding topic drift. We propose a practical query selection method that measures a clarity score to determine whether to use response sentences during reformulation. We further explore query reformulation as a binary term-classification problem and the effects of rank fusion across multiple retrieval models. T5 and BERT retrieval results are combined to better represent the user's information need. Using multi-model fusion, our best system outperforms the best CAsT 2020 run, with an NDCG@3 of 0.537. The implication is that a selective system that varies its use of responses depending on the query produces a more effective conversational reformulation pipeline. Combining different retrieval results also proved effective in improving system recall.
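The abstract mentions a clarity score used to decide whether conversation responses should feed into reformulation, but does not define it. A widely used formulation of query clarity (from the query-performance-prediction literature) is the KL divergence between a query language model and the collection language model; a minimal sketch under that assumption, with the `unigram_lm` helper being purely illustrative:

```python
import math
from collections import Counter


def unigram_lm(texts):
    """Maximum-likelihood unigram language model over a list of texts."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def clarity_score(query_lm, collection_lm, floor=1e-9):
    """KL divergence (in bits) between a query language model and the
    collection language model; higher values indicate a more focused,
    less ambiguous query."""
    score = 0.0
    for term, p_q in query_lm.items():
        p_c = collection_lm.get(term, floor)  # floor for unseen terms
        score += p_q * math.log2(p_q / p_c)
    return score
```

In a selective pipeline of the kind the abstract describes, a low clarity score for the raw query would trigger the inclusion of response sentences during reformulation, while a high score would leave the query as-is.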
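The abstract also reports fusing T5- and BERT-based retrieval results, without specifying the fusion method. Reciprocal rank fusion (RRF) is a standard, score-free way to combine ranked lists from heterogeneous retrieval models and can be sketched as follows (the constant k=60 is the conventional default, not a value taken from the paper):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids (each ordered best-first)
    by summing 1 / (k + rank) across lists; returns a fused ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF uses only rank positions, it sidesteps the problem that T5 and BERT rerankers produce scores on incompatible scales, and documents surfaced by either model survive into the fused list, which is consistent with the abstract's observation that combining retrieval results improved recall.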