Semantic Augmentation Transformer Model for Ad-hoc Retrieval
Chongyang Li
2022 IEEE 6th Information Technology and Mechatronics Engineering Conference (ITOEC), published 2022-03-04
DOI: 10.1109/ITOEC53115.2022.9734522
Citations: 0
Abstract
We propose SATM, a Semantic Augmentation Transformer Model for ad-hoc retrieval. SATM adopts contrastive learning, which improves semantic-similarity performance and the ranking results of ad-hoc retrieval, and also augments the semantic representation of sentence embeddings. Specifically, we first use an unsupervised contrastive learning augmentation module to learn query similarity, so that the projection head can accurately capture query semantics. We then use the trained encoder network to map queries and perform semantic-similarity calculation and ranking against document embeddings. We use BERT to extract the contextual representation of each sentence, and use the augmentation module to enhance the semantics and eliminate the anisotropy of the sentence embeddings. Experimental results show that on the TrecQA dataset, SATM achieves improvements of 4% in MRR and 1.4% in MAP over BERT-base.
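The pipeline the abstract describes has two stages: an unsupervised contrastive objective that pulls two augmented views of the same query together in embedding space (with other in-batch queries as negatives), then cosine-similarity ranking of document embeddings against the trained query encoding. The sketch below illustrates those two stages with NumPy; it is a minimal illustration under stated assumptions, not the paper's implementation — the real model uses a BERT encoder and a learned projection head, whereas here random vectors stand in for encoder outputs and the dropout-style noise is a hypothetical augmentation.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Scale rows (or a single vector) to unit L2 norm."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def info_nce_loss(view_a, view_b, temperature=0.05):
    """In-batch contrastive (InfoNCE) loss: row i of view_a should
    match row i of view_b; the other rows in the batch are negatives."""
    a = l2_normalize(view_a)
    b = l2_normalize(view_b)
    logits = a @ b.T / temperature  # (batch, batch) cosine similarities
    # Row-wise log-softmax; the diagonal entries are the positive pairs.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def rank_documents(query_vec, doc_vecs):
    """Rank documents by cosine similarity to a query embedding."""
    sims = l2_normalize(doc_vecs) @ l2_normalize(query_vec)
    return np.argsort(-sims)  # indices of documents, best match first

rng = np.random.default_rng(0)
queries = rng.normal(size=(8, 16))  # stand-in for encoder outputs
# Two noisy "views" of the same queries (e.g. dropout noise, SimCSE-style).
view_a = queries + 0.01 * rng.normal(size=queries.shape)
view_b = queries + 0.01 * rng.normal(size=queries.shape)
loss = info_nce_loss(view_a, view_b)

# Retrieval stage: rank candidate documents for one query.
query = rng.normal(size=16)
docs = rng.normal(size=(5, 16))
order = rank_documents(query, docs)
```

In training, the gradient of this loss would update the encoder and projection head so that matched views score higher than in-batch negatives; at retrieval time only the cosine-similarity ranking is needed.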