eMoCo: Sentence Representation Learning With Enhanced Momentum Contrast
Shibo Qi, Rize Jin, Joon-Young Paik
Proceedings of the 5th International Conference on Computer Science and Software Engineering, 2022-10-21
DOI: 10.1145/3569966.3570013
Sentence representation learning transforms sentences into fixed-length vectors and provides the foundation for downstream tasks such as information retrieval and semantic similarity analysis. With the rise of contrastive learning, sentence representation learning has advanced further. Meanwhile, momentum-based contrastive learning has achieved great success in computer vision, as it decouples the number of negative samples from the batch size. However, its expected performance has not been observed in natural language processing tasks, for two reasons: the available combinations of data augmentation strategies for text are weak, and it uses only the samples in the momentum queue as negatives, ignoring those generated in the current batch. In this paper, we propose eMoCo (enhanced Momentum Contrast) to address these issues. We formulate a set of data augmentation strategies for text and present a novel Dual-Negative loss that makes full use of all negative samples. Extensive experiments on STS (Semantic Textual Similarity) datasets show that our method outperforms current state-of-the-art models, indicating its advantages in sentence representation learning.
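The idea of combining both negative sources can be illustrated with a minimal sketch: an InfoNCE-style objective whose denominator includes the momentum-queue negatives as well as the other samples in the current batch. This is an assumption-based illustration of a "dual-negative" contrastive loss, not the paper's exact formulation (the function name, temperature value, and equal weighting of the two negative sources are illustrative choices):

```python
import numpy as np

def dual_negative_loss(q, k, queue, tau=0.05):
    """InfoNCE-style loss with two negative sources: a momentum queue
    and the other positives in the current batch (hypothetical sketch).

    q     : (n, d) query embeddings from the online encoder
    k     : (n, d) positive key embeddings from the momentum encoder
    queue : (K, d) momentum-queue embeddings serving as negatives
    """
    # L2-normalize so dot products are cosine similarities
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    k = k / np.linalg.norm(k, axis=1, keepdims=True)
    queue = queue / np.linalg.norm(queue, axis=1, keepdims=True)

    n = q.shape[0]
    pos = np.sum(q * k, axis=1, keepdims=True)      # (n, 1) positive pair scores
    neg_queue = q @ queue.T                         # (n, K) queue negatives
    neg_batch = q @ k.T                             # (n, n) in-batch similarities
    # drop the diagonal: each sample's own positive is not a negative
    mask = ~np.eye(n, dtype=bool)
    neg_batch = neg_batch[mask].reshape(n, n - 1)   # (n, n-1) in-batch negatives

    # positive logit at index 0, all negatives after it
    logits = np.concatenate([pos, neg_queue, neg_batch], axis=1) / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return -log_prob.mean()
```

Using only `neg_queue` in the denominator recovers the standard momentum-contrast setup the abstract critiques; appending `neg_batch` is what lets the loss exploit the negatives generated within the current batch as well.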