Exploring highly concise and accurate text matching model with tiny weights
Yangchun Li, Danfeng Yan, Wei Jiang, Yuanqiang Cai, Zhihong Tian
World Wide Web (journal article), published 2024-06-20
DOI: 10.1007/s11280-024-01262-7 (https://doi.org/10.1007/s11280-024-01262-7)
Citations: 0
Abstract
In this paper, we propose AL-RE2, a simple and general lightweight approach for text matching models, and evaluate it on three well-studied benchmark datasets covering natural language inference and paraphrase identification. First, we explore the feasibility of compressing word embedding vectors with principal component analysis and analyze how the information retained at different dimensionalities affects model accuracy. Balancing compression efficiency against information loss, we represent each word with 128 dimensions, which brings the model to 1.6M parameters. Finally, we analyze in detail the feasibility of replacing standard convolution with depthwise separable convolution in text matching. Experimental results show that our model's inference is at least 1.5 times faster and that it has 42.76% fewer parameters than similarly performing models, while its accuracy on the SciTail dataset is state-of-the-art among lightweight models.
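The two ideas named in the abstract, PCA compression of word embeddings and depthwise separable convolution, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the 300-dimensional random matrix stands in for pretrained embeddings, and the kernel size of 3, the channel width of 128 in the convolution, and the helper names `pca_compress`, `conv1d_params`, and `separable_conv1d_params` are all assumptions made for illustration. Only the 128-dimension target comes from the abstract.

```python
import numpy as np


def pca_compress(embeddings, k):
    """Project row-vector embeddings onto their top-k principal components.

    Returns the compressed matrix and the fraction of variance retained.
    """
    centered = embeddings - embeddings.mean(axis=0)
    # SVD of the centered matrix: the rows of vt are the principal directions,
    # ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]
    retained = float((s[:k] ** 2).sum() / (s ** 2).sum())
    return centered @ components.T, retained


def conv1d_params(c_in, c_out, kernel):
    """Parameter count of a standard 1D convolution with bias."""
    return c_in * c_out * kernel + c_out


def separable_conv1d_params(c_in, c_out, kernel):
    """Parameter count of a depthwise separable 1D convolution with biases."""
    depthwise = c_in * kernel + c_in   # one kernel per input channel
    pointwise = c_in * c_out + c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical stand-in for a pretrained embedding table (1000 words, 300-d).
rng = np.random.default_rng(0)
vocab = rng.normal(size=(1000, 300))

compressed, retained = pca_compress(vocab, 128)
print(compressed.shape)                     # (1000, 128)
print(f"variance retained: {retained:.3f}")

# Assumed layer shape: 128 input channels, 128 output channels, kernel size 3.
print(conv1d_params(128, 128, 3))            # 49280
print(separable_conv1d_params(128, 128, 3))  # 17024
```

With random Gaussian data the retained variance is low because no direction dominates; real pretrained embeddings typically concentrate more variance in the leading components, which is the property the dimension analysis in the paper examines. The parameter counts show why the separable form is attractive: the depthwise step scales with the kernel size and the pointwise step with the channel product, and the two costs add where a standard convolution multiplies them.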