A Semantic Feature Representation Method Based on Dynamic Selection of Sub-word-level and Word-level
Authors: XiaoDong Cai, ZhuCheng Gao, Shuting Zheng
Published in: 2021 2nd International Conference on Computer Communication and Network Security (CCNS), 2021-07-01
DOI: 10.1109/CCNS53852.2021.00017 (https://doi.org/10.1109/CCNS53852.2021.00017)
Citations: 0
Abstract
Low-frequency and out-of-vocabulary (unregistered) words have too few training samples for a model to learn effective word-level features, which makes the semantic representation of text inaccurate. To address this problem, this paper proposes a semantic feature representation method based on dynamic selection of sub-word-level and word-level features. First, building on the Skip-gram method, a sub-word feature representation is designed that uses a bidirectional Long Short-Term Memory network (Bi-LSTM) to capture potential long-distance dependencies. Second, to obtain an accurate semantic representation of each word, a new gated dynamic selection mechanism combines the sub-word-level and word-level feature vectors, enriching the information each word carries. Experiments on the STS and SICK datasets show that the method is effective: compared with word representation methods from related work, it significantly improves both the Pearson and Spearman correlation coefficients.
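The abstract does not give the form of the gate, so the following is only a minimal sketch of one common way such a gated combination could look, not the authors' implementation. It assumes a per-dimension sigmoid gate g = sigmoid(W_w w + W_s s + b) and a fused vector h = g * w + (1 - g) * s, where w is the word-level vector and s is the sub-word-level vector (which in the paper would come from the Bi-LSTM). The function name `gated_select` and the parameters `w_word`, `w_sub`, and `bias` are hypothetical names introduced for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gated_select(word_vec, subword_vec, w_word, w_sub, bias):
    """Blend a word-level and a sub-word-level vector with a learned gate.

    ASSUMPTION: the gate form below is illustrative; the abstract does not
    specify it. For each output dimension i:
        g_i = sigmoid(w_word[i] . word_vec + w_sub[i] . subword_vec + bias[i])
        h_i = g_i * word_vec[i] + (1 - g_i) * subword_vec[i]
    """
    dim = len(word_vec)
    if len(subword_vec) != dim:
        raise ValueError("word and sub-word vectors must have the same size")

    fused = []
    for i in range(dim):
        # Gate pre-activation depends on both input vectors.
        pre = bias[i]
        pre += sum(w_word[i][j] * word_vec[j] for j in range(dim))
        pre += sum(w_sub[i][j] * subword_vec[j] for j in range(dim))
        g = sigmoid(pre)
        # g near 1 keeps the word-level feature, g near 0 keeps the sub-word one.
        fused.append(g * word_vec[i] + (1.0 - g) * subword_vec[i])
    return fused


if __name__ == "__main__":
    dim = 3
    zero_weights = [[0.0] * dim for _ in range(dim)]
    word = [1.0, 2.0, 3.0]      # toy word-level embedding
    subword = [3.0, 0.0, -1.0]  # toy sub-word-level embedding
    # With zero weights and zero bias the gate is 0.5 in every dimension,
    # so the result is the element-wise average of the two vectors.
    print(gated_select(word, subword, zero_weights, zero_weights, [0.0] * dim))
```

In a trained model the gate parameters would be learned, so a rare word with a poorly trained word-level vector could be pushed toward its sub-word representation while a frequent word keeps mostly its word-level one.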