Network New Word Discovery Framework Based on Sentence Semantic Vector Similarity
Authors: GanFeng Yu, Yue Feng Ma, Yang Song
Published in: 2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI), 2022-10-01
DOI: 10.1109/ICTAI56018.2022.00052 (https://doi.org/10.1109/ICTAI56018.2022.00052)
New word discovery is a key problem in text information retrieval. Existing methods are typically word-centric: because their target is words, they obtain their findings by analyzing words directly. With the spread of social networks, individual netizens and online self-media generate a wide variety of network texts for the convenience of online life, including network new words that depart markedly from standard Chinese expression. Detecting these network new words is now one of the main goals of new word discovery. In this paper, we combine a word embedding model with clustering methods to propose a network new word discovery framework based on sentence semantic similarity (S3-N2WD) that detects network new words effectively in network text. The framework constructs sentence semantic vectors with a distributed representation model, uses the similarity of these vectors to determine the semantic relationship between sentences, and finally discovers network new words by means of semantic replacement between sentences. Experiments show that the framework not only discovers network new words rapidly but also identifies the standard word meaning of each discovered word, which demonstrates the effectiveness of our work.
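The pipeline described in the abstract (sentence vectors, vector similarity, then semantic replacement) can be illustrated with a small sketch. This is not the S3-N2WD implementation, whose details the abstract does not give. It assumes sentence vectors are the mean of word vectors, that similarity is cosine similarity, and that a token without an embedding is a candidate new word. The names `TOY_EMBEDDINGS`, `sentence_vector`, `cosine_similarity`, and `replacement_candidates`, the threshold of 0.9, and all vector values are hypothetical. The clustering step is omitted, and English tokens stand in for segmented Chinese text.

```python
import math

# Toy 3-dimensional word vectors; hypothetical values for illustration only.
TOY_EMBEDDINGS = {
    "i": [0.1, 0.3, 0.5],
    "really": [0.2, 0.1, 0.4],
    "like": [0.7, 0.2, 0.1],
    "this": [0.3, 0.3, 0.3],
    "movie": [0.1, 0.8, 0.2],
    "weather": [0.9, 0.1, 0.7],
    "is": [0.2, 0.2, 0.2],
    "cold": [0.8, 0.0, 0.9],
}


def sentence_vector(tokens, embeddings):
    """Average the vectors of known tokens (an assumed composition rule)."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def replacement_candidates(network_tokens, standard_tokens, embeddings,
                           threshold=0.9):
    """Pair out-of-vocabulary tokens with the standard words they may replace.

    If the two sentences are semantically close, each token of the network
    sentence that has no embedding is treated as a candidate new word, and the
    standard-sentence tokens missing from the network sentence are offered as
    its possible standard meaning.
    """
    vec_network = sentence_vector(network_tokens, embeddings)
    vec_standard = sentence_vector(standard_tokens, embeddings)
    if vec_network is None or vec_standard is None:
        return []
    if cosine_similarity(vec_network, vec_standard) < threshold:
        return []
    unknown = [t for t in network_tokens if t not in embeddings]
    replaced = [t for t in standard_tokens if t not in network_tokens]
    return [(token, replaced) for token in unknown]


if __name__ == "__main__":
    network = ["i", "really", "stan", "this", "movie"]   # "stan" is unknown
    standard = ["i", "really", "like", "this", "movie"]
    print(replacement_candidates(network, standard, TOY_EMBEDDINGS))
```

In this toy setting the unknown token "stan" is paired with "like" because the two sentences are otherwise nearly identical in vector space. A realistic pipeline would need a Chinese word segmenter and embeddings trained on a large corpus, neither of which is shown here.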