Network New Word Discovery Framework Based on Sentence Semantic Vector Similarity

GanFeng Yu, Yue Feng Ma, Yang Song
Published in: 2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI)
Publication date: 2022-10-01
DOI: 10.1109/ICTAI56018.2022.00052 (https://doi.org/10.1109/ICTAI56018.2022.00052)
Citations: 0

Abstract

New word discovery is a key problem in text information retrieval. Existing methods are usually word-centric: because their target is words, they obtain their findings by analyzing words directly. With the popularity of social networks, individual netizens and online self-media generate a wide variety of network texts for the convenience of online life, including network new words that are far from standard Chinese expression. How to detect these network new words is now one of the important goals of new word discovery. In this paper, we integrate a word embedding model with clustering methods and propose a network new word discovery framework based on sentence semantic similarity (S3-N2WD) to detect network new words effectively in network texts. The framework constructs sentence semantic vectors with a distributed representation model, uses the similarity of those vectors to determine the semantic relationship between sentences, and finally discovers network new words by means of semantic replacement between sentences. Experiments verify that the framework not only discovers network new words rapidly but also identifies the standard meaning of each discovered word, which demonstrates the effectiveness of our work.
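
The abstract gives only a high-level description of the pipeline, so the following is a minimal sketch of the general idea and not the authors' S3-N2WD implementation. It assumes that a sentence vector is the average of its known word vectors, that similarity is cosine similarity, and that "semantic replacement" means two highly similar sentences differing in exactly one token; the clustering step mentioned in the abstract is omitted. All function names, the threshold, the toy vectors, and the example sentences are hypothetical.

```python
import math
from itertools import combinations


def sentence_vector(tokens, word_vectors):
    """Average the vectors of tokens that have one (assumed composition rule)."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def single_substitution(a, b):
    """Return the differing token pair if a and b differ at exactly one position."""
    if len(a) != len(b):
        return None
    diffs = [(x, y) for x, y in zip(a, b) if x != y]
    return diffs[0] if len(diffs) == 1 else None


def discover_new_words(sentences, word_vectors, vocabulary, threshold=0.8):
    """Map each out-of-vocabulary token to the standard word it appears to replace."""
    found = {}
    vectors = [sentence_vector(s, word_vectors) for s in sentences]
    for (i, a), (j, b) in combinations(enumerate(sentences), 2):
        if vectors[i] is None or vectors[j] is None:
            continue
        if cosine(vectors[i], vectors[j]) < threshold:
            continue  # sentences are not close enough in meaning
        pair = single_substitution(a, b)
        if pair is None:
            continue
        x, y = pair
        if x not in vocabulary and y in vocabulary:
            found[x] = y
        elif y not in vocabulary and x in vocabulary:
            found[y] = x
    return found


# Toy data: the vector values are made up purely for illustration.
TOY_VECTORS = {
    "this": [0.1, 0.9, 0.0],
    "movie": [0.8, 0.2, 0.1],
    "is": [0.1, 0.8, 0.1],
    "great": [0.2, 0.1, 0.9],
    "the": [0.1, 0.9, 0.1],
    "weather": [0.9, 0.0, 0.3],
    "bad": [-0.2, 0.1, -0.9],
}
VOCAB = set(TOY_VECTORS)
SENTENCES = [
    ["this", "movie", "is", "great"],
    ["this", "movie", "is", "yyds"],  # "yyds" stands in for an unknown network word
    ["the", "weather", "is", "bad"],
]

print(discover_new_words(SENTENCES, TOY_VECTORS, VOCAB))  # {'yyds': 'great'}
```

In this sketch the first two sentences are nearly identical in vector space and differ in a single token, so the unknown token is paired with the standard word in the same slot. A real system would need tokenization suited to Chinese network text and embeddings trained on a large corpus, neither of which is shown here.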