A Deep Learning-based Recognition Model for Chinese Book Subject Words
Authors: Li Lin, Xiaoxi Guo
Published in: Proceedings of the 2022 6th International Conference on Electronic Information Technology and Computer Engineering
Publication date: 2022-10-21
DOI: 10.1145/3573428.3573734 (https://doi.org/10.1145/3573428.3573734)
To identify subject words in Chinese books and documents effectively, this paper proposes a deep learning model for automatic subject word recognition. The model first builds word vectors with Word2vec to capture semantic-level features of the vocabulary. It then trains a deep neural network (DNN) on these features to predict the probability that a keyword is a subject word, which frames recognition as a binary classification task. Experiments on a library bibliographic data set show that the proposed TopicDNN model reaches a prediction accuracy of 85.32%, outperforming traditional machine learning methods on subject word recognition.
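The two-stage pipeline described in the abstract (word vectors in, subject word probability out) can be illustrated with a minimal sketch. The abstract does not give TopicDNN's architecture, vector dimensions, or hyperparameters, so everything below is an assumption made for illustration: the 4-dimensional vectors are hand-made stand-ins for Word2vec output, the English words and their labels are toy data, and `TinyDNN`, `train`, and `is_subject_word` are hypothetical names. The network is a single hidden ReLU layer with a sigmoid output, trained with plain stochastic gradient descent on binary cross-entropy.

```python
import math

# Hypothetical 4-dimensional vectors standing in for Word2vec output.
# Real vectors would be learned from a corpus; these values are made up.
EMBEDDINGS = {
    "neural": [0.9, 0.8, 0.1, 0.0],
    "network": [0.8, 0.9, 0.0, 0.1],
    "preface": [0.1, 0.0, 0.9, 0.8],
    "edition": [0.0, 0.1, 0.8, 0.9],
}
# Toy labels: 1 = subject word, 0 = not a subject word (illustrative only).
LABELS = {"neural": 1, "network": 1, "preface": 0, "edition": 0}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyDNN:
    """One hidden ReLU layer (2 units) followed by a sigmoid output unit."""

    def __init__(self):
        # Small fixed initial weights so the toy run is reproducible.
        self.w1 = [[0.1, 0.1, 0.0, 0.0], [0.0, 0.0, 0.1, 0.1]]
        self.b1 = [0.0, 0.0]
        self.w2 = [0.1, -0.1]
        self.b2 = 0.0

    def forward(self, x):
        """Return the hidden activations and the subject word probability."""
        hidden = [
            max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(self.w1, self.b1)
        ]
        z = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return hidden, sigmoid(z)

    def train_step(self, x, y, lr):
        """One SGD update on binary cross-entropy for a single example."""
        hidden, prob = self.forward(x)
        # Gradient of the loss with respect to the output pre-activation.
        delta_out = prob - y
        for j, h in enumerate(hidden):
            # Backpropagate through ReLU: gradient flows only where h > 0.
            delta_h = delta_out * self.w2[j] if h > 0 else 0.0
            self.w2[j] -= lr * delta_out * h
            self.b1[j] -= lr * delta_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * delta_h * xi
        self.b2 -= lr * delta_out


def train(embeddings, labels, epochs=300, lr=0.2):
    """Fit a TinyDNN on (word vector, label) pairs."""
    model = TinyDNN()
    for _ in range(epochs):
        for word, vector in embeddings.items():
            model.train_step(vector, labels[word], lr)
    return model


def is_subject_word(model, vector, threshold=0.5):
    """Binary decision: threshold the predicted probability."""
    _, prob = model.forward(vector)
    return prob >= threshold


if __name__ == "__main__":
    model = train(EMBEDDINGS, LABELS)
    for word, vector in EMBEDDINGS.items():
        _, prob = model.forward(vector)
        print(word, round(prob, 3), is_subject_word(model, vector))
```

In a realistic setting the vectors would come from a Word2vec model trained on segmented Chinese text, and the classifier would be evaluated on held-out bibliographic records rather than on its own training words.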