Authors: Warto, Muljono, Purwanto, E. Noersasongko
Published in: 2022 IEEE International Conference on Cybernetics and Computational Intelligence (CyberneticsCom), 2022-06-16
DOI: https://doi.org/10.1109/CyberneticsCom55287.2022.9865660
Capitalization Feature and Learning Rate for Improving NER Based on RNN BiLSTM-CRF
Entity extraction remains a widely researched topic in natural language processing. Its output can serve as a data source for later NLP stages such as text summarization, sentiment analysis, chatbots, machine translation, information retrieval, opinion mining, and speech recognition. Named Entity Recognition (NER) is the task of detecting named entities in a corpus. Entity detection can draw on various features, one of which is capitalization. A capital letter at the beginning of a word often indicates the name of a person, place, organization, geolocation, etc. The experiment uses a deep learning approach based on a Recurrent Neural Network with Bidirectional Long Short-Term Memory and a Conditional Random Field (RNN-BiLSTM-CRF). We compare three optimization algorithms, Stochastic Gradient Descent (SGD), Adaptive Moment Estimation (Adam), and Adadelta, on the CoNLL2003 dataset. With capital letter features, the F1-score was 2.9 higher than in tests that did not use them. The highest F1-score was 92.82, obtained with the Adam algorithm at a learning rate of 0.001.
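As an illustration of what a capitalization feature can look like, the following minimal Python sketch maps each token to a small integer class. The abstract does not describe the paper's actual feature scheme, so the class names, the ids in `CAP_CLASSES`, and the helper functions `cap_feature` and `sentence_cap_features` are all hypothetical.

```python
from typing import List

# Hypothetical class ids; the abstract does not specify the paper's scheme.
CAP_CLASSES = {
    "lower": 0,          # e.g. "rejects"
    "initial_upper": 1,  # e.g. "German"
    "all_upper": 2,      # e.g. "EU"
    "mixed": 3,          # e.g. "iPhone", "McDonald"
    "no_letters": 4,     # e.g. ".", "2003"
}


def cap_feature(token: str) -> int:
    """Return an assumed capitalization class id for a single token."""
    letters = [ch for ch in token if ch.isalpha()]
    if not letters:
        return CAP_CLASSES["no_letters"]
    if all(ch.isupper() for ch in letters):
        return CAP_CLASSES["all_upper"]
    if all(ch.islower() for ch in letters):
        return CAP_CLASSES["lower"]
    # First letter upper-case, every remaining letter lower-case.
    if letters[0].isupper() and all(ch.islower() for ch in letters[1:]):
        return CAP_CLASSES["initial_upper"]
    return CAP_CLASSES["mixed"]


def sentence_cap_features(tokens: List[str]) -> List[int]:
    """Compute one capitalization class id per token in a sentence."""
    return [cap_feature(t) for t in tokens]


if __name__ == "__main__":
    sentence = ["EU", "rejects", "German", "call", "to", "boycott",
                "British", "lamb", "."]
    print(sentence_cap_features(sentence))
```

In a BiLSTM-CRF tagger, ids like these are commonly embedded and concatenated with the word embeddings before the recurrent layer; whether the paper does exactly this is not stated in the abstract.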