{"title":"Character-based feature extraction with LSTM networks for POS-tagging task","authors":"Aibek Makazhanov, Zhandos Yessenbayev","doi":"10.1109/ICAICT.2016.7991654","DOIUrl":null,"url":null,"abstract":"In this paper we describe a work in progress on designing the continuous vector space word representations able to map unseen data adequately. We propose a LSTM-based feature extraction layer that reads in a sequence of characters corresponding to a word and outputs a single fixed-length real-valued vector. We then test our model on a POS tagging task on four typologically different languages. The results of the experiments suggest that the model can offer a solution to the out-of-vocabulary words problem, as in a comparable setting its OOV accuracy improves over that of a state of the art tagger.","PeriodicalId":446472,"journal":{"name":"2016 IEEE 10th International Conference on Application of Information and Communication Technologies (AICT)","volume":"106 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE 10th International Conference on Application of Information and Communication Technologies (AICT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICAICT.2016.7991654","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7
Abstract
In this paper we describe a work in progress on designing the continuous vector space word representations able to map unseen data adequately. We propose a LSTM-based feature extraction layer that reads in a sequence of characters corresponding to a word and outputs a single fixed-length real-valued vector. We then test our model on a POS tagging task on four typologically different languages. The results of the experiments suggest that the model can offer a solution to the out-of-vocabulary words problem, as in a comparable setting its OOV accuracy improves over that of a state of the art tagger.