Authors: A. Ananth, Sachin S. Bhat, P. S. Venugopala
Published in: 2021 IEEE International Conference on Distributed Computing, VLSI, Electrical Circuits and Robotics (DISCOVER), 2021-11-19
DOI: https://doi.org/10.1109/DISCOVER52564.2021.9663430
Grammatical Tagging for the Kannada Text Documents using Hybrid Bidirectional Long-Short Term Memory Model
Kannada is one of the most widely spoken languages in India. Despite this large user base, Kannada, like other major Indian languages, has minimal linguistic resources for computing and processing. The rich morphology and agglutinative nature of the language pose a great challenge to even the most basic natural language processing applications, such as lemmatization, part-of-speech tagging, and summarization. In this paper, we discuss a deep-learning-based approach to grammatical tagging that uses hybrid models of bidirectional long short-term memory (BDLSTM) and linear-chain conditional random fields (CCRF). A database of Kannada documents with 15,500 manually tagged words is used for this task. The proposed hybrid model shows a promising result of 81.02%.
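
To illustrate how a linear-chain CRF layer can combine with per-word scores from a bidirectional LSTM, the sketch below runs Viterbi decoding, a common way to pick the best tag sequence in such hybrids. It is not taken from the paper: the function name `viterbi_decode`, the two-tag set, and all score values are hypothetical, and the emission scores are hard-coded stand-ins for what a trained BDLSTM would output.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[t][j]   : score of tag j at word t (assumed to come from
                        the bidirectional LSTM in a hybrid model).
    transitions[i][j] : score of moving from tag i to tag j (the
                        linear-chain CRF part).
    """
    if not emissions:
        return []
    n_tags = len(transitions)
    # best[j] = best score of any path that ends in tag j at the current word
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best = []
        pointers = []
        for j in range(n_tags):
            # Pick the previous tag that maximises the path score into tag j.
            prev = max(range(n_tags),
                       key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final tag to recover the full path.
    last = max(range(n_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy input: 3 words, 2 tags (0 = NOUN, 1 = VERB).
TAGS = ["NOUN", "VERB"]
emissions = [[2.0, 0.5], [1.0, 1.2], [0.3, 2.0]]
transitions = [[0.5, 1.0], [1.0, -1.0]]

path = viterbi_decode(emissions, transitions)
print([TAGS[i] for i in path])
```

The transition table is what lets the CRF part override a locally attractive tag: here the second word slightly prefers VERB on its emission score alone, but the decoded sequence depends on the whole path score.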