A Naïve Bayes Approach for working on Gurmukhi Word Sense Disambiguation
Authors: Himdweep Walia, A. Rana, Vineet Kansal
Published in: 2017 6th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), 2017-09-01
DOI: https://doi.org/10.1109/ICRITO.2017.8342465
Natural Language Processing (NLP) is a set of techniques that allow communication between human and machine. A major problem in NLP has been Word Sense Disambiguation (WSD): the process of identifying which of a word's multiple meanings is intended in a given usage. A great deal of work is under way in this field, especially for English and other European languages, and in recent years significant work has also been done for Indian regional languages. Punjabi is an Indian regional language, and Gurmukhi is its script. WSD is addressed through three approaches: knowledge-based, corpus-based and hybrid. The corpus-based approach can be further divided into supervised and unsupervised approaches. Of the many algorithms implemented under the supervised approach, the Naïve Bayes approach has shown higher accuracy in WSD. For this paper we have used the Punjabi Corpora (obtained from the Evaluations and Language Resources Distribution Agency, Paris, France), which has been sense-tagged for 100 words.
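As an illustration of the supervised Naïve Bayes approach mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes a bag-of-words context model with add-one (Laplace) smoothing, neither of which is stated in the abstract. The function names `train` and `disambiguate` are hypothetical, and the English toy examples for the word "bank" merely stand in for sense-tagged Gurmukhi contexts.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count senses and context words from (context_words, sense) pairs
    collected for a single ambiguous target word."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, sense in examples:
        sense_counts[sense] += 1
        for w in words:
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab

def disambiguate(model, context):
    """Return the sense maximising log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, n in sense_counts.items():
        score = math.log(n / total)  # log prior of the sense
        # Add-one smoothing (an assumption of this sketch)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Hypothetical toy data standing in for a sense-tagged corpus.
examples = [
    (["money", "deposit", "loan"], "finance"),
    (["account", "money", "cash"], "finance"),
    (["river", "water", "shore"], "river"),
    (["fishing", "river", "mud"], "river"),
]
model = train(examples)
predicted = disambiguate(model, ["water", "river"])
```

Working in log space avoids numerical underflow when many context words are multiplied together, and the smoothing keeps a single unseen word from driving a sense's probability to zero.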