A Naïve Bayes Approach for working on Gurmukhi Word Sense Disambiguation

Himdweep Walia, A. Rana, Vineet Kansal
Published in: 2017 6th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO)
Publication date: 2017-09-01
DOI: 10.1109/ICRITO.2017.8342465 (https://doi.org/10.1109/ICRITO.2017.8342465)
Citations: 32

Abstract

Natural Language Processing (NLP) is a set of techniques that allows communication between humans and machines. A major problem in NLP has been Word Sense Disambiguation (WSD). WSD is the process of identifying the correct sense of a given word in context from among the multiple meanings that the word may have. A lot of work is going on in this field, especially for English and other European languages. In recent years, significant work has also been done for Indian regional languages. Punjabi is an Indian regional language and Gurmukhi is its script. WSD uses three approaches: knowledge-based, corpus-based, and hybrid. The corpus-based approach can be further divided into supervised and unsupervised approaches. Of the many algorithms implemented under the supervised approach, the Naïve Bayes approach has shown higher accuracy in WSD. For this paper we have used the Punjabi Corpora (obtained from the Evaluations and Language Resources Distribution Agency, Paris, France), which has been sense-tagged for 100 words.
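To make the supervised Naïve Bayes idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a bag-of-words context model with add-one (Laplace) smoothing, and it uses a tiny invented English data set for the ambiguous word "bank" in place of the sense-tagged Punjabi corpus. The names `train`, `disambiguate`, and `TOY_DATA` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Collect counts from (context_words, sense) pairs for one ambiguous word."""
    sense_counts = Counter()            # how often each sense was tagged
    word_counts = defaultdict(Counter)  # context-word counts per sense
    vocab = set()                       # all context words seen in training
    for words, sense in examples:
        sense_counts[sense] += 1
        for w in words:
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab


def disambiguate(context, model):
    """Return the sense maximising log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, n in sense_counts.items():
        score = math.log(n / total)  # log prior of the sense
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context:
            if w in vocab:  # words never seen in training are skipped
                # add-one smoothing keeps unseen (word, sense) pairs non-zero
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Invented toy data (assumption): NOT taken from the Punjabi corpus in the paper.
TOY_DATA = [
    (["river", "water", "shore"], "bank/river"),
    (["fishing", "river", "mud"], "bank/river"),
    (["money", "loan", "account"], "bank/finance"),
    (["deposit", "money", "cash"], "bank/finance"),
]

model = train(TOY_DATA)
print(disambiguate(["loan", "money"], model))
print(disambiguate(["river", "water"], model))
```

With a real sense-tagged corpus, each training pair would hold the words surrounding one occurrence of the ambiguous word together with its annotated sense. The choice of context window and features is a design decision that this sketch leaves open.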