Classifying Urdu Verbs Using Rule Based Approach

Lahore Garrison University Research Journal of Computer Science and Information Technology Pub Date : 2021-02-09 DOI:10.54692/lgurjcsit.2021.0501178

Muhammad Waseem

Abstract

Equipping dictionaries with morphological information is an established approach in linguistics for making them complete while keeping their size restricted. The module that holds this information is usually called a morphological analyzer or morphological classifier. It normally contains the complete possible linguistic information about each word of a particular language, and it describes the rules for deriving forms from the root of a word as well as the word's various inflections. This work proposes a classifier for Urdu verbs (CUV). Building such a classifier remains a challenging research problem because Urdu is a highly inflectional and derivational language. The available stemmers for Urdu do not provide enough information about the inflectional and derivational forms of words, and the morphological classifiers available for Urdu cannot adequately handle various problems or deliver results with few errors. The rule-based CUV designed in this work successfully classifies 63 of 66 forms of Urdu verbs. Urdu language processing tools are very rare compared with those for other highly inflectional languages such as German and Turkish, which have competitive morphological classifiers. Nevertheless, the studies related to Urdu verb morphological classification are identified, and a comparative study is presented in this article. In short, this work is a positive contribution to the community, and it provides sufficient information with promising results, specifically on the inflectional and derivational forms of Urdu verbs.
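The abstract does not list the CUV rules themselves, so the following is only a minimal sketch of what a rule-based suffix classifier can look like, not the author's method. It assumes romanized input and uses four well-known Urdu endings (infinitive -na, imperfective participle -ta, -ti, -te); the rule table `RULES`, the function `classify`, and the labels are all hypothetical names introduced for illustration.

```python
from typing import Optional, Tuple

# Hypothetical rule table of (suffix, label) pairs over romanized Urdu.
# These four endings are an illustrative assumption, not the paper's rule set.
RULES = [
    ("na", "infinitive"),
    ("ta", "imperfective participle, masculine singular"),
    ("ti", "imperfective participle, feminine"),
    ("te", "imperfective participle, masculine plural"),
]


def classify(form: str) -> Optional[Tuple[str, str]]:
    """Return (root, label) for the first rule whose suffix matches, else None."""
    for suffix, label in RULES:
        # Require a non-empty root so a bare suffix is not treated as a verb.
        if form.endswith(suffix) and len(form) > len(suffix):
            return form[: -len(suffix)], label
    return None


if __name__ == "__main__":
    # Forms of the root "likh" (write), used here only as sample input.
    for word in ["likhna", "likhta", "likhti", "likhte"]:
        print(word, classify(word))
```

A table this small will also match non-verbs that happen to end in the same letters, which is one reason a practical classifier needs a root lexicon and many more rules than shown here.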
