A lexicon of Albanian for natural language processing

IF 0.3 · LANGUAGE & LINGUISTICS
Besim Kabashi
Journal: LEXICOGRAPHICA · Published: 2018-04-20 · Type: Journal Article · DOI: 10.1515/LEX-2018-340112 (https://doi.org/10.1515/LEX-2018-340112)
Citations: 4

Abstract

Many applications in natural language processing require a lexicon. This paper presents such a lexicon for Albanian. It contains around 75,000 entries, including proper names such as personal, geographical and other names. Each entry carries grammatical information such as part of speech, along with more specific information, e.g. inflection classes for nouns, adjectives and verbs. The lexicon is part of a morphological tool, but it can also be used as an independent resource for other tasks and applications, or adapted for them. It was created and extended using both traditional dictionaries, e.g. spelling dictionaries, and a balanced linguistic corpus processed with corpus-driven methods and tools. The lexicon is still a work in progress, but it aims to cover the basic information needed for the most frequent natural language processing tasks.
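To make the entry structure described in the abstract concrete, here is a minimal sketch in Python of how a lexicon entry with a part of speech and an optional inflection class could be represented and looked up. The abstract does not specify the lexicon's actual format, so everything here is an assumption for illustration: the field names, the Universal Dependencies style tags (NOUN, VERB, PROPN), the inflection class labels such as "N-f-01", and the helper functions `build_index` and `lookup` are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LexiconEntry:
    """One lexicon entry. All field names are hypothetical."""
    lemma: str
    pos: str                                 # part of speech (UD-style tag, an assumption)
    inflection_class: Optional[str] = None   # placeholder label, not the paper's scheme
    name_type: Optional[str] = None          # e.g. "personal" or "geographical" for proper names


# A few illustrative entries; the class labels are invented placeholders.
ENTRIES = [
    LexiconEntry("shtëpi", "NOUN", inflection_class="N-f-01"),
    LexiconEntry("punoj", "VERB", inflection_class="V-01"),
    LexiconEntry("Tiranë", "PROPN", name_type="geographical"),
]


def build_index(entries: List[LexiconEntry]) -> Dict[str, List[LexiconEntry]]:
    """Group entries by lemma so that homographs share one key."""
    index = defaultdict(list)
    for entry in entries:
        index[entry.lemma].append(entry)
    return dict(index)


def lookup(index: Dict[str, List[LexiconEntry]], lemma: str) -> List[LexiconEntry]:
    """Return all entries for a lemma, or an empty list if it is unknown."""
    return index.get(lemma, [])


INDEX = build_index(ENTRIES)
```

Mapping each lemma to a list of entries, rather than to a single entry, leaves room for a word form that belongs to more than one part of speech. A downstream tool could then call `lookup(INDEX, "shtëpi")` and choose among the returned entries.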
Source journal: LEXICOGRAPHICA (CiteScore: 0.70; self-citation rate: 0.00%)