Analytical grammar forms extraction as a new challenge for corpora (Case of conditional mood in Polish and Ukrainian)

IF 1.8 · Zone 3, Earth Sciences · Q2, Paleontology
S. Fokin
DOI: 10.17651/polon.42.9 (https://doi.org/10.17651/polon.42.9)
Publication date: 2022-01-01
Publication type: Journal Article
Citations: 0

Abstract

A particular challenge for modern textual corpora is the tagging of analytical grammar categories. The components of these categories may be separated in certain contexts by other words or may even be inverted. A particular interest regarding the selection of analytical grammatical forms is centred around the conditional mood in some Slavic languages, as expressed by means of two words: a past verb form and the particle by/б/би/бы, which is why in most modern corpora, this category lacks a specific tag for these compound forms. The case of Polish is particularly complicated because the particle by may either be merged with the participle or used separately; furthermore, its separated form may contain a personal verb ending. Specific queries subject to experiment on Polish and Ukrainian corpora allow selecting the analytical forms in question.
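The kind of query the abstract describes can be illustrated with a minimal Python sketch. It is not the author's query and does not use any real corpus API: the input format (a list of `(form, tag)` pairs), the tag name `PAST`, the function name `find_conditionals`, and the three-token search window are all assumptions made for illustration. The sketch covers the three situations the abstract names: a particle merged with the past form, a separate particle that may carry a personal ending, and a particle that precedes or follows the verb with other words in between.

```python
import re
from typing import List, Tuple

# Hypothetical tag marking a past verb form; real corpora use their own tagsets.
PAST = "PAST"

# Separate particle: Polish "by" with an optional personal ending
# (bym, byś, byśmy, byście), or Ukrainian "би" / "б".
PARTICLE = re.compile(r"^(by(m|ś|śmy|ście)?|би|б)$", re.IGNORECASE)

# Polish merged form: past stem in ł/ła/ło/li/ły + "by" + optional ending,
# e.g. "zrobiłbym". This is a rough surface heuristic, not a full morphology.
MERGED = re.compile(r"^\w+(ł|ła|ło|li|ły)by(m|ś|śmy|ście)?$", re.IGNORECASE)


def find_conditionals(
    tokens: List[Tuple[str, str]], window: int = 3
) -> List[Tuple[int, int]]:
    """Return (start, end) token index spans of candidate conditional forms."""
    hits = []
    for i, (form, tag) in enumerate(tokens):
        # Case 1: particle merged with the past form in a single token.
        if MERGED.match(form):
            hits.append((i, i))
            continue
        if tag != PAST:
            continue
        # Case 2: separate particle within `window` tokens on either side,
        # which also covers inverted order (particle before the verb).
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and PARTICLE.match(tokens[j][0]):
                hits.append((min(i, j), max(i, j)))
                break
    return hits


polish = [("Ja", "PRON"), ("bym", "PART"), ("to", "PRON"), ("zrobił", "PAST")]
ukrainian = [("Я", "PRON"), ("зробив", "PAST"), ("би", "PART"), ("це", "PRON")]
print(find_conditionals(polish))
print(find_conditionals(ukrainian))
```

A fixed window is a deliberate simplification: it keeps the search cheap but can pair a verb with an unrelated particle or miss a distant one, and it ignores Polish particles attached to conjunctions (for example żebym), so results from such a query would still need manual checking.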
Source journal
Acta Palaeontologica Polonica (Earth Sciences, Paleontology)
CiteScore: 2.80
Self-citation rate: 5.60%
Articles published: 36
Review time: 12.5 months
About the journal: Acta Palaeontologica Polonica is an international quarterly journal publishing papers of general interest from all areas of paleontology. Since its founding by Roman Kozłowski in 1956, various currents of modern paleontology have been represented in the contents of the journal, especially those rooted in biologically oriented paleontology, an area he helped establish. In-depth studies of all kinds of fossils, of the mode of life of ancient organisms and structure of their skeletons are welcome, as are those offering stratigraphically ordered evidence of evolution. Work on vertebrates and applications of fossil evidence to developmental studies, both ontogeny and astogeny of clonal organisms, have a long tradition in our journal. Evolution of the biosphere and its ecosystems, as inferred from geochemical evidence, has also been the focus of studies published in the journal.