Journal of Linguistics/Jazykovedný casopis: Latest Articles

The Intercorp Parallel Corpus with a Uniform Annotation for All Languages
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0043
Alexandr Rosen
Abstract: Recently, the language-specific morphosyntactic annotation of InterCorp, a large multilingual parallel corpus, has been replaced by language-uniform morphosyntactic and syntactic annotation following the guidelines of the Universal Dependencies project. Because the corpus is used predominantly by human users via a token-based concordancer, the CoNLL-U format produced by the UDPipe parser has been extended by attributes such as the lemma of the token’s syntactic head or the morphosyntactic categories of the content verb’s auxiliary. We conclude that despite some theoretical and practical issues, the new annotation is a promising solution to the issue of mutually incompatible tagsets within a single corpus.
Pages: 254-265
Citations: 0
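The abstract above mentions extending CoNLL-U with attributes such as the lemma of a token's syntactic head. The following is only a minimal sketch of that general idea, not the InterCorp implementation: the function name `add_head_lemma`, the output dictionary keys, and the three-token Czech sample are all illustrative assumptions. It relies only on the standard CoNLL-U column order (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).

```python
# Hypothetical sample sentence in CoNLL-U (tab-separated, 10 columns).
SAMPLE = """\
1\tPsi\tpes\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\tštěkají\tštěkat\tVERB\t_\t_\t0\troot\t_\t_
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def add_head_lemma(conllu_sentence):
    """Return one dict per token, enriched with the lemma of its syntactic head."""
    rows = [
        line.split("\t")
        for line in conllu_sentence.splitlines()
        if line and not line.startswith("#")  # skip blank and comment lines
    ]
    # Index lemmas by token ID; plain integer IDs only, so multiword-token
    # ranges ("1-2") and empty nodes ("1.1") are not used as heads.
    lemma_by_id = {r[0]: r[2] for r in rows if r[0].isdigit()}
    tokens = []
    for r in rows:
        head = r[6]
        # HEAD == 0 marks the root, which has no head lemma of its own.
        head_lemma = "ROOT" if head == "0" else lemma_by_id.get(head, "_")
        tokens.append(
            {"form": r[1], "lemma": r[2], "deprel": r[7], "head_lemma": head_lemma}
        )
    return tokens


if __name__ == "__main__":
    for token in add_head_lemma(SAMPLE):
        print(token)
```

Storing the head lemma directly on each token is what lets a token-based concordancer query it without traversing the dependency tree.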
The Economy of Czech Exchange in the Slovak Marketplace of Austria After the Fall of Hungary
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0022
Martin Diweg-Pukanec
Abstract: Lexical quanta in writing and speech are important indicators of an individual’s social class. This paper analyses the lexical features of the utterances produced by writers/speakers from different social classes in terms of word length, based on a representative sample of letters from the Kremnica archive. The study found that writers/speakers from different social classes showed different average word lengths in the 2nd half of the 16th century. According to the analysis of the letters, writers/speakers from the upper classes produced utterances with longer words than those from the lower classes. These differences are explained by factors related to the individual’s social class background and his “right to speech” or “right to be read”. The linguistic economy principle, whose objective is to save time and energy by conveying more information with less effort, is thus far from exhaustive and by no means reflects the whole sociolinguistic reality.
Pages: 43-51
Citations: 0
Verbification of Feminine Forms of Adjectives Można ‘Possible’, Niemożna ‘Impossible’ and Niepodobna ‘Impossible’ – Corpus-Based Approach
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0019
Renata Bronikowska
Abstract: The article is devoted to the process taking place in the Middle Polish period which led to the transformation of nominative, singular, feminine forms of three adjectives (można ‘possible’, niemożna ‘impossible’ and niepodobna ‘impossible’) into verbal lexemes (the so-called predicatives). In this respect, the predicative uses of these forms in the texts collected in the Electronic Corpus of 17th- and 18th-century Polish Texts (up to 1772) were investigated. The progressive verbification of adjectival forms was considered to be indicated by three changes in the constructions where these forms played the role of a predicate: supersession of connections with feminine verb forms by connections with neuter forms, limiting the connections with verbs to the auxiliary verb być ‘to be’, and disappearance of connections with the personal form of the verb być in the present tense. The research results show that both forms acquired the two most important features characteristic of predicatives during the 17th century. The third of the analysed properties characterizes the form można/niemożna from the second half of the 19th century, and the process of its acquisition by the form niepodobna has not ended yet.
Pages: 9-18
Citations: 0
Syllabic Consonants in Historical Czech and How to Identify Them
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0055
Markéta Ziková, Martin Březina, Radek Čech, Pavel Kosek
Abstract: The paper provides fine-grained evidence concerning the development of the syllabic consonants /r l/ in Czech, which is only sketched in the existing literature. The evidence is based on an automatic parser that identifies potential syllable-projecting segments according to sonority. The parser was applied to six verse texts from the 14th–16th centuries, which show a strong tendency towards octosyllabicity. The data provided by the parser newly reveal that the shift from non-syllabic to syllabic /r l/ is position-dependent: word-medial non-syllabic strings C(r/l)C change more rapidly than non-syllabic word-final ones C(r/l)#. This finding is in line with the cross-linguistic observation that non-syllabic C(r/l)C are marked, hence they are regularly syllabified prior to the less marked C(r/l)#.
Pages: 391-400
Citations: 0
The Competition of German Adjectival Suffixes
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0026
Filip Kalaš
Abstract: The paper presents a corpus-linguistic perspective on the two adjectival suffixes -al and -ell in the German language. Its attention is focused on the distributional frequency of the derived adjectives, their semantic motivation and their contextual occurrence through the lens of retrieved adjective + noun collocations. In addition, the paper attempts to determine the superiority of such derived adjectives in the specialised vocabulary.
Pages: 81-91
Citations: 0
Towards a Corpus-Based Dictionary of Verbal Government for the Russian Language
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0035
Eduard Klyshinsky, A. Bogdanova, Mikhail Kopotev
Abstract: This paper introduces a technique for automatic verbal government extraction in the Russian language, which encapsulates information on the grammatical features of verb-noun co-occurrences, encompassing both prepositional and non-prepositional dependencies. For the construction of the dictionary, a corpus of approximately 3.5 billion words was used. The proposed method involves syntactic parsing of the texts, filtering of the resultant outputs, and creating a dictionary of prepositional government. After error filtering, the dictionary contains ca. 18,000 verbs along with the NP/PPs governed by these verbs.
Pages: 173-181
Citations: 0
Slovak Language Models for Basic Preprocessing Tasks in Python
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0049
D. Hládek, Maros Harahus, J. Staš, Matus Pleva
Abstract: We propose a Slovak language model for the spaCy library in Python. The model is easy to use for basic natural language processing tasks and comes in a single package. The package contains several components for basic preprocessing tasks, such as tokenization, sentence boundary detection, syntactic parsing, lemmatization, named entity recognition, morphological analysis, and word vectors. It is based on the state-of-the-art monolingual SlovakBERT model. Named entity recognition is trained on a separate, publicly available WikiAnn database. The other statistical classifiers use the Slovak Dependency Treebank corpus. Morphological tags are compatible with the conventions of the Slovak National Corpus. The part-of-speech tags follow the conventions of the Universal Dependencies framework. We trained a separate word vector model on a web-based corpus. The training uses fastText with the Floret modification. We present a series of experiments that confirm that the model performs similarly to models for other languages on all tasks. Training scripts and data are publicly available.
Pages: 323-332
Citations: 0
Proverbs in Contemporary Czech. Corpus Probe into Written Texts
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0027
Marie Koprivová, Kateřina Šichová
Abstract: The paper deals with the possibility of creating a paremiological optimum for students of Czech as a foreign language. The selection of proverbs should reflect the frequency, familiarity with and use of proverbs. The study focuses on the most frequent proverbs in written Czech, using contemporary idiomatically annotated corpora. On this basis, our own minimum was created. The paper compares the results with previous studies on the paremiological minima of Czech (Schindler 1993 and Čermák 2003) and shows the intersection of all three minima.
Pages: 92-99
Citations: 0
Text Vectorization Techniques Based on Wordnet
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0048
D. Držík, Kirsten Šteflovič
Abstract: The utilization of text vectorization techniques has become essential for numerous classification tasks in present-day natural language processing. Word embedding methods commonly used today, such as Word2Vec, GloVe, etc., are based on the semantic similarity of words. WordNet, as a lexical database of words, provides a rich source of semantic information. In our article, we propose a text vectorization technique using text data extended with a data augmentation method, specifically by replacing words with their synonyms obtained from WordNet. The results obtained from text classification tasks using multiple classifiers demonstrate that expanding the corpus with this method leads to improved vector representations of words.
Pages: 310-322
Citations: 0
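The synonym-replacement augmentation described in the abstract above can be illustrated with a small sketch. This is not the authors' code: the helper names `augment` and `expand_corpus` are hypothetical, and the hand-written `TOY_SYNONYMS` dictionary is an assumption standing in for synonym lookups in WordNet so that the example runs without any external resources.

```python
import random

# Assumption: a tiny hand-made synonym table standing in for WordNet lookups.
TOY_SYNONYMS = {
    "quick": ["fast", "rapid"],
    "film": ["movie"],
    "good": ["fine"],
}


def augment(sentence, synonyms, rng):
    """Replace every word that has a known synonym with a randomly chosen one."""
    out = []
    for word in sentence.split():
        options = synonyms.get(word.lower())
        out.append(rng.choice(options) if options else word)
    return " ".join(out)


def expand_corpus(sentences, synonyms, seed=0):
    """Return the original sentences plus one augmented variant per changed sentence."""
    rng = random.Random(seed)  # fixed seed keeps the augmentation reproducible
    expanded = list(sentences)
    for sentence in sentences:
        variant = augment(sentence, synonyms, rng)
        if variant != sentence:  # skip sentences with no replaceable word
            expanded.append(variant)
    return expanded


if __name__ == "__main__":
    corpus = ["a good film", "no match here"]
    for line in expand_corpus(corpus, TOY_SYNONYMS):
        print(line)
```

In a setup like the one the abstract describes, the expanded corpus, rather than the original one, would then be passed to the embedding trainer.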
Dative Ambiguity in Russian: A Corpus Induced Study
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0025
Edyta Jurkiewicz-Rohrbacher
Abstract: When describing the Russian dative case, an observation often made in passing is that its assignment to certain types of arguments is ambiguous, particularly in constructions with a predicative infinitive. Thus far, no studies have put this problem into focus or described the range of structures to which it applies. I approach this problem with corpus-driven methods. The present study shows that the predicate order and the referential prominence hierarchy can be used as explanatory variables in the modelling of semantic-syntactic role assignment.
Pages: 70-80
Citations: 0