Journal of Linguistics/Jazykovedný casopis: Latest Articles

ANOPHONE: An Annotation Tool for Phonemes and L2 Annotation Systems for Czech
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0050
Richard Holaj, Petr Porízka
{"title":"ANOPHONE: An Annotation Tool for Phonemes and L2 Annotation Systems for Czech","authors":"Richard Holaj, Petr Porízka","doi":"10.2478/jazcas-2023-0050","DOIUrl":"https://doi.org/10.2478/jazcas-2023-0050","url":null,"abstract":"Abstract The goal of this text is the presentation of the ANOPHONE annotation system, which allows for the management and annotation of speech data to develop a tool for the automatic transcription of speech of non-native speakers of Czech. This system is currently designed for annotations on the segmental level of recordings of non-native speakers of Czech, with the aim to train automatic speech recognition (ASR) models used in this tool. After an introductory section that discusses the use of technology in pronunciation teaching and mentions some of the e-learning applications for teaching the pronunciation of second languages (L2), we address both general and more specific aspects of speech data annotation to train ASR models and mention attributive and synthetic segmental systems of speech data annotation for Czech as L2. We also briefly introduce the annotation system of non-native speakers of Czech called BV1, which is used for testing the ANOPHONE tool. 
The main part of this text focuses on presenting the annotation tool itself, while the conclusion describes the experience of testing the speech data annotation tool using BV1 annotation system for Czech as L2.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"7 1","pages":"333 - 344"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139371557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The Effect of (Historical) Language Variation on the East Slavic Lects Lematisers Performance
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0040
Ilia Afanasev, Olga Lyashevskaya, Stefan Rebrikov, Yana Shishkina, Igor Trofimov, Natalia Vlasova
{"title":"The Effect of (Historical) Language Variation on the East Slavic Lects Lematisers Performance","authors":"Ilia Afanasev, Olga Lyashevskaya, Stefan Rebrikov, Yana Shishkina, Igor Trofimov, Natalia Vlasova","doi":"10.2478/jazcas-2023-0040","DOIUrl":"https://doi.org/10.2478/jazcas-2023-0040","url":null,"abstract":"Abstract The need to develop tools for historical and regional variations is becoming more urgent in natural language processing. In this paper, we present two candidate systems for lemmatising historical East Slavic lects (Late Old East Slavic and Middle Russian), as well as modern regional East Slavic lects (Belogornoje and Megra): BERT-based end-to-end pipeline with language-specific heuristics and sequence-to-sequence BART-based encoder-decoder. To evaluate their predictions, we use accuracy score and string similarity measures, such as Levenshtein distance. The BERT-based model is more suitable for the regional data, achieving 85% accuracy score, and only 74% on the historical data. BART-based model climbs up to 92.6% accuracy score on the historical data, yet gets only 80% on the regional data. We provide an error analysis and discuss ways to enhance models, such as dictionary lookup and spellchecker.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"20 1","pages":"225 - 233"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139371287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multiple Interpretation and Fragmented Texts Within a Historical Corpus: The Case of Old East Slavic Vernacular Writing
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0044
D. Sitchinava
{"title":"Multiple Interpretation and Fragmented Texts Within a Historical Corpus: The Case of Old East Slavic Vernacular Writing","authors":"D. Sitchinava","doi":"10.2478/jazcas-2023-0044","DOIUrl":"https://doi.org/10.2478/jazcas-2023-0044","url":null,"abstract":"Abstract The paper presents the issue of fragmented and/or ambiguously interpreted texts within the corpora of Old East Slavic vernacular writing. One of these corpora, the corpus of the Old East Slavic birchbark letters, is already available, the other, comprising the texts of Old East Slavic inscriptions, is under preparation. Due to the fragmentary state of many birchbark and epigraphy texts, their lemmatization and grammatical tagging may be uncertain and multiple interpretations may coexist. Some lemmas survive only in fragments which are nevertheless relevant for the study of lexicon. The grammatical status of many fragments may be firmly established despite lacking lexical information. However the relevant data on these fragments is not available in the word indices and corpora that take into consideration only best-preserved word forms. In the paper, the representation and annotation of such word forms within the Old East Slavic vernacular corpora is presented, and relative frequencies of such phenomena within the birchbark letter corpus are shown, with some case studies showing the relevance of the annotation of fragmented forms. 
The existing approaches, namely for the classical epigraphy within the EpiDoc standard and in the Hittite syntactic treebanks, are also briefly presented and compared to the solution found within the Old East Slavic vernacular corpora.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"17 1","pages":"266 - 274"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139371444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Comparative Lexical Analysis of Noun Lemmas in Slovak Judicial Decisions
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0032
Miroslav Zumrík
{"title":"Comparative Lexical Analysis of Noun Lemmas in Slovak Judicial Decisions","authors":"Miroslav Zumrík","doi":"10.2478/jazcas-2023-0032","DOIUrl":"https://doi.org/10.2478/jazcas-2023-0032","url":null,"abstract":"Abstract The paper presents a comparative lexical analysis of the most frequent noun lemmas in Slovak judicial decisions. The data was taken from the large corpus of decisions provided by the Ministry of Justice of the Slovak Republic, and compared with lemmas from four other corpora. The style of administrative, legal and other highly formalized texts has been receiving attention in recent years, although research into the style of Slovak judicial decisions remains rather sparse, which could partially stem from their idiosyncratic nature. The paper’s aim is thus to focus on the quantitative and qualitative characteristics of noun lemmas found specifically in the style of judicial decisions.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"20 1","pages":"140 - 149"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139371971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Differences in Spoken Language Processing in General Corpora (ORAL, ORTOFON) and in a Specialized Corpus (DIALEKT) and their Reflection in the Mapka Application
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0038
M. Waclawicová
{"title":"Differences in Spoken Language Processing in General Corpora (ORAL, ORTOFON) and in a Specialized Corpus (DIALEKT) and their Reflection in the Mapka Application","authors":"M. Waclawicová","doi":"10.2478/jazcas-2023-0038","DOIUrl":"https://doi.org/10.2478/jazcas-2023-0038","url":null,"abstract":"Abstract ORAL and ORTOFON, general corpora of the spoken Czech language, capture authentic and prototypical informal spoken language. DIALEKT, a specialized corpus, represents traditional regional dialects of the Czech language. Since the corpora’s goals and the nature of the captured language data differ, different data collection methods were required. It concerns not only the choice of speakers, but the whole communication situation. Samples chosen from these three corpora are included in the Mapka application and reflect the distinct character of the corpora. The ORAL and ORTOFON samples show general spoken language in various informal situations and capture a wide range of speakers. The DIALEKT samples represent traditional regional dialects spoken by chosen types of speakers in a semiformal situation of guided interview.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"22 1","pages":"204 - 213"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139371333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Statistician, Programmer, Data Scientist? Who is, or Should Be, a Corpus Linguist in the 2020s?
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0023
Łukasz Grabowski
{"title":"Statistician, Programmer, Data Scientist? Who is, or Should Be, a Corpus Linguist in the 2020s?","authors":"Łukasz Grabowski","doi":"10.2478/jazcas-2023-0023","DOIUrl":"https://doi.org/10.2478/jazcas-2023-0023","url":null,"abstract":"Abstract In this short essay, I aim to ruminate on the nature of a corpus linguist’s work in the 2020s, a time marked by unprecedented advancements in the field of computer technologies and artificial intelligence. This seems to be particularly relevant considering the theme of the 12th International Conference Slovko 2023, which is “Natural Language Processing and Corpus Linguistics”. In the last two decades or so, corpus linguistics has drawn extensively from the fields such as statistics, computer science and data science. In many respects corpus linguistics has served as a significant source of inspiration for progress in the field of natural language processing (NLP), leading to the development of large language models (LLMs) as well as recent introduction of conversational artificial intelligence, among others. Thus, in this paper I will make an attempt at identifying the skills that may help rank-and-file or aspiring corpus linguists to survive and, hopefully, flourish in the research field in the 2020s.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"214 1","pages":"52 - 59"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139371442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Spanish Synonyms as Part of a Multilingual Event-Type Ontology
Journal of Linguistics/Jazykovedný casopis Pub Date : 2023-06-01 DOI: 10.2478/jazcas-2023-0033
Cristina Fernández-Alcaina, Eva Fucíková, Jan Hajič, Zdenka Uresová
{"title":"Spanish Synonyms as Part of a Multilingual Event-Type Ontology","authors":"Cristina Fernández-Alcaina, Eva Fucíková, Jan Hajič, Zdenka Uresová","doi":"10.2478/jazcas-2023-0033","DOIUrl":"https://doi.org/10.2478/jazcas-2023-0033","url":null,"abstract":"Abstract This paper presents an ongoing work on the multilingual event-type ontology SynSemClass, where multilingual verbal synonymy is formalized in terms of syntactic and semantic properties. In the ontology, verbs are grouped into synonym classes, both monolingually and cross-lingually. Specifically, verbs are considered to belong to the same class if they both express the same meaning in a specific context, and their valency frame can be mapped to the set of roles defined for a particular class. SynSemClass is built following a bottom-up approach where translational equivalents are automatically extracted from parallel corpora and annotated by human annotators. The task of the annotators consists in mapping the valency frame of a particular verb with the set of roles defined for the class where the verb is included as a potential class member, establishing links to external resources, and selecting relevant examples. The Spanish part of the ontology currently contains 257 classes enriched with Spanish synonyms. 
The resulting resource provides fine-grained syntactic and semantic information on multilingual verbal synonyms and links to other existing monolingual and multilingual resources.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"37 1","pages":"153 - 162"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139371505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Contextual Sensitivity of the Lexemes Učiteľ [Teacher] and Učiteľka [Female Teacher]
Journal of Linguistics/Jazykovedný casopis Pub Date : 2022-12-01 DOI: 10.2478/jazcas-2023-0014
Lujza Urbancová
{"title":"Contextual Sensitivity of the Lexemes Učiteľ [Teacher] and Učiteľka [Female Teacher]","authors":"Lujza Urbancová","doi":"10.2478/jazcas-2023-0014","DOIUrl":"https://doi.org/10.2478/jazcas-2023-0014","url":null,"abstract":"Abstract The study deals with the lexical meaning of lexemes female teacher, teacher (male teacher and a generic meaning of the lexeme), which might be in Slovak influenced by context and discourse as well as by attitudes and gender stereotypes of interlocutors. In pragmatic research, the author focuses on semantic indeterminacy as an implicit component of lexical meaning determined by the socialization of interlocutors. Analysis of the lexemes female teacher, teacher and their collocations with the adjectives typical, crazy, burned out, in different contexts, has shown that the gender of the person referred to has an influence on the meaning. The implicit or socialized meaning of the lexemes of the feminine gender is frequently associated with those phenomena that are perceived negatively in society, while the names of the masculine gender do not contain this component.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126945179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
New Words and Gender Equality in Serbian – Does Discrimination Exist?
Journal of Linguistics/Jazykovedný casopis Pub Date : 2022-12-01 DOI: 10.2478/jazcas-2023-0016
Vesna Đorđević, Jelena Janković, M. Nikolić
{"title":"New Words and Gender Equality in Serbian – Does Discrimination Exist?","authors":"Vesna Đorđević, Jelena Janković, M. Nikolić","doi":"10.2478/jazcas-2023-0016","DOIUrl":"https://doi.org/10.2478/jazcas-2023-0016","url":null,"abstract":"Abstract We examined the general attitude to new feminine titles, as it formed in the media in 2021, and the overall image of social feminine titles currently prevalent in the Serbian media, all by way of ascertaining the reasons for acceptance or non-acceptance of new social feminine titles that were articulated in the media. Having defined the necessary terms (discrimination, gender equality, social feminine title and so on) and after a brief review of the social context that made social feminine titles a hot topic in the Serbian media in 2021, we analysed the relevant media texts that present the various positions on social feminine titles. The method of qualitative content analysis was applied, as it was deemed the most fitting methodological procedure for extracting both the arguments put forward in favour of, and those against social feminine title use. The research corpus consisted of media texts and official announcements by Serbian linguistic institutions on the subject of social feminine titles, collected from January to September of 2021. 
The basic assumption was that the dominant attitude in the media texts would be against new feminine title use, but also that both supporters and opponents of new social feminine titles would feel discriminated against, whether the discrimination came via opposition to or, conversely, via obligatory and consistent use of these terms.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124528657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Similarities and Differences in Linguistic Discrimination Between Slovak and Hungarian Teachers of Hungarian Language and Literature
Journal of Linguistics/Jazykovedný casopis Pub Date : 2022-12-01 DOI: 10.2478/jazcas-2023-0018
I. Jánk
{"title":"Similarities and Differences in Linguistic Discrimination Between Slovak and Hungarian Teachers of Hungarian Language and Literature","authors":"I. Jánk","doi":"10.2478/jazcas-2023-0018","DOIUrl":"https://doi.org/10.2478/jazcas-2023-0018","url":null,"abstract":"Abstract The purpose of this study is to demonstrate the presence of linguistic discrimination in pedagogical situations, especially in pedagogical evaluation. The paper is based on a survey which involved 502 Hungarian Language and Literature teachers and teacher trainees from Hungary (N = 216), Slovakia (N = 128), Romania (N = 108) and Ukraine (N = 50). Data were primarily collected through a technique similar to matched-guise tests; however, the method of the present research had some additional complexity. The article discusses similarities and differences in linguistic discrimination between Slovak and Hungarian teachers who teach Hungarian Language and Literature. The question it raises is whether there are any differences between the two samples. The results of the mentioned research show that the presence of linguistic discrimination is powerful in both samples, but there are differences in its strength and realization.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"169 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116921350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0