{"title":"Language interpretation of German migration discourse (in comparison view of the years 2019 and 2015/16)","authors":"E. Molnárová, Jana Lauková","doi":"10.2478/jazcas-2022-0012","DOIUrl":"https://doi.org/10.2478/jazcas-2022-0012","url":null,"abstract":"Abstract The presented paper is a research dive into the topic of web corpora as well as an analysis of linguistic grasp of the issue of migration from the perspective of social, cultural and cognitive linguistics. The presented research reflects the problem of the construction of the language grasp of this issue in Europe in a selected German mass media discourse. We compare the phenomenon of migration in 2015/2016, when record migration flows to the EU were recorded, and in 2019, when migration kept increasing. The analysis of language grasp of the issue of migration is a part of our scientific research within the project VEGA Xenisms in German and Slovak communications.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121335613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How different types of linguistic corpora shed light (or not) on various categories of substandard lexicon: contrastive analysis of vocabulary in the comedy “Les Kaïra” [Porn in the hood], a typical example of the hood film genre","authors":"Alena Podhorná-Polická, Anne-Caroline Fiévet","doi":"10.2478/jazcas-2022-0017","DOIUrl":"https://doi.org/10.2478/jazcas-2022-0017","url":null,"abstract":"Abstract The arrival of WaC corpora, including the Aranea family of corpora, with their “close-to-spoken language” writing from various informal web pages, brought new options to researchers of sociolects, mainly to those who were previously obliged to observe youth groups in spontaneous discourse, with the consequent time-consuming transcription. Non-spontaneous spoken language from rap songs or youth film dialogues also helps researchers to describe the level of societal diffusion of some typical features of youth slang. In this paper, we demonstrate these combined approaches in order to describe three types of verbs, representing different types of substandard lexicon, used in Les Kaïra (2012), a successful comedy about Parisian peri-urban post-adolescents.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116122932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Consistency of morphological dictionary MorfFlex","authors":"Jaroslava Hlavácová, Marie Mikulová, B. Štěpánková","doi":"10.2478/jazcas-2022-0010","DOIUrl":"https://doi.org/10.2478/jazcas-2022-0010","url":null,"abstract":"Abstract Language corpora usually contain, in addition to their own texts, various types of annotations. The most common one is a morphological annotation, which consists in assigning a lemma and a morphological tag to each wordform. For morphological tagging, morphological dictionaries are traditionally used. Our paper presents a new version of the so-called “Prague” morphological dictionary MorfFlex used for tagging many Czech corpora (particularly Prague Dependency Treebanks, corpora published by the Institute of the Czech National Corpus in Prague or large Czech web corpora of the Aranea series). Three basic principles were used to update the dictionary: the Golden Rule of Morphology, the Principle of Paradigm Unity, and the Principle of Paradigm Uniqueness.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"18 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114100732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Kabyle corpus digital database and exploitation. Test of lexicometric analysis of the identity dimension in the romanesque discourse","authors":"Arezki Ikherbane, Ramdane Boukherrouf, Noura Tigziri","doi":"10.2478/jazcas-2022-0014","DOIUrl":"https://doi.org/10.2478/jazcas-2022-0014","url":null,"abstract":"Abstract The purpose of this contribution is to show, through a preliminary analysis of a corpus sample composed of the first five Kabyle novels (1963-1990), the contribution of lexicometry, as a new method based on statistics, to the treatment of large corpora and the establishment of databases. The aim is to describe all the phases intrinsic to the preliminary processing of a corpus (transcription, tagging and lemmatization) before submitting it to the various stages of its exploitation. Thus, in our corpus, we have opted to deal with the theme of identity induced by the five works by highlighting both the overused vocabulary and the singularity of each work in relation to the corpus as a whole. Before moving on to the quantitative analysis of the vocabulary, however, data preparation is necessary. We focus on the orthographic choices to be adopted in order to remove ambiguities, and on the tagging and lemmatization of the corpus. To do this, we used the Lexico5 computer tool.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"31 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120843932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The style of accompanying dialogues","authors":"Jana Hoffmannová","doi":"10.2478/jazcas-2022-0030","DOIUrl":"https://doi.org/10.2478/jazcas-2022-0030","url":null,"abstract":"Abstract Conversations accompanying collective activities are exceptionally appropriate material for the development of “interactional stylistics” (cf. Orgoňová – Bohunická 2018). They display a number of specific aspects, including a low frequency of full-meaning expressions and, conversely, a high frequency of substitute deictic expressions, used when showing and pointing. Characteristic for these dialogues is the observation of the cooperative principle (Grice 1975), above all through various forms of agreement, strengthened by reduplication and intensification (yeah that’s it that’s it; yeah that’s clear, of course); also through speakers repeating after their interlocutors, but also through emphatic, confirmational repetition of their own expressions or the accompaniment of utterances and turns with strong coreference. Less frequent, but striking, are the expressions of motivated, functional disagreement, gradually eliminated through negotiation. Through the use of all of these means, a specific structure of the conversation is created, often based on the actual coproduction of turns, on non-extensive overlaps and on the use of numerous continuers. Here, verbal communication merges inseparably with gestures, movements and facial expressions, which necessarily leads to the use of methods based on the analysis of video recordings.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133745503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extracting fishing terminology using GNU/Linux tools","authors":"Agnieszka Kaliska","doi":"10.2478/jazcas-2022-0016","DOIUrl":"https://doi.org/10.2478/jazcas-2022-0016","url":null,"abstract":"Abstract The technological revolution of recent decades has made large textual data collections accessible to researchers. At the same time, the development of increasingly sophisticated computer tools provides them with new methods of analyzing texts. In the present study, however, we examine the functionalities offered by traditional tools, namely GNU/Linux tools, which are easily accessible via the command line but still unknown among linguists with little or no computer knowledge. Our goal is to show how, using a web corpus on the one hand and GNU/Linux processing tools on the other, we can extract key terms of fishing jargon.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"224 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114652064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Didactising specialised parallel corpora: the case of European directives","authors":"Elefthéria Dogoriti, Théodore Vyzas","doi":"10.2478/jazcas-2022-0018","DOIUrl":"https://doi.org/10.2478/jazcas-2022-0018","url":null,"abstract":"Abstract Within the framework of a didactic proposal, this article proposes to present a preliminary step to the specialized translation French-Greek. It will attempt to highlight the benefits of autonomous learning through the consultation of a corpus of specialized parallel texts established by the EU institutions. The use of concordancers will provide solutions to students wishing to study the variability of terminology and specialized vocabulary at monolingual and bilingual levels.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134477229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Designing a Corpus of Czech Monologues: Orator v2","authors":"Marie Koprivová, Zuzana Laubeová, D. Lukeš","doi":"10.2478/jazcas-2021-0048","DOIUrl":"https://doi.org/10.2478/jazcas-2021-0048","url":null,"abstract":"Abstract ORATOR v2 is a new 1.5M word corpus of Czech monologues, delivered to a live audience in semi-formal to formal settings. It was designed to chart the space of naturally occurring monologues which can be obtained for corpus processing. As such, it aims for diversity but does not attempt any balancing of subcategories, recognizing that some types of data are inherently easier to obtain in high volume than others. The transcription guidelines and annotation tools employed are the same as other recent spoken corpora published by the CNC, which facilitates interesting comparisons between various types of spoken Czech. The present paper sketches out three case studies, comparing ORATOR to the informal conversations of ORTOFON v2 in terms of the frequencies of demonstratives and hesitations, as well as lexical richness.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128372515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Robust Approach to Variation in Carpathian Rusyn: Resampling-Based Methods for Small Data Sets","authors":"M. Z. Lahjouji-Seppälä, Achim Rabus","doi":"10.2478/jazcas-2021-0055","DOIUrl":"https://doi.org/10.2478/jazcas-2021-0055","url":null,"abstract":"Abstract Quantitative, corpus based research on spontaneous spoken Carpathian Rusyn language can cause several data-related problems: Speakers are using ambivalent forms in different quantities, resulting in a biased data set – while a stricter data-cleaning process would lead to a large scale data loss. On top of that, polytomous categorical dependent variables are hard to analyze due to methodological limitations. This paper provides several approaches to face unbalanced and biased data sets containing variation of conjugational forms of the verb maty ‘to have’ and (po-)znaty ‘to know’ in Carpathian Rusyn language. Using resampling based methods like Cross-Validation, Bootstrapping and Random Forests, we provide a strategy for circumventing possible methodological pitfalls and gaining the most information from our precious data, without trying to p-hack the results. Calculating the predictive power of several sociolinguistic factors on linguistic variation, we can make valid statements about the (sociolinguistic) status of Rusyn and the stability of the old dialect continuum of Rusyn varieties.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129430620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Capturing Numerals and Pronouns at the Morphological Layer in the Prague Dependency Treebanks of Czech","authors":"B. Štěpánková, Marie Mikulová","doi":"10.2478/jazcas-2021-0042","DOIUrl":"https://doi.org/10.2478/jazcas-2021-0042","url":null,"abstract":"Abstract The paper presents a novel and unified morphological description of numerals and pronouns, as compiled for the newest edition of the Prague Dependency Treebank (Prague Dependency Treebank – Consolidated 1.0) and its integral part the morphological dictionary MorfFlex. On the basis of considerable experience with real data annotation and the use of the morphological dictionary, particular changes were proposed. For both of the parts of speech a new set of subtypes was proposed, based mainly on the morphological criterion and its combination with semantic properties and other relevant features, such as definiteness in numerals and possessivity, reflexivity, and clitichood in pronouns. Each subtype has a specific value at the 2nd position of the morphological tag, which serves also as an indicator of the applicability of other tag categories.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128093031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}