{"title":"Correspondences between Czech and English Coreferential Expressions","authors":"M. Novák, A. Nedoluzhko","doi":"10.4000/DISCOURS.9058","DOIUrl":"https://doi.org/10.4000/DISCOURS.9058","url":null,"abstract":"In this work, we present a comprehensive study on correspondences between certain classes of coreferential expressions in English and Czech. We focus on central pronouns, relative pronouns, and anaphoric zeros. We designed an alignment-refining algorithm for English personal and possessive pronouns and Czech relative pronouns that improves the quality of alignment links not only for the classes it targets but also in general. Moreover, the instances of anaphoric expressions we focus on were manually annotated with their alignment counterparts, which served as a basis for this empirical study. The collected statistics of correspondences are contrasted with theoretical assumptions regarding the use of anaphoric means in the languages under analysis, such as pro-drop properties, the use of finite and non-finite constructions, etc. Finally, we present ways in which the observed correspondences can be exploited in cross-lingual coreference resolution.","PeriodicalId":51977,"journal":{"name":"Discours-Revue de Linguistique Psycholinguistique et Informatique","volume":null,"pages":null},"PeriodicalIF":0.5,"publicationDate":"2015-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88376402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nouveaux médias et orthographe. Incompétence ou pluricompétence","authors":"Lénaïs Maskens, Louise-Amélie Cougnon, Sophie Roekhaut, Cédrick Fairon","doi":"10.4000/DISCOURS.9020","DOIUrl":"https://doi.org/10.4000/DISCOURS.9020","url":null,"abstract":"This study examines whether users of new communication media possess a pluricompetence that lets them move from traditional writing to CEMO (communication écrite médiée par ordinateur, i.e. computer-mediated written communication) in the same way that they switch registers. We collected written productions from 14- to 15-year-olds across two media (electronic / paper) and in three communication situations (dictation, classroom activity, Facebook) in order to study the influence of these variables on the handling of spelling. The dictation results indicate a relatively low level (one error every 5 or 6 words), with a majority of grammatical errors, which is consistent with previous studies on the subject. Examination of the units common to the three corpora shows that the standard written form is found in at least one of the corpora (if not several), and this holds for all pupils. The same type of common-unit analysis carried out on the Facebook corpus alone shows that the pupils master the standard form in a large number of cases (88% of forms). Finally, we observe that the range of spelling variants used in Facebook conversations is fairly limited (mainly abbreviations, smileys, and echo characters) and that the compression rate of forms is fairly low, indicating that most forms are preserved in full or shortened by a single character.","PeriodicalId":51977,"journal":{"name":"Discours-Revue de Linguistique Psycholinguistique et Informatique","volume":null,"pages":null},"PeriodicalIF":0.5,"publicationDate":"2015-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80645221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Anaphore possessive et anaphore associative : le cas des noms collectifs","authors":"M. Salles","doi":"10.4000/DISCOURS.8981","DOIUrl":"https://doi.org/10.4000/DISCOURS.8981","url":null,"abstract":"This article is devoted to the different anaphoric relations that can hold between a collective noun (e.g. régiment, caravane, forêt) and the nouns denoting its members (e.g. soldats, chameaux, arbres, for the three preceding collective nouns): associative anaphora in sequences such as un régiment… les soldats, possessive anaphora from the collection to the members (un régiment… ses soldats) or from the members to the collection (des soldats… leur régiment), among other examples. Sometimes highly flexible anaphorically when compared with a closely related semantic relation such as meronymy (e.g. arbre / tronc, voiture / moteur), the member-collection relation also shows surprising restrictions. Both this flexibility and these restrictions are explained by the referential properties of collections (their internal plurality and homogeneity) and by certain semantic properties of the member-collection relation (notably the generally non-relational, or categorematic, character of the member noun). We focus more specifically here on the alternation between the associative definite and the possessive before member nouns. The internal homogeneity that characterizes collections explains why associative anaphora is not possible with certain nouns, namely generic member nouns (which, unlike the others, are relational; e.g. membre, élément): these leave no room for the differentiation that associative anaphora requires and therefore impose the use of the possessive. The generally non-relational character of the member noun explains why the possessive is not a real competitor to the associative definite in the other cases. Finally, we point out that when all the conditions are met for the use of either determiner, the referential choice is not without consequence for the interpretation of coherence relations.","PeriodicalId":51977,"journal":{"name":"Discours-Revue de Linguistique Psycholinguistique et Informatique","volume":null,"pages":null},"PeriodicalIF":0.5,"publicationDate":"2015-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86339898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dislocation clitique de l’objet à gauche en français écrit","authors":"Etienne Riou, Barbara Hemforth","doi":"10.4000/DISCOURS.9037","DOIUrl":"https://doi.org/10.4000/DISCOURS.9037","url":null,"abstract":"This article addresses the acceptability of syntactic constructions in light of the distinction between spoken and written French. We treat this distinction as one factor within a multifactorial approach that includes the pragmatic, stylistic, and syntactic constraints influencing the acceptability of an utterance. We argue that the acceptability of a construction associated with spoken French, clitic left dislocation of the object, can be improved in a written context by modifying the information structure of the discourse. We present a series of experiments testing the acceptability of the construction in writing under various information-structural constraints. We propose a model that includes the same constraints for spoken and written French by integrating modality as a predictor.","PeriodicalId":51977,"journal":{"name":"Discours-Revue de Linguistique Psycholinguistique et Informatique","volume":null,"pages":null},"PeriodicalIF":0.5,"publicationDate":"2015-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73959951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MDMA. Un modèle pour l’identification et l’annotation des marqueurs discursifs « potentiels » en contexte","authors":"C. Bolly, Ludivine Crible, Liesbeth Degand, Deniz Uygur-Distexhe","doi":"10.4000/DISCOURS.9009","DOIUrl":"https://doi.org/10.4000/DISCOURS.9009","url":null,"abstract":"Starting from the observation that there is no closed category of discourse markers (DMs) and that the definition of these markers varies considerably with the epistemological framework adopted, the MDMA project (“Model for Discourse Marker Annotation”) aims to establish an empirical method for identifying and annotating DMs in spoken French. The method aims first to describe DMs as bundles of variables and then, from a combinatorial point of view, as specific patterns. Our approach comprises three steps: (i) manual identification of all so-called “potential” DMs in a balanced corpus of spoken French (5,000 words; Belgium and France); (ii) automatic extraction of all forms corresponding to the potential DMs identified previously (1,181 occurrences); (iii) parametric analysis of a random sample of 200 potential DMs in context (syntactic, formal, and semantic-pragmatic variables). The hypothesis is that statistical analysis of the distributional constraints imposed on the different potential DMs should reveal a certain hierarchy among the annotated variables with respect to their relevance, reliability, and generalizability (or even their specificity). In this article, we present the principles of DM annotation, then address the issue of inter-rater agreement, and finally discuss in more depth the results of the corpus analysis.","PeriodicalId":51977,"journal":{"name":"Discours-Revue de Linguistique Psycholinguistique et Informatique","volume":null,"pages":null},"PeriodicalIF":0.5,"publicationDate":"2015-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87001893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluative Meaning and Cohesion: The Structuring Function of Evaluative Meaning in Scientific Writing","authors":"Stefania Degaetano-Ortlieb","doi":"10.4000/DISCOURS.9053","DOIUrl":"https://doi.org/10.4000/DISCOURS.9053","url":null,"abstract":"We present a diachronic study of evaluative meaning in scientific writing, focusing on evaluative expressions that possibly serve the interpersonal as well as the textual metafunction in terms of Systemic Functional Linguistics (SFL). These are expressions such as importantly or obviously used in sentence-initial position to evaluate what follows but which also establish a cohesive link with the adjacent discourse. For the analysis, the SciTex corpus, comprising nine scientific disciplines, was used. The data were analyzed in macro and micro-analytical steps combining quantitative and qualitative analyses. This allows us to observe generalizable trends as well as fine-grained distinctions.","PeriodicalId":51977,"journal":{"name":"Discours-Revue de Linguistique Psycholinguistique et Informatique","volume":null,"pages":null},"PeriodicalIF":0.5,"publicationDate":"2015-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75970162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Quantitative Analysis of Discourse Phenomena in Machine Translation","authors":"Carolina Scarton, Lucia Specia","doi":"10.4000/DISCOURS.9047","DOIUrl":"https://doi.org/10.4000/DISCOURS.9047","url":null,"abstract":"State-of-the-art Machine Translation (MT) systems translate documents by considering isolated sentences, disregarding information beyond sentence level. As a result, machine-translated documents often contain problems related to discourse coherence and cohesion. Recently, some initiatives in the evaluation and quality estimation of MT outputs have attempted to detect discourse problems in order to assess the quality of these machine translations. However, a quantitative analysis of discourse phenomena in MT outputs is still needed in order to better understand the phenomena and identify possible solutions or ways to improve evaluation. This paper aims to answer the following questions: What is the impact of discourse phenomena on MT quality? Can we capture and measure quantitatively any issues related to discourse in MT outputs? In order to answer these questions, we present a quantitative analysis of several discourse phenomena and correlate the resulting figures with scores from automatic translation quality evaluation metrics. We show that figures related to discourse phenomena present a higher correlation with quality scores than the baseline counts widely used for quality estimation of MT.","PeriodicalId":51977,"journal":{"name":"Discours-Revue de Linguistique Psycholinguistique et Informatique","volume":null,"pages":null},"PeriodicalIF":0.5,"publicationDate":"2015-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74511388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quelques orientations méthodologiques pour étudier la gestuelle dans des corpus spontanés et semi-contrôlés","authors":"Marion Tellier","doi":"10.4000/DISCOURS.8917","DOIUrl":"https://doi.org/10.4000/DISCOURS.8917","url":null,"abstract":"This article provides a methodological overview for the study of speech from a multimodal perspective. The aim is therefore not to report results but to give a behind-the-scenes look at a multimodal study by laying out the methodological questions that any researcher must consider before undertaking one. The article thus addresses the collection of the materials that make up a multimodal corpus, annotation (in particular the design of an annotation scheme), and finally the counter-coding methodology used to validate annotations. It therefore presents various studies and reports on several methodological experiments. Particular emphasis is placed on studies of co-speech gesture, while also considering how these methodological approaches transfer to other modalities such as facial expressions, gaze, or posture.","PeriodicalId":51977,"journal":{"name":"Discours-Revue de Linguistique Psycholinguistique et Informatique","volume":null,"pages":null},"PeriodicalIF":0.5,"publicationDate":"2014-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87187371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Expressions parenthétiques dans un corpus parallèle français-grec : les adverbiaux de conviction personnelle","authors":"Fryni Kakoyianni-Doa","doi":"10.4000/DISCOURS.8929","DOIUrl":"https://doi.org/10.4000/DISCOURS.8929","url":null,"abstract":"This article is devoted to the descriptive and comparative study of a subset of French and Greek sentence adverbials, termed parenthetical, that refer to the speaker’s personal conviction regarding the information conveyed. More specifically, it examines, in an authentic corpus, adverbial forms of personal conviction that are closely related to the parenthetical verbs croire and penser.","PeriodicalId":51977,"journal":{"name":"Discours-Revue de Linguistique Psycholinguistique et Informatique","volume":null,"pages":null},"PeriodicalIF":0.5,"publicationDate":"2014-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85581678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Authorial Presence in French and English: “Pronoun + Verb” Patterns in Biology and Medicine Research Articles","authors":"Laura M. Hartwell, Marie-Paule Jacques","doi":"10.4000/DISCOURS.8941","DOIUrl":"https://doi.org/10.4000/DISCOURS.8941","url":null,"abstract":"Certain subjective qualities of scientific research articles are exposed when authors refer to themselves through various means including pronoun use. Drawing upon the online bilingual “Scientext” corpus, we compare personal pronoun and syntactically linked verb constructions within 180 published articles in English and French in the fields of medicine and biology. This study found that overall pronoun frequency was significantly higher (χ² = 69.45, df = 1, p < 0.001) in English (22.6 per 10,000) than in French (14 per 10,000) and that the French on [one] (23.8%) was significantly more frequent (χ² = 163.35, df = 1, p < 0.001) than the English pronoun “one” (3.8%). Furthermore, while most French verbs were limited to the present and passé composé, English conjugation was distributed mainly between the simple past, the simple present, and the present perfect. Both the lexis and the conjugation vary with the progression of the research article and the author roles of researcher, writer, arguer, and evaluator. This paper contributes to the discussion of the representation of objectivity in scientific discourse.","PeriodicalId":51977,"journal":{"name":"Discours-Revue de Linguistique Psycholinguistique et Informatique","volume":null,"pages":null},"PeriodicalIF":0.5,"publicationDate":"2014-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91225701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}