{"title":"A Case Study of Natural Gender Phenomena in Translation. A Comparison of Google Translate, Bing Microsoft Translator and DeepL for English to Italian, French and Spanish","authors":"Argentina Anna Rescigno, Eva Vanmassenhove, J. Monti, Andy Way","doi":"10.4000/books.aaccademia.8844","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8844","url":null,"abstract":"This paper presents the results of an evaluation of Google Translate, DeepL and Bing Microsoft Translator with reference to natural gender translation and provides statistics about the frequency of female, male and neutral forms in the translations of a list of personality adjectives, and nouns referring to professions and bigender nouns. The evaluation is carried out for English→Spanish, English→Italian and English→French.","PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"159 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122762140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How Granularity of Orthography-Phonology Mappings Affect Reading Development: Evidence from a Computational Model of English Word Reading and Spelling","authors":"A. Lim, B. O’Brien, Luca Onnis","doi":"10.4000/books.aaccademia.8628","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8628","url":null,"abstract":"It is widely held that children implicitly learn the structure of their writing system through statistical learning of spelling-to-sound mappings. Yet an unresolved question is how to sequence reading experience so that children can ‘pick up’ the structure optimally. We tackle this question here using a computational model of encoding and decoding. The order of presentation of words was manipulated so that they exhibited two distinct progressions of granularity of spelling-to-sound mappings. We found that under a training regime that introduced written words progressively from small-to-large granularity, the network exhibited an early advantage in reading acquisition as compared to a regime introducing written words from large-to-small granularity. Our results thus provide support for the grain size theory (Ziegler and Goswami, 2005) and demonstrate that the order of learning can influence learning trajectories of literacy skills.","PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124370545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Domain Adaptation for Text Classification with Weird Embeddings","authors":"Valerio Basile","doi":"10.4000/books.aaccademia.8250","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8250","url":null,"abstract":"Pre-trained word embeddings are often used to initialize deep learning models for text classification, as a way to inject precomputed lexical knowledge and boost the learning process. However, such embeddings are usually trained on generic corpora, while text classification tasks are often domain-specific. We propose a fully automated method to adapt pre-trained word embeddings to any given classification task, that needs no additional resource other than the original training set. The method is based on the concept of word weirdness, extended to score the words in the training set according to how characteristic they are with respect to the labels of a text classification dataset. The polarized weirdness scores are then used to update the word embeddings to reflect task-specific semantic shifts. Our experiments show that this method is beneficial to the performance of several text classification tasks in different languages.","PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117204780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Becoming JILDA","authors":"Irene Sucameli, Alessandro Lenci, B. Magnini, M. Simi, Manuela Speranza","doi":"10.4000/books.aaccademia.8915","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8915","url":null,"abstract":"English. The difficulty in finding useful dialogic data to train a conversational agent is an open issue even nowadays, when chatbots and spoken dialogue systems are widely used. For this reason we decided to build JILDA, a novel data collection of chat-based dialogues, produced by Italian native speakers and related to the job-offer domain. JILDA is the first dialogue collection related to this domain for the Italian language. Because of its collection modalities, we believe that JILDA can be a useful resource not only for the Italian research community, but also for the international one. Italiano. Negli ultimi anni l’utilizzo di chatbot e sistemi dialogici è diventato sempre più comune; tuttavia, il reperimento di dati di apprendimento adeguati per addestrare agenti conversazionali costituisce ancora una questione irrisolta. Per questo motivo abbiamo deciso di produrre JILDA, un nuovo dataset di dialoghi relativi al dominio della ricerca del lavoro e realizzati via chat da parlanti nativi italiani. JILDA costituisce la prima collezione di dialoghi relativi a questo dominio, in lingua italiana. Per gli aspetti metodologici e la modalità di raccolta dei dati, riteniamo che una simile risorsa possa essere utile ed interessante non solo per la comunità di ricerca italiana ma anche per quella internazionale.","PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122178507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyses of Character Emotions in Dramatic Works by Using EmoLex Unigrams","authors":"Mehmet Can Yavuz","doi":"10.4000/books.aaccademia.9004","DOIUrl":"https://doi.org/10.4000/books.aaccademia.9004","url":null,"abstract":"In theatrical pieces, written language is the primary medium for establishing antagonisms. As one of the most important figures of the Renaissance, Shakespeare wrote characters which express themselves clearly. Thus, the emotional landscape of the plays can be revealed from the texts. It is important to analyze such landscapes for further demonstrating these structures. We use a word-emotion association lexicon with eight basic emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) and two sentiments (negative and positive). By using this lexicon, the emotional state of each character is represented in a 10-dimensional space and mapped onto a plane. These principal axes planes position each character relatively. Additionally, the temporal-emotional evaluation of each play is graphed. We conclude that the protagonist and the antagonist have different emotional states from the rest and these two emotionally oppose each other. The temporal-emotional timelines of the plays are meaningful to have a better insight into the tragedies.","PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126292994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The \"Corpus Anchise 320\" and the Analysis of Conversations between Healthcare Workers and People with Dementia","authors":"Nicola Benvenuti, Andrea Bolioli, A. Mazzei, Pietro Vigorelli, A. Bosca","doi":"10.4000/books.aaccademia.8260","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8260","url":null,"abstract":"The aim of this research was to create the first Italian corpus of free conversations between healthcare workers and people with dementia, in order to investigate specific linguistic phenomena from a computational point of view. Most of the previous researches on speech disorders of people with dementia have been based on qualitative analysis, or on the study of a few dozen cases executed in laboratory conditions, and not in spontaneous speech (in particular for the Italian language). The creation of the Corpus Anchise 320 aims to investigate Dementia language by providing a broader number of dialogues collected in ecological conditions and obtained transcribing spontaneous speech. Moreover, quantitative linguistic analysis can show some peculiarities of this language.","PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124240892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"(Stem and Word) Predictability in Italian Verb Paradigms: An Entropy-Based Study Exploiting the New Resource LeFFI","authors":"Matteo Pellegrini, A. T. Cignarella","doi":"10.4000/books.aaccademia.8830","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8830","url":null,"abstract":"English. In this paper we present LeFFI, an inflected lexicon of Italian listing all the available wordforms of 2,053 verbs. We then use this resource to perform an entropy-based analysis of the mutual predictability of wordforms within Italian verb paradigms, and compare our findings to the ones of previous work on stem predictability in Italian verb inflection.","PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114855565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}