Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020: Latest Papers

How "BERTology" Changed the State-of-the-Art also for Italian NLP
Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8920
F. Tamburini
Abstract: The use of contextualised word embeddings allowed for a relevant performance increase for almost all Natural Language Processing (NLP) applications. Recently some new models especially developed for Italian became available to scholars. This work aims at evaluating the impact of these models in enhancing application performance for Italian, establishing the new state-of-the-art for some fundamental NLP tasks.
Citations: 10
MultiEmotions-It: a New Dataset for Opinion Polarity and Emotion Analysis for Italian
Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8910
R. Sprugnoli
Abstract: This paper presents a new linguistic resource for Italian, called MultiEmotions-It, containing comments to music videos and advertisements posted on YouTube and Facebook. These comments are manually annotated according to four different dimensions: relatedness, opinion polarity, emotions and sarcasm. For the annotation of emotions we adopted Plutchik’s model, taking into account both basic and complex emotions, i.e. dyads.
Citations: 13
Græcissare: Ancient Greek Loanwords in the LiLa Knowledge Base of Linguistic Resources for Latin
Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8565
G. Franzini, Federica Zampedri, M. Passarotti, Francesco Mambrini, Giovanni Moretti
Abstract: This paper describes the addition of an index of 1,763 Ancient Greek loanwords to the collection of Latin lemmas of the LiLa: Linking Latin Knowledge Base of interoperable linguistic resources. This lexical resource increases LiLa’s lemma count and tunes its underlying data model to etymological borrowing.
Citations: 4
A Deep Learning Model for the Analysis of Medical Reports in ICD-10 Clinical Coding Task
Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8834
Marco Polignano, Pierpaolo Basile, M. Degemmis, P. Lops, G. Semeraro
Abstract: The practice of assigning a uniquely identifiable and easily traceable code to pathology from medical diagnoses is an added value to the current modality of archiving health data collected to build the clinical history of each of us. Unfortunately, the enormous amount of possible pathologies and medical conditions has led to the realization of extremely wide international codifications that are difficult to consult even for a human being. This difficulty makes the practice of annotation of diagnoses with ICD-10 codes very cumbersome and rarely performed. In order to support this operation, a classification model was proposed, able to analyze medical diagnoses written in natural language and automatically assign one or more international reference codes. The model has been evaluated on a dataset released in the Spanish language for the eHealth challenge (CodiEsp) of the international conference CLEF 2020, but it could be extended to any language with Latin characters. We proposed a model based on a two-step classification process based on BERT and BiLSTM. Although still far from an accuracy sufficient to do without a licensed physician opinion, the results obtained show the feasibility of the task and are a starting point for future studies in this direction. Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Citations: 1
Polarity Imbalance in Lexicon-based Sentiment Analysis
Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8964
Marco Vassallo, G. Gabrieli, Valerio Basile, C. Bosco
Abstract: Polarity imbalance is an asymmetric situation that occurs while using parametric threshold values in lexicon-based Sentiment Analysis (SA). The variation across the thresholds may have an opposite impact on the prediction of negative and positive polarity. We hypothesize that this may be due to asymmetries in the data or in the lexicon, or both. We therefore carry out experiments for evaluating the effect of the lexicon and of the topics addressed in the data. Our experiments are based on a weighted version of the Italian linguistic resource MAL (Morphologically-inflected Affective Lexicon), using as weighting corpus TWITA, a large-scale corpus of messages from Twitter in Italian. The novel Weighted-MAL (W-MAL), presented for the first time in this paper, achieved better polarity classification results, especially for negative tweets, along with alleviating the aforementioned polarity imbalance.
Citations: 3
Tracing Metonymic Relations in T-PAS: An Annotation Exercise on a Corpus-based Resource for Italian
Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8870
Emma Romani, Elisabetta Jezek
Abstract: In this paper we address the main issues and results of a research thesis (Romani, 2020) dedicated to the annotation of metonymies in T-PAS, a corpus-based digital repository of Italian verbal patterns (Ježek et al., 2014). The annotation was performed on the corpus instances of a selected list of 30 verbs and was aimed at both implementing the resource with metonymic patterns and identifying and creating a map of the metonymic relations that occur in the verbal patterns. The annotated corpus data (consisting of 1218 corpus instances), the patterns, and the relations can be useful for NLP tasks such as metonymy recognition.
Citations: 0
Investigating Proactivity in Task-Oriented Dialogues
Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8243
Vevake Balaraman, B. Magnini
Abstract: Proactivity (i.e., the capacity to provide useful information even when not explicitly required) is a fundamental characteristic of human dialogues. Although current task-oriented dialogue systems are good at providing information explicitly requested by the user, they are poor in exhibiting proactivity, which is typical in human-human interactions. In this study, we investigate the presence of proactive behaviours in several available dialogue collections, both human-human and human-machine, and show how the data acquisition decision affects the proactive behaviour present in the dataset. We adopt a two-step approach to semi-automatically detect proactive situations in the datasets, where proactivity is not annotated, and show that the dialogues collected with approaches that provide more freedom to the agent/user exhibit high proactivity.
Citations: 2
Dialog-based Help Desk through Automated Question Answering and Intent Detection
Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8945
A. Uva, Pierluigi Roberti, Alessandro Moschitti
Abstract: Modern personal assistants require to access unstructured information in order to successfully fulfill user requests. In this paper, we have studied the use of two machine learning components to design personal assistants: intent classification, to understand the user request, and answer sentence selection, to carry out question answering from unstructured text. The evaluation results derived on five different real-world datasets, associated with different companies, show high accuracy for both tasks. This suggests that modern QA and dialog technology is effective for real-world tasks.
Citations: 0
Cross-Language Transformer Adaptation for Frequently Asked Questions
Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8463
Luca Di Liello, Daniele Bonadiman, Alessandro Moschitti, Cristina Giannone, A. Favalli, Raniero Romagnoli
Abstract: Transfer learning has been proven to be effective, especially when data for the target domain/task is scarce. Sometimes data for a similar task is only available in another language because it may be very specific. In this paper, we explore the use of machine-translated data to transfer models on a related domain. Specifically, we transfer models from the question duplication task (QDT) to similar FAQ selection tasks. The source domain is the well-known English Quora dataset, while the target domain is a collection of small Italian datasets for real case scenarios consisting of FAQ groups retrieved by pivoting on common answers. Our results show great improvements in the zero-shot learning setting and modest improvements using the standard transfer approach for direct in-domain adaptation.
Citations: 0
Italian Transformers Under the Linguistic Lens
Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8745
Alessio Miaschi, Gabriele Sarti, D. Brunato, F. Dell’Orletta, Giulia Venturi
Abstract: In this paper we present an in-depth investigation of the linguistic knowledge encoded by the transformer models currently available for the Italian language. In particular, we investigate whether and how using different architectures of probing models affects the performance of Italian transformers in encoding a wide spectrum of linguistic features. Moreover, we explore how this implicit knowledge varies according to different textual genres.
Citations: 8