Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020最新文献_第4页

How "BERTology" Changed the State-of-the-Art also for Italian NLP “BERTology”如何改变了意大利NLP的技术水平

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8920

F. Tamburini

引用次数: 10

MultiEmotions-It: a New Dataset for Opinion Polarity and Emotion Analysis for Italian multi - emotions - it:意大利语意见极性和情绪分析的新数据集

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8910

R. Sprugnoli

引用次数: 13

Græcissare: Ancient Greek Loanwords in the LiLa Knowledge Base of Linguistic Resources for Latin 拉丁语语言资源LiLa知识库中的古希腊语外来词

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8565

G. Franzini, Federica Zampedri, M. Passarotti, Francesco Mambrini, Giovanni Moretti

引用次数: 4

A Deep Learning Model for the Analysis of Medical Reports in ICD-10 Clinical Coding Task ICD-10临床编码任务中医疗报告分析的深度学习模型

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8834

Marco Polignano, Pierpaolo Basile, M. Degemmis, P. Lops, G. Semeraro

{"title":"A Deep Learning Model for the Analysis of Medical Reports in ICD-10 Clinical Coding Task","authors":"Marco Polignano, Pierpaolo Basile, M. Degemmis, P. Lops, G. Semeraro","doi":"10.4000/books.aaccademia.8834","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8834","url":null,"abstract":"English. The practice of assigning a uniquely identifiable and easily traceable code to pathology from medical diagnoses is an added value to the current modality of archiving health data collected to build the clinical history of each of us. Unfortunately, the enormous amount of possible pathologies and medical conditions has led to the realization of extremely wide international codifications that are difficult to consult even for a human being. This difficulty makes the practice of annotation of diagnoses with ICD-10 codes very cumbersome and rarely performed. In order to support this operation, a classification model was proposed, able to analyze medical diagnoses written in natural language and automatically assign one or more international reference codes. The model has been evaluated on a dataset released in the Spanish language for the eHealth challenge (CodiEsp) of the international conference CLEF 2020, but it could be extended to any language with latin characters. We proposed a model based on a two-step classification process based on BERT and BiLSTM. Although still far from an accuracy sufficient to do without a licensed physician opinion, the results obtained show the feasibility of the task and are a starting point for future studies in this direction. Copyright c © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). Italian. La pratica di assegnare un codice univocamente identificabile e facilmente riconducibile ad una patologia a partire da diagnosi mediche e un valore aggiunto alla attuale modalità di archiviazione dei dati sanitari raccolti per costruire la storia clinica di ciascuno di noi. Purtroppo però, lenorme numero di possibili patologie e condizioni mediche ha portato alla realizzazione di codifiche internazionali estremamente ampie e di difficile consultazione anche per un essere umano. Tale difficolt rende la pratica di annotazione delle diagnosi con i codici ICD-10 molto complessa e raramente svolta. Col fine di supportare tale operazione si è proposto un modello di classificazione, in grado di analizzare le diagnosi mediche scritte in linguaggio naturale ed assegnarle automaticamente uno o più codici internazionali di riferimento. Il modello è stato valutato su un dataset rilasciato in lingua Spagnola per la challenge (CodiEsp) di eHealth della conferenza internazionale CLEF 2020 ma è di semplice estensione su qualsiasi lingua con caratteri latini. Abbiamo proposto un modello basato su due passi di classificazione e basati sullutilizzo di BERT e delle BiLSTM. I risultati ottenuti, seppur ancora lontani da una accuratezza sufficiente per far a meno di un parere di un medico esperto, mostrano la fattibilità del task e si pongono come punto di partenza per futuri studi in tale direzione.","PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126475177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

Polarity Imbalance in Lexicon-based Sentiment Analysis 基于词典的情感分析中的极性不平衡

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8964

Marco Vassallo, G. Gabrieli, Valerio Basile, C. Bosco

{"title":"Polarity Imbalance in Lexicon-based Sentiment Analysis","authors":"Marco Vassallo, G. Gabrieli, Valerio Basile, C. Bosco","doi":"10.4000/books.aaccademia.8964","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8964","url":null,"abstract":"Polarity imbalance is an asymmetric situation that occurs while using parametric threshold values in lexicon-based Sentiment-Analysis (SA). The variation across the thresholds may have an opposite impact on the prediction of negative and positive polarity. We hypothesize that this may be due to asymmetries in the data or in the lexicon, or both. We carry out therefore experiments for evaluating the effect of lexicon and of the topics addressed in the data. Our experiments are based on a weighted version of the Italian linguistic resource MAL (Morphologicallyinflected Affective Lexicon) by using as weighting corpus TWITA, a large-scale corpus of messages from Twitter in Italian. The novel Weighted-MAL (W-MAL), presented for the first time int this paper, achieved better polarity classification results especially for negative tweets, along with alleviating the aforementioned polarity imbalance. Italiano. Lo sbilanciamento della polarità è una situazione di asimmetria che si viene a creare quando si impiegano valori soglia parametrici nella Sentiment Analysis (SA) basata su dizionario. La variazione dei valori soglia può avere un impatto opposto rispetto alla predizione di polarità negativa e positiva. Si ipotizza che questo effetto sia dovuto ad asimmetrie nei dati o nel dizionario, o in entrambi. Abbiamo condotto esperimenti per misurare l’effetto del lessico e degli argomenti trattati nel nostro dataset. I nostri esperimenti sono basati su una versione ponderata della risorsa per l’italiano MAL (Morphologically-inflected Affective Lexicon), usando come corpus per la ponderazione TWITA, un corpus di larga scala di messaggi da Twitter in italiano. La nuova risorsa Weighted-MAL (W-MAL), presentata per la prima volta in questo articolo, ottiene migliori risultati nella classificazione della polarità specialmente, per i messaggi negativi, oltre ad alleviare il problema sopracitato di sbilanciamento","PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120972448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

Tracing Metonymic Relations in T-PAS: An Annotation Exercise on a Corpus-based Resource for Italian T-PAS中转喻关系的追踪:基于语料库的意大利语资源标注练习

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8870

Emma Romani, Elisabetta Jezek

引用次数: 0

Investigating Proactivity in Task-Oriented Dialogues 任务导向对话中的主动性研究

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8243

Vevake Balaraman, B. Magnini

引用次数: 2

Dialog-based Help Desk through Automated Question Answering and Intent Detection 通过自动问答和意图检测的基于对话框的帮助台

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8945

A. Uva, Pierluigi Roberti, Alessandro Moschitti

{"title":"Dialog-based Help Desk through Automated Question Answering and Intent Detection","authors":"A. Uva, Pierluigi Roberti, Alessandro Moschitti","doi":"10.4000/books.aaccademia.8945","DOIUrl":"https://doi.org/10.4000/books.aaccademia.8945","url":null,"abstract":"Modern personal assistants require to access unstructured information in order to successfully fulfill user requests. In this paper, we have studied the use of two machine learning components to design personal assistants: intent classification, to understand the user request, and answer sentence selection, to carry out question answering from unstructured text. The evaluation results derived on five different real-world datasets, associated with different companies, show high accuracy for both tasks. This suggests that modern QA and dialog technology is effective for real-world tasks. I moderni personal assistant richiedono di accedere ad informazioni non strutturate per soddisfare con successo le richieste degli utenti. In questo articolo, abbiamo studiato l’uso dell’ apprendimento automatico per progettare due componenti di un personal assistant: classificazione degli intenti, per comprendere la richiesta dell’utente, e la selezione della frase di risposta per rispondere alle domande con testo non strutturato. I risultati della valutazione derivati da cinque diversi datasets del mondo reale, associati a diverse società, mostrano un’elevata precisione per entrambi i modelli. Ciò suggerisce che la moderna tecnologia di question answering e dialogo è efficace per attività reali.","PeriodicalId":300279,"journal":{"name":"Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020","volume":"210 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115308075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Cross-Language Transformer Adaptation for Frequently Asked Questions 常见问题的跨语言转换器改编

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8463

Luca Di Liello, Daniele Bonadiman, Alessandro Moschitti, Cristina Giannone, A. Favalli, Raniero Romagnoli

引用次数: 0

Italian Transformers Under the Linguistic Lens 语言学镜头下的意大利变形金刚

Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020 Pub Date : 1900-01-01 DOI: 10.4000/books.aaccademia.8745

Alessio Miaschi, Gabriele Sarti, D. Brunato, F. Dell’Orletta, Giulia Venturi

引用次数: 8