EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020: Latest Papers

University of Padova @ DIACR-Ita
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020 Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7618
Benyou Wang, Emanuele Di Buccio, M. Melucci
Abstract: Semantic change detection in a relatively low-resource language like Italian is challenging. By using contextualized word embeddings, we formalize the task as a distance metric for two flexible-size sets of vectors. Various distance metrics are used: average Euclidean distance, average Canberra distance, Hausdorff distance, and Jensen–Shannon divergence between cluster distributions based on K-means clustering and a Gaussian mixture model. The final prediction is given by an ensemble of top-ranked words based on each distance metric. The proposed method achieved better performance than frequency- and collocation-based baselines.
Citations: 2
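The University of Padova abstract above frames semantic change as a distance between two variable-size sets of contextualized vectors, one set per time period. The following is a minimal sketch of the named metrics in plain Python, not the authors' implementation. The function names, the toy vectors, and the choice of base-2 logarithm are assumptions made for illustration. The clustering step (K-means or Gaussian mixture) is omitted: `jensen_shannon` simply takes two cluster-membership distributions as given.

```python
import math
from itertools import product


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def canberra(u, v):
    """Canberra distance; coordinates where both values are 0 contribute 0."""
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def average_pairwise(set_a, set_b, dist):
    """Mean of dist(u, v) over every cross-set pair (u in set_a, v in set_b)."""
    pairs = list(product(set_a, set_b))
    return sum(dist(u, v) for u, v in pairs) / len(pairs)


def hausdorff(set_a, set_b, dist=euclidean):
    """Symmetric Hausdorff distance between two finite sets of vectors."""
    def directed(xs, ys):
        # Largest distance from a point in xs to its nearest neighbour in ys.
        return max(min(dist(x, y) for y in ys) for x in xs)
    return max(directed(set_a, set_b), directed(set_b, set_a))


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical toy "usages" of one word in two periods (2-d for readability).
old_usages = [[0.0, 0.0], [1.0, 0.0]]
new_usages = [[0.0, 3.0], [4.0, 0.0], [1.0, 0.0]]

scores = {
    "avg_euclidean": average_pairwise(old_usages, new_usages, euclidean),
    "avg_canberra": average_pairwise(old_usages, new_usages, canberra),
    "hausdorff": hausdorff(old_usages, new_usages),
}
```

Because the two sets may differ in size, every metric here is defined over cross-set pairs or over distributions rather than over aligned vectors. Ranking words by each score and merging the top-ranked lists would correspond to the ensemble step the abstract mentions.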
Flattening the Curve of the COVID-19 Infodemic: These Evaluation Campaigns Can Help!
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020 Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.6752
Preslav Nakov
Abstract: The World Health Organization acknowledged that “The 2019-nCoV outbreak and response has been accompanied by a massive ‘infodemic’ ... that makes it hard for people to find trustworthy sources and reliable guidance when they need it.” While fighting this infodemic is typically thought of in terms of factuality, the problem is much broader as malicious content includes not only “fake news”, rumors, and conspiracy theories, but also promotion of fake cures, panic, racism, xenophobia, and mistrust in the authorities, among others. Thus, we argue for the need of a holistic approach combining the perspectives of journalists, fact-checkers, policymakers, social media platforms, and society as a whole, and we present our initial work in this direction. We further discuss evaluation campaigns at CLEF and SemEval that feature relevant tasks (not necessarily focusing on COVID-19). One relevant evaluation campaign is the CLEF CheckThat! Lab, which has focused on tasks that make human fact-checkers more productive: spotting check-worthy claims (in tweets, political debates, and speeches), determining whether these claims have been previously fact-checked, retrieving relevant pages and passages, and finally, making a prediction about the factuality of the claims. There have also been a number of relevant SemEval tasks related to factuality, e.g., on rumor detection and verification in social media, on fact-checking in community question answering forums, and on stance detection. Other relevant SemEval tasks have looked beyond factuality, focusing on intent, e.g., on offensive language detection in social media, as well as on spotting the use of propaganda techniques (e.g., appeal to emotions, fear, prejudices, logical fallacies, etc.) in the news and in memes (text + image). Of course, relevant tasks can also be found beyond CLEF and SemEval; most notably, this includes FEVER and the Fake News Challenge. Finally, we demonstrate two systems developed at the Qatar Computing Research Institute, HBKU, to address some of the above challenges: one reflecting the proposed holistic approach, and one on fine-grained propaganda detection. The latter system, Prta (https://www.tanbih.org/prta), was featured at ACL-2020 with a Best Demo Award (Honorable Mention).
Citations: 0
DaDoEval @ EVALITA 2020: Same-Genre and Cross-Genre Dating of Historical Documents
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020 Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7590
S. Menini, Giovanni Moretti, R. Sprugnoli, Sara Tonelli
Abstract: In this paper we introduce the DaDoEval shared task at EVALITA 2020, aimed at automatically assigning temporal information to documents written in Italian. The evaluation exercise comprises three levels of temporal granularity, from coarse-grained to year-based, and includes two types of test sets, either of the same genre as the training set or of a different one. More specifically, DaDoEval deals with the corpus of Alcide De Gasperi’s documents, providing both public documents and letters as test sets. Two systems participated in the competition, achieving results above the baseline in all subtasks. As expected, coarse-grained classification into five periods is rather easy to perform automatically, while the year-based one is still an unsolved problem, partly because of the lack of sufficient training data for some years. Results also showed that, although De Gasperi’s letters in our test set were written in standard Italian and in a style which was not too colloquial, cross-genre classification yields remarkably lower results than the same-genre setting.
Citations: 7
YNU_OXZ @ HaSpeeDe 2 and AMI: XLM-RoBERTa with Ordered Neurons LSTM for Classification Task at EVALITA 2020
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020 Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.6912
Xiaozhi Ou, Hongling Li
Abstract: This paper describes the system that team YNU_OXZ submitted for EVALITA 2020. We participate in the shared tasks on Automatic Misogyny Identification (AMI) and Hate Speech Detection (HaSpeeDe 2) at the 7th evaluation campaign, EVALITA 2020. For HaSpeeDe 2, we participate in Task A, Hate Speech Detection, and submitted two-run results for the news headline test and the tweets headline test, respectively. Our submitted run is based on the pre-trained multilingual model XLM-RoBERTa, whose output is fed into a Convolutional Neural Network with K-max Pooling (CNN + K-max Pooling). Then, an Ordered Neurons LSTM (ON-LSTM) is applied to the resulting representation, which is passed to a linear decision function. For the AMI shared task on the automatic identification of misogynous content in the Italian language, we participate in Subtask A on Misogyny & Aggressive Behaviour Identification. Our system is similar to the one defined for HaSpeeDe 2 and is based on the pre-trained multilingual model XLM-RoBERTa, an Ordered Neurons LSTM (ON-LSTM), a Capsule Network, and a final classifier.
Citations: 11
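The YNU_OXZ abstract above mentions a "CNN + K-max Pooling" stage without spelling it out. As a rough illustration only, the sketch below shows k-max pooling in its commonly used form: for each feature map, keep the k largest activations while preserving their original order along the sequence. The function names and toy values are hypothetical, and the abstract does not say which exact variant or value of k the team used.

```python
def k_max_pooling(values, k):
    """Return the k largest entries of `values`, kept in their original order."""
    if k <= 0:
        raise ValueError("k must be positive")
    k = min(k, len(values))
    # Indices of the k largest activations (stable for ties).
    top = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    # Re-sort the selected indices so sequence order is preserved.
    return [values[i] for i in sorted(top)]


def k_max_pool_feature_maps(feature_maps, k):
    """Apply k-max pooling independently to each feature map (channel)."""
    return [k_max_pooling(channel, k) for channel in feature_maps]


# Hypothetical activations of two convolutional feature maps over five positions.
feature_maps = [
    [0.1, 0.9, 0.3, 0.7, 0.2],
    [0.5, 0.4, 0.8, 0.1, 0.6],
]
pooled = k_max_pool_feature_maps(feature_maps, k=3)
```

Unlike plain max pooling, which keeps one value per feature map, this keeps several strong activations together with their relative order, so a sequence model applied afterwards still sees an ordered input of fixed length.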
CHANGE-IT @ EVALITA 2020: Change Headlines, Adapt News, GEnerate (short paper)
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020 Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7250
Lorenzo De Mattei, Michele Cafagna, F. Dell’Orletta, M. Nissim, Albert Gatt
Abstract: We propose a generation task for Italian, more specifically a style transfer task for headlines of Italian newspapers. This is the first shared task on generation included in the EVALITA evaluation framework. Indeed, one of the reasons to have this task is to stimulate more research on generation within the Italian community. With this aim in mind, we release to the participating teams not only training data, but also a baseline sequence-to-sequence model that performs the task, in order to help everyone get started, even when not accustomed to Natural Language Generation (NLG) approaches. In this context, we also explore the complex issue of automatic evaluation of generated text, which is receiving particular attention in the NLG community.
1 Task and Motivation: We propose a generation task for Italian in the context of the EVALITA 2020 campaign (Basile et al., 2020). More specifically, we design a style transfer task for headlines of Italian newspapers. We believe it is the first time that a shared task on generation is offered in the context of EVALITA. Indeed, one of the reasons to have this task is to stimulate more research on generation within the Italian community. With this goal in mind, we release to the potential participating teams not only training data, but also a baseline sequence-to-sequence model that performs the task, in order to help everyone get started, even when not yet accustomed to generation models. This baseline model casts the style transfer problem as an extreme summarisation task, showing how versatile the problem is in terms of possible approaches. In this context, the task will help to further explore the complex issue of evaluation of generated text, which is receiving particular attention in the international Natural Language Generation community (Gatt and Krahmer, 2018; van der Lee et al., 2019).
Task: The task is cast as a “headline translation” problem, and it is as follows. Given a collection of headlines from two Italian newspapers at opposite ends of the political spectrum, call them G and R, change all G-headlines to headlines in style R, and all R-headlines to headlines in style G. In the context of this task we need to take care of two crucial aspects: data and evaluation. Details on data are provided in Section 2, and on evaluation in Section 3.
Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Citations: 3
ItaliaNLP @ TAG-IT: UmBERTo for Author Profiling at TAG-it 2020 (short paper)
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020 Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7297
Daniela Occhipinti, A. Tesei, Maria Iacono, C. Aliprandi, Lorenzo De Mattei
Abstract: In this paper we describe the systems we used to participate in the TAG-it task of EVALITA 2020. The first system we developed uses a linear Support Vector Machine as its learning algorithm. The other two systems are based on the pretrained Italian language model UmBERTo: one follows the Multi-Task Learning approach, while the other follows the Single-Task Learning approach. These systems were evaluated on the official TAG-it test sets and ranked first in all the TAG-it subtasks, demonstrating the validity of the approaches we followed.
Citations: 1
ATE_ABSITA @ EVALITA2020: Overview of the Aspect Term Extraction and Aspect-based Sentiment Analysis Task
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020 Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.6849
Lorenzo De Mattei, Graziella De Martino, Andrea Iovine, Alessio Miaschi, Marco Polignano, Giulia Rambelli
Abstract: Over the last years, the rise of novel sentiment analysis techniques to assess aspect-based opinions on product reviews has become a key component for providing valuable insights to both consumers and businesses. To this end, we propose ATE_ABSITA: the EVALITA 2020 shared task on Aspect Term Extraction and Aspect-Based Sentiment Analysis. In particular, we approach the task as a cascade of three subtasks: Aspect Term Extraction (ATE), Aspect-based Sentiment Analysis (ABSA) and Sentiment Analysis (SA). Therefore, we invited participants to submit systems designed to automatically identify the “aspect term” in each review and to predict the sentiment expressed for each aspect, along with the sentiment of the entire review. The task received broad interest, with 27 teams registered and more than 45 participants. However, only three teams submitted their working systems. The results obtained underline the task’s difficulty, but they also show how it is possible to deal with it using innovative approaches and models. Indeed, two of them are based on large pre-trained language models, as is typical in the current state of the art for the English language (de Mattei et al., 2020). Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Citations: 9
SSNCSE-NLP @ EVALITA2020: Textual and Contextual Stance Detection from Tweets Using Machine Learning Approach (short paper)
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020 Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7224
B. Bharathi, J. Bhuvana, Nitin Nikamanth Appiah Balaji
Abstract: Opinions expressed via online social media platforms can be used to analyse the stand taken by the public about any event or topic. Recognizing the stand taken is known as stance detection. In this paper, an automatic stance detection approach is proposed that uses both deep learning-based feature extraction and hand-crafted feature extraction. BERT is used as a feature extraction scheme along with stylistic, structural, contextual and community-based features extracted from tweets to build a machine learning-based model. This work uses a multilayer perceptron to classify tweets as favour, against, or neutral. The dataset used is provided by the SardiStance task, with tweets in Italian about the Sardines movement. Several variants of models were built with different feature combinations and are compared against the baseline model provided by the task organisers. The models with BERT alone and with BERT combined with other contextual features proved to be the best-performing models, outperforming the baseline model.
Citations: 0
TAG-it @ EVALITA2020: Overview of the Topic, Age, and Gender Prediction Task for Italian
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020 Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7262
Andrea Cimino, F. Dell’Orletta, M. Nissim
Abstract: The Topic, Age, and Gender (TAG-it) prediction task in Italian was organised in the context of EVALITA 2020, using forum posts as textual evidence for profiling their authors. The task was articulated in two separate subtasks: one where all three dimensions (topic, gender, age) were to be predicted at once; the other where training and test sets were drawn from different forum topics and gender or age had to be predicted separately. Teams tackled the problems with classical machine learning methods as well as neural models. Using the training data to fine-tune a BERT-based monolingual model for Italian eventually proved to be the most successful strategy in both subtasks. We observe that topic and gender are easier to predict than age. The higher results for gender obtained in this shared task with respect to a comparable challenge at EVALITA 2018 might be due to the larger evidence per author provided at this edition, as well as to the availability of large pre-trained models for fine-tuning, which have shown improvement on very many NLP tasks.
Citations: 6
UniBA @ KIPoS: A Hybrid Approach for Part-of-Speech Tagging (short paper)
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020 Pub Date : 1900-01-01 DOI: 10.4000/BOOKS.AACCADEMIA.7773
Giovanni Luca Izzi, S. Ferilli
Abstract: Part-of-speech tagging is becoming increasingly important as it represents the starting point for other high-level operations such as Speech Recognition, Machine Translation, Parsing and Information Retrieval. Although state-of-the-art POS taggers reach a high level of accuracy (around 96-97%), POS tagging cannot yet be considered a solved problem because there are many variables to take into account. For example, most of these systems use lexical knowledge to assign a tag to unknown words. The task solution proposed in this work is based on a hybrid tagger, which does not use any prior lexical knowledge, consisting of two different types of POS taggers used sequentially: an HMM tagger and RDRPOSTagger (Nguyen et al., 2014; Nguyen et al., 2016). We trained the hybrid model using the Development set and the combination of the Development and Silver sets. The results show an accuracy of 0.8114 and 0.8100, respectively, for the main task. Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Citations: 0