Essay-BR: a Brazilian Corpus to Automatic Essay Scoring Task
Jeziel C. Marinho, Rafael T. Anchiêta, Raimundo S. Moura
Journal of Information and Data Management, published 2022-08-15
DOI: https://doi.org/10.5753/jidm.2022.2340
Abstract
Automatic Essay Scoring (AES) is the computer technology that evaluates and scores written essays, aiming to provide computational models that grade essays automatically or with minimal human involvement. While there are several AES studies in a variety of languages, few of them focus on the Portuguese language. The main reason is the lack of a corpus with manually graded essays. To bridge this gap, in this paper we extend a corpus of essays written by Brazilian high school students on an online platform. All of the essays are argumentative and were scored by experts across five competences. Moreover, we conducted an experiment with the extended corpus to show some of the challenges posed by the Portuguese language. The corpus is publicly available at https://github.com/lplnufpi/essay-br.