Diogo Fernandes Costa Silva, A. Junior, Gabriel Marques, A. Soares, A. R. G. Filho
{"title":"CEIA-NLP在CASE 2022任务1:葡萄牙语抗议新闻检测","authors":"Diogo Fernandes Costa Silva, A. Junior, Gabriel Marques, A. Soares, A. R. G. Filho","doi":"10.18653/v1/2022.case-1.26","DOIUrl":null,"url":null,"abstract":"This paper summarizes our work on the document classification subtask of Multilingual protest news detection of the CASE @ ACL-IJCNLP 2022 workshok. In this context, we investigate the performance of monolingual and multilingual transformer-based models in low data resources, taking Portuguese as an example and evaluating language models on document classification. Our approach became the winning solution in Portuguese document classification achieving 0.8007 F1 Score on Test set. The experimental results demonstrate that multilingual models achieve best results in scenarios with few dataset samples of specific language, because we can train models using datasets from other languages of the same task and domain.","PeriodicalId":80307,"journal":{"name":"The Case manager","volume":"16 1","pages":"184-188"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"CEIA-NLP at CASE 2022 Task 1: Protest News Detection for Portuguese\",\"authors\":\"Diogo Fernandes Costa Silva, A. Junior, Gabriel Marques, A. Soares, A. R. G. Filho\",\"doi\":\"10.18653/v1/2022.case-1.26\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper summarizes our work on the document classification subtask of Multilingual protest news detection of the CASE @ ACL-IJCNLP 2022 workshok. In this context, we investigate the performance of monolingual and multilingual transformer-based models in low data resources, taking Portuguese as an example and evaluating language models on document classification. Our approach became the winning solution in Portuguese document classification achieving 0.8007 F1 Score on Test set. 
The experimental results demonstrate that multilingual models achieve best results in scenarios with few dataset samples of specific language, because we can train models using datasets from other languages of the same task and domain.\",\"PeriodicalId\":80307,\"journal\":{\"name\":\"The Case manager\",\"volume\":\"16 1\",\"pages\":\"184-188\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"The Case manager\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.18653/v1/2022.case-1.26\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The Case manager","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/2022.case-1.26","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
CEIA-NLP at CASE 2022 Task 1: Protest News Detection for Portuguese
This paper summarizes our work on the document classification subtask of the Multilingual Protest News Detection task at the CASE @ ACL-IJCNLP 2022 workshop. In this context, we investigate the performance of monolingual and multilingual transformer-based models in low-resource settings, taking Portuguese as an example and evaluating language models on document classification. Our approach was the winning solution for Portuguese document classification, achieving an F1 score of 0.8007 on the test set. The experimental results show that multilingual models achieve the best results when few samples are available in the target language, because they can also be trained on datasets in other languages from the same task and domain.
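The idea of pooling data from other languages can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy corpus, the function names `pool_training_data` and `macro_f1`, and the choice of macro-averaged F1 are all assumptions made for illustration (the abstract reports only an "F1 score" and does not describe the data format). The sketch shows how a Portuguese-only training set differs from a pooled multilingual one, and how an F1 score could be computed on predictions.

```python
# Hypothetical toy corpus: (language, text, label), where label 1 = protest news.
# These examples are invented for illustration and are not from the shared task.
CORPUS = [
    ("en", "Workers marched downtown demanding higher wages", 1),
    ("en", "The central bank kept interest rates unchanged", 0),
    ("es", "Miles de estudiantes protestaron frente al congreso", 1),
    ("es", "El equipo local ganó el partido del domingo", 0),
    ("pt", "Manifestantes bloquearam a avenida em protesto", 1),
    ("pt", "A prefeitura inaugurou uma nova biblioteca", 0),
]


def pool_training_data(corpus, target_lang, include_other_langs=True):
    """Return (text, label) pairs for the target language,
    optionally pooled with examples from every other language."""
    return [
        (text, label)
        for lang, text, label in corpus
        if include_other_langs or lang == target_lang
    ]


def macro_f1(y_true, y_pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted instances contributes 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Monolingual setting: only the Portuguese examples are available.
monolingual = pool_training_data(CORPUS, "pt", include_other_langs=False)
# Multilingual setting: the same task's data in other languages is added.
multilingual = pool_training_data(CORPUS, "pt", include_other_langs=True)

print(len(monolingual), len(multilingual))
print(round(macro_f1([1, 0, 1, 0], [1, 0, 0, 0]), 4))
```

In a real experiment, the pooled set would be used to fine-tune a multilingual transformer, while the monolingual set would be used with a Portuguese-only model; the sketch covers only the data selection and scoring steps.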