Julia da Rocha Junqueira, F. Silva, Wesley Costa, Rodrigo Carvalho, A. Bender, U. Corrêa, L. Freitas
The International FLAIRS Conference Proceedings. Published 2023-05-08. DOI: 10.32473/flairs.36.133186 (https://doi.org/10.32473/flairs.36.133186)
BERTimbau in Action: An Investigation of its Abilities in Sentiment Analysis, Aspect Extraction, Hate Speech Detection, and Irony Detection
Social media has revolutionized how individuals, groups, and communities interact. The immense quantity of unstructured data it produces holds valuable information expressed in informal language. However, automatically extracting this information with Natural Language Processing requires adapting traditional methods or developing new strategies that can handle web-prone language. BERT, a Deep Learning methodology proposed by Google in 2018, brought transfer learning to Natural Language Processing. In this work, we used BERTimbau, a BERT model for the Portuguese language, to create models for Sentiment Analysis, Aspect Extraction, Hate Speech Detection, and Irony Detection. We experimented with both BERTimbau models, base and large, and compared the results obtained in each task. BERTimbau-based models improved on classical Machine Learning approaches, reaching F-Measures of 0.88 and 0.89 in the Sentiment Analysis and Hate Speech Detection tasks, respectively.
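The results above are reported as F-Measure, so a short illustration of that metric may help. The following is a minimal sketch, assuming the binary F1 form (harmonic mean of precision and recall for one positive class); the abstract does not say which averaging scheme was used, and the function name `f_measure` and the example labels are hypothetical, not taken from the paper.

```python
from typing import Sequence


def f_measure(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Return the F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives for the chosen class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: precision or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for illustration only (1 = hateful, 0 = not hateful).
gold = [1, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(f"F-Measure: {f_measure(gold, pred):.2f}")  # prints "F-Measure: 0.75"
```

With three true positives, one false positive, and one false negative, precision and recall are both 0.75, so the score is 0.75. A multi-class task such as three-way sentiment would typically average this per-class score across classes, which is again an assumption about the setup rather than something the abstract states.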