{"title":"Removement of MisLeading Data in Transfer Learning Using BERT","authors":"S. Iwamoto, Hiroyuki Shinnou","doi":"10.1109/TAAI57707.2022.00043","DOIUrl":null,"url":null,"abstract":"When using machine learning to solve natural language processing tasks, the domains in which the model is trained and the domains in which the learned model is applied are different. Domain shift problem reduces model performance. Transfer learning using Bidirectional Encoder Representations from Transformers(BERT) is an effective method used for solving this problem. However, even with this method, we face the problem known as “negative transfer”, which occurs when some source labeled data adversely affect the learning in the target domain. In this study, we propose a for removing misleading data, causing negative transfer, for document classification tasks. We demonstrated the effectiveness of our proposed method in an experiment using the Webis-CLS-10 dataset.","PeriodicalId":111620,"journal":{"name":"2022 International Conference on Technologies and Applications of Artificial Intelligence (TAAI)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Technologies and Applications of Artificial Intelligence (TAAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TAAI57707.2022.00043","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
When machine learning is used to solve natural language processing tasks, the domain on which a model is trained often differs from the domain to which the trained model is applied. This domain shift reduces model performance. Transfer learning using Bidirectional Encoder Representations from Transformers (BERT) is an effective method for addressing this problem. However, even with this method, we face the problem known as "negative transfer," which occurs when some labeled source data adversely affect learning in the target domain. In this study, we propose a method for removing the misleading data that cause negative transfer in document classification tasks. We demonstrated the effectiveness of the proposed method in an experiment using the Webis-CLS-10 dataset.
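The sketch below illustrates the general idea of filtering labeled source data before BERT-based transfer learning; it is not the paper's actual selection rule. The choice of backbone (bert-base-uncased), the Hugging Face transformers API, and the scoring criterion (per-example cross-entropy loss compared against a threshold) are all illustrative assumptions.

```python
# Hypothetical sketch: drop source-domain examples that look "misleading"
# (here approximated as high-loss examples) before fine-tuning on the
# document classification task. The threshold criterion is an assumption,
# not the method proposed in the paper.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # assumed backbone
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def example_loss(text: str, label: int) -> float:
    """Cross-entropy loss of one labeled source example under the current model."""
    enc = tokenizer(text, truncation=True, max_length=256, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=torch.tensor([label]))
    return out.loss.item()

def filter_source_data(source_data, threshold=1.0):
    """Keep only examples whose loss is below the threshold; high-loss
    examples are treated as potentially misleading and removed."""
    return [(t, y) for t, y in source_data if example_loss(t, y) < threshold]

# Usage: the retained subset would then be used for ordinary BERT fine-tuning,
# followed by adaptation or evaluation on the target domain.
source_data = [("great product, works well", 1), ("terrible, broke quickly", 0)]
clean_source = filter_source_data(source_data)
```

In practice, any such filtering step trades off coverage of the source domain against the risk of negative transfer: a stricter threshold removes more potentially harmful examples but also shrinks the labeled data available for fine-tuning.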