A Data-Efficient Method for One-Shot Text Classification
H. Wang, Mu Liu, Katsushi Yamashita, Yasuhiro Okamoto, Satoshi Yamada
2022 IEEE 2nd International Conference on Computer Communication and Artificial Intelligence (CCAI), 2022-05-06
DOI: 10.1109/CCAI55564.2022.9807798
Abstract
In this paper, we propose BiGBERT (Binary Grouping BERT), a data-efficient training method for one-shot text classification. Building on the One-vs-Rest idea, we design an extensible output layer for BERT that makes better use of the available training data. To evaluate our approach, we conduct extensive experiments on four widely used text classification datasets, reformulating them into a one-shot training scenario that closely approximates our commercial datasets. Our approach achieves 54.9% accuracy on the 5AbstractsGroup dataset, 40.2% on 20NewsGroup, 57.0% on IMDB, and 33.6% on TREC. Overall, compared to the baseline BERT, the proposed method improves accuracy by 2.3% $\sim$ 28.6%. These results show that BiGBERT is stable and yields significant improvements on one-shot text classification.
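To make the One-vs-Rest output layer concrete, the sketch below shows one plausible way to attach independent binary (class-vs-rest) heads to a BERT encoder in PyTorch. This is a minimal illustration under stated assumptions, not the paper's published implementation; the class names `OneVsRestHead` and `BiGBertSketch` and the exact layer shapes are hypothetical.

```python
# Hedged sketch: a One-vs-Rest ("binary grouping") output layer on top of BERT.
# Assumption: each class gets its own binary discriminator, so new classes can
# be appended without changing the heads already trained (the "extensible" idea).
import torch
import torch.nn as nn
from transformers import BertModel


class OneVsRestHead(nn.Module):
    """One independent binary (class vs. rest) logit per class."""

    def __init__(self, hidden_size: int, num_classes: int):
        super().__init__()
        self.binary_heads = nn.ModuleList(
            [nn.Linear(hidden_size, 1) for _ in range(num_classes)]
        )

    def forward(self, pooled: torch.Tensor) -> torch.Tensor:
        # Each head answers "does this text belong to class k or not?"
        return torch.cat([head(pooled) for head in self.binary_heads], dim=-1)


class BiGBertSketch(nn.Module):
    """Hypothetical BERT encoder + One-vs-Rest output layer."""

    def __init__(self, num_classes: int, model_name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(model_name)
        self.head = OneVsRestHead(self.encoder.config.hidden_size, num_classes)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        pooled = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).pooler_output
        return self.head(pooled)  # per-class "one vs. rest" logits
```

In such a setup, training would typically use `BCEWithLogitsLoss` on the per-class logits (treating each class's examples as positives and all others as negatives), and prediction would take the argmax over the per-class sigmoid scores; the paper does not specify these details here, so they are assumptions.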