Fine-Tuning IndoBERT Model for Big Five Personality Prediction from Indonesian Social Media
G. Pradnyana, Wiwik Anggraeni, E. M. Yuniarno, M. Purnomo
2023 International Seminar on Intelligent Technology and Its Applications (ISITIA)
Published: 2023-07-26
DOI: 10.1109/ISITIA59021.2023.10221074 (https://doi.org/10.1109/ISITIA59021.2023.10221074)
Citation count: 0
Abstract
The growing volume of data generated on social media creates opportunities to extract useful knowledge and information, including predictions of a person's personality. Attention-based classification techniques and transformers have produced promising results on many Natural Language Processing (NLP) tasks. In this study, we predict the Big Five personality traits from Indonesian-language social media by fine-tuning the IndoBERT model, a monolingual model based on Bidirectional Encoder Representations from Transformers (BERT) and trained on an Indonesian corpus. In our experiments, the proposed model achieved an accuracy of 72% and an F1-score of 71% with IndoBERT-base. With IndoBERT-large, accuracy rose to 78% and the F1-score to 74%. The proposed model also outperforms previous models for Big Five personality prediction.
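To make the two reported metrics concrete, the following is a minimal sketch of how accuracy and an F1-score can be computed from predicted and reference labels. It rests on assumptions not stated in the abstract: a single-label setup with one Big Five trait per sample, macro-averaging of the per-class F1, and hypothetical toy labels (`O`, `C`, `E`, `A`, `N`) that are not data from the study. The function names `accuracy` and `macro_f1` are illustrative only.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging, not confirmed by the paper)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, one Big Five trait per sample (not data from the study).
y_true = ["O", "C", "E", "A", "N", "O", "C", "E"]
y_pred = ["O", "C", "E", "A", "O", "O", "E", "E"]

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.2f}")
```

On imbalanced classes, accuracy and macro F1 can diverge, which is one reason both figures are commonly reported together.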