{"title":"Improving BERT Classification Performance on Short Queries About UNIX Commands Using an Additional Round of Fine-Tuning on Related Data","authors":"Grady McPeak","doi":"10.1109/AICT55583.2022.10013532","DOIUrl":null,"url":null,"abstract":"One of the great advantages of machine learning as a whole is its ability to assist a human with digesting extremely large sets of data, and helping them to learn useful information that otherwise would have been significantly more difficult to piece together, and improvements to ML models often can result in improvements in this ability. To that end, this paper presents an evaluation of the relative performances of differing versions of Bidirectional Encoder Representations from Transformers (BERT) on the task of classifying a dataset of titles from posts scraped from two UNIX-related Q&A forum websites into classes based on what command each post is most likely about. The differing versions of BERT were each first fine-tuned on a different dataset from the post titles in order to try to improve the accuracy and precision of the model’s classification abilities through the introduction of relevant yet longer, more detailed, and more information-rich information. Additionally, the performances of these models are compared to that of the Heterogeneous Graph Attention Network (HGAT). The novel contributions of this paper are a real-world-use comparison between HGAT and BERT, the production of a novel dataset, and the presentation of supporting evidence for the value of relevance and length of text in pretraining for short-text classification.","PeriodicalId":441475,"journal":{"name":"2022 IEEE 16th International Conference on Application of Information and Communication Technologies (AICT)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE 16th International Conference on Application of Information and Communication Technologies (AICT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AICT55583.2022.10013532","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
One of the great advantages of machine learning is its ability to help a human digest extremely large sets of data and to surface useful information that would otherwise have been significantly more difficult to piece together; improvements to ML models often translate directly into improvements in this ability. To that end, this paper evaluates the relative performance of different versions of Bidirectional Encoder Representations from Transformers (BERT) on the task of classifying a dataset of post titles, scraped from two UNIX-related Q&A forum websites, into classes according to which command each post is most likely about. Each version of BERT was first fine-tuned on a dataset distinct from the post titles, with the aim of improving the accuracy and precision of the model's classifications by introducing relevant text that is longer, more detailed, and richer in information. The performance of these models is also compared to that of the Heterogeneous Graph Attention Network (HGAT). The novel contributions of this paper are a real-world-use comparison between HGAT and BERT, the production of a novel dataset, and supporting evidence for the value of relevance and length of text in pretraining for short-text classification.
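The abstract does not spell out the training recipe, so the following is only a minimal, hypothetical sketch (in Python with the Hugging Face Transformers library) of the kind of two-stage procedure it describes: a first round of fine-tuning BERT on longer, related UNIX text, followed by classification fine-tuning on the short post titles. The model name, placeholder corpora, the masked-language-modeling objective for the first stage, the label set, checkpoint path, and hyperparameters are all illustrative assumptions, not the authors' settings.

```python
# Sketch only: two-stage fine-tuning of BERT, roughly as described in the abstract.
# Stage 1 continues masked-language-model training on longer, related UNIX text;
# stage 2 fine-tunes the resulting encoder to classify short post titles by command.
import torch
from torch.utils.data import DataLoader
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          AutoModelForSequenceClassification,
                          DataCollatorForLanguageModeling)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# ---- Stage 1: additional fine-tuning on longer, information-rich related text ----
related_texts = [  # placeholder examples standing in for the longer related dataset
    "tar combines many files into one archive and can compress it with gzip",
    "grep searches input files for lines matching a regular expression",
]
mlm_model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)
encodings = [tokenizer(t, truncation=True, max_length=128) for t in related_texts]
mlm_loader = DataLoader(encodings, batch_size=2, collate_fn=collator)
optimizer = torch.optim.AdamW(mlm_model.parameters(), lr=5e-5)
mlm_model.train()
for _ in range(1):                          # single epoch, for illustration only
    for batch in mlm_loader:
        loss = mlm_model(**batch).loss      # masked-LM loss on the related text
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
mlm_model.save_pretrained("bert-unix-domain")   # hypothetical checkpoint name

# ---- Stage 2: classification fine-tuning on the short post titles ----
titles = ["How do I extract a tar.gz file?", "grep for a pattern in all subdirectories"]
labels = [0, 1]                             # e.g. 0 = tar, 1 = grep (illustrative classes)
clf_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-unix-domain", num_labels=2)       # reuse the domain-adapted encoder
enc = tokenizer(titles, truncation=True, padding=True, return_tensors="pt")
optimizer = torch.optim.AdamW(clf_model.parameters(), lr=2e-5)
clf_model.train()
for _ in range(1):
    out = clf_model(**enc, labels=torch.tensor(labels))
    out.loss.backward()                     # cross-entropy over command classes
    optimizer.step()
    optimizer.zero_grad()
```

In this sketch the intermediate checkpoint "bert-unix-domain" is what distinguishes the domain-adapted variants from a plain "bert-base-uncased" baseline; comparing classification metrics between the two stages' outputs mirrors the kind of evaluation the abstract describes.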