Title: Security Requirements Classification into Groups Using NLP Transformers
Authors: V. Varenov, Aydar Gabdrahmanov
Published in: 2021 IEEE 29th International Requirements Engineering Conference Workshops (REW)
Publication date: 2021-09-01
DOI: 10.1109/REW53955.2021.9714713
URL: https://doi.org/10.1109/REW53955.2021.9714713
Citations: 4
Abstract
This study presents an implementation of sentence-level classification of security requirements into predefined groups. The method applies three transformer models (BERT, XLNet, and DistilBERT) to the classification task and evaluates them using precision, recall, and F1-score. We compiled a new dataset of 1086 security requirements in 7 classes, collected from multiple existing datasets such as PURE, SecReq, and Riaz's dataset. The best result is DistilBERT’s 78% F1-score on the multiclass classification task. The main contributions of this study are the new multiclass dataset of security requirements and an example of how a deep transformer model can be used for requirements elicitation, which can serve as a basis for further improvement.
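To make the reported metrics concrete, the following is a minimal sketch of how per-class precision, recall, and F1-score can be computed for a multiclass task and averaged into a single score. It is an illustration only, not the authors' evaluation code: the abstract does not state which averaging scheme produced the 78% figure, so macro averaging is an assumption here, and the class names and predictions are hypothetical.

```python
from collections import Counter


def per_class_prf(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that occurs."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = per_class_prf(y_true, y_pred)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Hypothetical labels; the abstract does not name the 7 classes.
y_true = ["access_control", "access_control", "confidentiality", "integrity"]
y_pred = ["access_control", "confidentiality", "confidentiality", "integrity"]

scores = per_class_prf(y_true, y_pred)
overall = macro_f1(y_true, y_pred)
```

Macro averaging weights every class equally regardless of how many requirements it contains, which matters when the classes of a dataset are imbalanced; a support-weighted average would be an equally plausible choice.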