A Domain-Independent Model for Identifying Security Requirements

2017 IEEE 25th International Requirements Engineering Conference (RE). Pub Date: 2017-09-01. DOI: 10.1109/RE.2017.79

Nuthan Munaiah, Andrew Meneely, Pradeep K. Murukannaiah


Citations: 15

Abstract

Existing work on identifying security requirements relies on training binary classification models using domain-specific data sets to achieve high accuracy. Considering that domain-specific data sets are often not readily available, we propose a domain-independent model for classifying security requirements based on two key ideas. First, we train our model on the descriptions of weaknesses from the Common Weakness Enumeration (CWE) data set. Although CWE does not describe requirements, it describes security weaknesses that are manifestations of unrealized security requirements. Second, we exploit a one-class classification model that relies only on positive samples (descriptions of weaknesses in CWE), eliminating the need for negative samples, which can be nontrivial to collect. We evaluated our model on three industrial requirements documents from different domains. We found that a One-Class Support Vector Machine trained on the domain-independent CWE data set outperforms a model from prior literature by identifying security requirements with an average precision, recall and F-score of 67.35%, 70.48% and 67.68%, respectively. Further, considering data sets from prior literature (consisting of both positive and negative examples), we found that one-class classifiers trained with only positive examples outperformed binary classifiers trained with both positive and negative examples in two out of three evaluation data sets, demonstrating the potential value of one-class classification for security requirements identification.
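The one-class idea in the abstract (train on positive samples only, then evaluate with precision, recall and F-score) can be illustrated with a minimal sketch. This is not the authors' model: it replaces the One-Class Support Vector Machine with a much simpler centroid-similarity rule so that it runs on the Python standard library alone. The class `CentroidOneClass`, the helper names, the weakness sentences and the labelled requirements are all hypothetical, and the sentences are invented for illustration rather than taken from CWE or from the evaluation documents.

```python
import math
import re
from collections import defaultdict


def vectorize(text):
    """Return an L2-normalised bag-of-words vector as a dict."""
    counts = defaultdict(float)
    for token in re.findall(r"[a-z]+", text.lower()):
        counts[token] += 1.0
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


class CentroidOneClass:
    """Toy one-class model (assumption: a stand-in for a one-class SVM).

    It is fitted on positive samples only and accepts any text whose
    cosine similarity to the positive centroid reaches the threshold.
    """

    def fit(self, positives):
        vectors = [vectorize(p) for p in positives]
        centroid = defaultdict(float)
        for vec in vectors:
            for token, weight in vec.items():
                centroid[token] += weight
        norm = math.sqrt(sum(w * w for w in centroid.values()))
        self.centroid = {t: w / norm for t, w in centroid.items()}
        # Loosest threshold that still accepts every training sample.
        self.threshold = min(dot(v, self.centroid) for v in vectors)
        return self

    def predict(self, text):
        return dot(vectorize(text), self.centroid) >= self.threshold


def precision_recall_f(predicted, actual):
    """Precision, recall and F-score for boolean predictions."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Hypothetical weakness descriptions (positive samples only).
weaknesses = [
    "The software does not encrypt sensitive data before storage",
    "The software does not verify the identity of the user before granting access",
    "The software stores the password in plaintext",
]

# Hypothetical requirements with made-up labels (True = security requirement).
requirements = [
    ("The system shall encrypt sensitive data before storage", True),
    ("The system shall verify the identity of the user", True),
    ("Reports print monthly", False),
]

model = CentroidOneClass().fit(weaknesses)
predicted = [model.predict(text) for text, _ in requirements]
actual = [label for _, label in requirements]
print(precision_recall_f(predicted, actual))
```

No negative samples are used during fitting; labels appear only at evaluation time. A closer approximation to the abstract would replace the centroid rule with a one-class SVM, for example scikit-learn's `OneClassSVM` over TF-IDF features, though the abstract does not specify the features or parameters used. As a side observation, the harmonic mean of the reported average precision and recall is about 68.9%, not 67.68%, which suggests (as an inference, not a statement from the source) that the reported F-score is averaged per document.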



