Hidden in plain sight: Automatically identifying security requirements from natural language artifacts

2014 IEEE 22nd International Requirements Engineering Conference (RE). Pub Date: 2014-09-29. DOI: 10.1109/RE.2014.6912260

M. Riaz, J. King, John Slankas, L. Williams


Citations: 71

Abstract

Natural language artifacts, such as requirements specifications, often explicitly state the security requirements for software systems. However, these artifacts may also imply additional security requirements that developers may overlook but should consider to strengthen the overall security of the system. The goal of this research is to aid requirements engineers in producing a more comprehensive and classified set of security requirements by (1) automatically identifying security-relevant sentences in natural language requirements artifacts, and (2) providing context-specific security requirements templates to help translate the security-relevant sentences into functional security requirements. Using machine learning techniques, we have developed a tool-assisted process that takes as input a set of natural language artifacts. Our process automatically identifies security-relevant sentences in the artifacts and classifies them according to the security objectives that the sentences either explicitly state or imply. We classified 10,963 sentences in six different documents from the healthcare domain and extracted the corresponding security objectives. Our manual analysis showed that 46% of the sentences were security-relevant. Of these, 28% explicitly mention security, while 72% are functional requirements with security implications. Using our tool, we correctly predict and classify 82% of the security objectives for all the sentences (precision). We identify 79% of all security objectives implied by the sentences within the documents (recall). Based on our analysis, we develop context-specific templates that can be instantiated into a set of functional security requirements by filling in key information from security-relevant sentences.
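To make the two reported measures concrete, the following is a minimal sketch of how precision and recall can be computed when each sentence may carry several security objectives. It is not the authors' tool. The function name, the sentence IDs, the objective labels, and the choice to pool counts across all sentences (micro-averaging) are assumptions made for illustration, and the paper may aggregate differently.

```python
from typing import Dict, Set, Tuple


def precision_recall(
    predicted: Dict[int, Set[str]],
    actual: Dict[int, Set[str]],
) -> Tuple[float, float]:
    """Pool true/false positives and false negatives over all sentences.

    Keys are hypothetical sentence IDs; values are the sets of security
    objectives assigned to each sentence (empty set = not security-relevant).
    """
    tp = fp = fn = 0
    for sentence_id in predicted.keys() | actual.keys():
        p = predicted.get(sentence_id, set())
        a = actual.get(sentence_id, set())
        tp += len(p & a)   # objectives predicted and truly present
        fp += len(p - a)   # objectives predicted but not present
        fn += len(a - p)   # objectives present but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels, invented for this example only.
actual = {
    1: {"confidentiality", "accountability"},
    2: {"integrity"},
    3: set(),  # sentence with no security objective
}
predicted = {
    1: {"confidentiality"},
    2: {"integrity", "availability"},
    3: set(),
}

precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this reading, precision is the share of predicted objectives that are correct, and recall is the share of objectives actually stated or implied by the sentences that were found.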

