用于策略文档需求分析的自动文本挖掘

Aaron K. Massey, Jacob Eisenstein, A. Antón, Peter P. Swire
{"title":"用于策略文档需求分析的自动文本挖掘","authors":"Aaron K. Massey, Jacob Eisenstein, A. Antón, Peter P. Swire","doi":"10.1109/RE.2013.6636700","DOIUrl":null,"url":null,"abstract":"Businesses and organizations in jurisdictions around the world are required by law to provide their customers and users with information about their business practices in the form of policy documents. Requirements engineers analyze these documents as sources of requirements, but this analysis is a time-consuming and mostly manual process. Moreover, policy documents contain legalese and present readability challenges to requirements engineers seeking to analyze them. In this paper, we perform a large-scale analysis of 2,061 policy documents, including policy documents from the Google Top 1000 most visited websites and the Fortune 500 companies, for three purposes: (1) to assess the readability of these policy documents for requirements engineers; (2) to determine if automated text mining can indicate whether a policy document contains requirements expressed as either privacy protections or vulnerabilities; and (3) to establish the generalizability of prior work in the identification of privacy protections and vulnerabilities from privacy policies to other policy documents. Our results suggest that this requirements analysis technique, developed on a small set of policy documents in two domains, may generalize to other domains.","PeriodicalId":6342,"journal":{"name":"2013 21st IEEE International Requirements Engineering Conference (RE)","volume":"29 1","pages":"4-13"},"PeriodicalIF":0.0000,"publicationDate":"2013-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"75","resultStr":"{\"title\":\"Automated text mining for requirements analysis of policy documents\",\"authors\":\"Aaron K. Massey, Jacob Eisenstein, A. Antón, Peter P. Swire\",\"doi\":\"10.1109/RE.2013.6636700\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Businesses and organizations in jurisdictions around the world are required by law to provide their customers and users with information about their business practices in the form of policy documents. Requirements engineers analyze these documents as sources of requirements, but this analysis is a time-consuming and mostly manual process. Moreover, policy documents contain legalese and present readability challenges to requirements engineers seeking to analyze them. In this paper, we perform a large-scale analysis of 2,061 policy documents, including policy documents from the Google Top 1000 most visited websites and the Fortune 500 companies, for three purposes: (1) to assess the readability of these policy documents for requirements engineers; (2) to determine if automated text mining can indicate whether a policy document contains requirements expressed as either privacy protections or vulnerabilities; and (3) to establish the generalizability of prior work in the identification of privacy protections and vulnerabilities from privacy policies to other policy documents. Our results suggest that this requirements analysis technique, developed on a small set of policy documents in two domains, may generalize to other domains.\",\"PeriodicalId\":6342,\"journal\":{\"name\":\"2013 21st IEEE International Requirements Engineering Conference (RE)\",\"volume\":\"29 1\",\"pages\":\"4-13\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-07-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"75\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 21st IEEE International Requirements Engineering Conference (RE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/RE.2013.6636700\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 21st IEEE International Requirements Engineering Conference (RE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/RE.2013.6636700","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 75

摘要

法律要求世界各地的企业和组织以政策文件的形式向其客户和用户提供有关其业务实践的信息。需求工程师将这些文档作为需求的来源进行分析,但是这种分析是一个耗时且主要是手工的过程。此外,策略文档包含法律术语,并对寻求分析它们的需求工程师提出可读性挑战。在本文中,我们对2061份政策文件进行了大规模的分析,其中包括来自b谷歌访问量最大的1000家网站和财富500强公司的政策文件,目的有三个:(1)评估这些政策文件对需求工程师的可读性;(2)确定自动文本挖掘是否可以指示策略文档是否包含以隐私保护或漏洞表示的需求;(3)建立从隐私政策到其他政策文件识别隐私保护和漏洞的先前工作的通用性。我们的结果表明,这种在两个领域的一小组策略文档上开发的需求分析技术可以推广到其他领域。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Automated text mining for requirements analysis of policy documents
Businesses and organizations in jurisdictions around the world are required by law to provide their customers and users with information about their business practices in the form of policy documents. Requirements engineers analyze these documents as sources of requirements, but this analysis is a time-consuming and mostly manual process. Moreover, policy documents contain legalese and present readability challenges to requirements engineers seeking to analyze them. In this paper, we perform a large-scale analysis of 2,061 policy documents, including policy documents from the Google Top 1000 most visited websites and the Fortune 500 companies, for three purposes: (1) to assess the readability of these policy documents for requirements engineers; (2) to determine if automated text mining can indicate whether a policy document contains requirements expressed as either privacy protections or vulnerabilities; and (3) to establish the generalizability of prior work in the identification of privacy protections and vulnerabilities from privacy policies to other policy documents. Our results suggest that this requirements analysis technique, developed on a small set of policy documents in two domains, may generalize to other domains.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信