通过聚类改进需求术语表的构建:方法和行业案例研究

Chetan Arora, M. Sabetzadeh, L. Briand, Frank Zimmer
{"title":"通过聚类改进需求术语表的构建:方法和行业案例研究","authors":"Chetan Arora, M. Sabetzadeh, L. Briand, Frank Zimmer","doi":"10.1145/2652524.2652530","DOIUrl":null,"url":null,"abstract":"Context. A glossary is an important part of any software requirements document. By making explicit the technical terms in a domain and providing definitions for them, a glossary serves as a helpful tool for mitigating ambiguities.\n Goal. A necessary step for building a glossary is to decide upon the glossary terms and to identify their related terms. Doing so manually is a laborious task. Our objective is to provide automated support for identifying candidate glossary terms and their related terms. Our work differs from existing work on term extraction mainly in that, instead of providing a flat list of candidate terms, our approach clusters the terms by relevance.\n Method. We use case study research as the basis for our empirical investigation.\n Results. We present an automated approach for identifying and clustering candidate glossary terms. We evaluate the approach through two industrial case studies; one study concerns a satellite software component, and the other -- an evidence management tool for safety certification.\n Conclusions. Our results indicate that over requirements documents: (1) our approach is more accurate than other existing methods for identifying candidate glossary terms; this makes it less likely that our approach will miss important glossary terms. (2) Clustering provides an effective basis for grouping related terms; this makes clustering a useful support tool for selection of glossary terms and associating these terms with their related terms.","PeriodicalId":124452,"journal":{"name":"International Symposium on Empirical Software Engineering and Measurement","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"19","resultStr":"{\"title\":\"Improving requirements glossary construction via clustering: approach and industrial case studies\",\"authors\":\"Chetan Arora, M. Sabetzadeh, L. Briand, Frank Zimmer\",\"doi\":\"10.1145/2652524.2652530\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Context. A glossary is an important part of any software requirements document. By making explicit the technical terms in a domain and providing definitions for them, a glossary serves as a helpful tool for mitigating ambiguities.\\n Goal. A necessary step for building a glossary is to decide upon the glossary terms and to identify their related terms. Doing so manually is a laborious task. Our objective is to provide automated support for identifying candidate glossary terms and their related terms. Our work differs from existing work on term extraction mainly in that, instead of providing a flat list of candidate terms, our approach clusters the terms by relevance.\\n Method. We use case study research as the basis for our empirical investigation.\\n Results. We present an automated approach for identifying and clustering candidate glossary terms. We evaluate the approach through two industrial case studies; one study concerns a satellite software component, and the other -- an evidence management tool for safety certification.\\n Conclusions. Our results indicate that over requirements documents: (1) our approach is more accurate than other existing methods for identifying candidate glossary terms; this makes it less likely that our approach will miss important glossary terms. (2) Clustering provides an effective basis for grouping related terms; this makes clustering a useful support tool for selection of glossary terms and associating these terms with their related terms.\",\"PeriodicalId\":124452,\"journal\":{\"name\":\"International Symposium on Empirical Software Engineering and Measurement\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-09-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"19\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Symposium on Empirical Software Engineering and Measurement\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2652524.2652530\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Symposium on Empirical Software Engineering and Measurement","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2652524.2652530","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 19

摘要

上下文。术语表是任何软件需求文档的重要组成部分。通过显式地表示领域中的技术术语并为其提供定义,术语表可以作为减轻歧义的有用工具。的目标。构建术语表的一个必要步骤是确定术语表术语并识别与之相关的术语。手动这样做是一项费力的任务。我们的目标是为识别候选词汇表术语及其相关术语提供自动化支持。我们的工作与现有的术语提取工作的不同之处在于,我们的方法不是提供一个候选术语的平面列表,而是根据相关性对术语进行聚类。方法。我们使用案例研究作为实证调查的基础。结果。我们提出了一种自动识别和聚类候选术语表术语的方法。我们通过两个工业案例研究来评估这种方法;一项研究涉及卫星软件组件,另一项研究涉及安全认证的证据管理工具。结论。我们的结果表明,在需求文档中:(1)我们的方法在识别候选术语表术语方面比其他现有方法更准确;这使得我们的方法不太可能遗漏重要的术语表。(2)聚类为相关术语的分组提供了有效的依据;这使得聚类成为选择术语表术语并将这些术语与相关术语关联起来的有用支持工具。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Improving requirements glossary construction via clustering: approach and industrial case studies
Context. A glossary is an important part of any software requirements document. By making explicit the technical terms in a domain and providing definitions for them, a glossary serves as a helpful tool for mitigating ambiguities. Goal. A necessary step for building a glossary is to decide upon the glossary terms and to identify their related terms. Doing so manually is a laborious task. Our objective is to provide automated support for identifying candidate glossary terms and their related terms. Our work differs from existing work on term extraction mainly in that, instead of providing a flat list of candidate terms, our approach clusters the terms by relevance. Method. We use case study research as the basis for our empirical investigation. Results. We present an automated approach for identifying and clustering candidate glossary terms. We evaluate the approach through two industrial case studies; one study concerns a satellite software component, and the other -- an evidence management tool for safety certification. Conclusions. Our results indicate that over requirements documents: (1) our approach is more accurate than other existing methods for identifying candidate glossary terms; this makes it less likely that our approach will miss important glossary terms. (2) Clustering provides an effective basis for grouping related terms; this makes clustering a useful support tool for selection of glossary terms and associating these terms with their related terms.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信