Improving requirements glossary construction via clustering: approach and industrial case studies

International Symposium on Empirical Software Engineering and Measurement. Pub Date: 2014-09-18. DOI: 10.1145/2652524.2652530

Chetan Arora, M. Sabetzadeh, L. Briand, Frank Zimmer


Citations: 19

Abstract



Context. A glossary is an important part of any software requirements document. By making explicit the technical terms in a domain and providing definitions for them, a glossary serves as a helpful tool for mitigating ambiguities.

Goal. A necessary step for building a glossary is to decide upon the glossary terms and to identify their related terms. Doing so manually is a laborious task. Our objective is to provide automated support for identifying candidate glossary terms and their related terms. Our work differs from existing work on term extraction mainly in that, instead of providing a flat list of candidate terms, our approach clusters the terms by relevance.

Method. We use case study research as the basis for our empirical investigation.

Results. We present an automated approach for identifying and clustering candidate glossary terms. We evaluate the approach through two industrial case studies: one concerns a satellite software component, and the other an evidence management tool for safety certification.

Conclusions. Our results indicate that over requirements documents: (1) our approach is more accurate than other existing methods for identifying candidate glossary terms; this makes it less likely that our approach will miss important glossary terms. (2) Clustering provides an effective basis for grouping related terms; this makes clustering a useful support tool for selection of glossary terms and associating these terms with their related terms.
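
The contrast the abstract draws between a flat list of candidate terms and terms clustered by relevance can be illustrated with a small sketch. This is not the authors' method, which the abstract does not detail. The similarity measure (token-overlap Jaccard), the single-link grouping, the 0.3 threshold, the function names, and the sample terms are all assumptions chosen purely for illustration.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two terms (an assumed measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def cluster_terms(terms: list[str], threshold: float = 0.3) -> list[list[str]]:
    """Group terms so that any pair reaching the threshold shares a cluster."""
    parent = {t: t for t in terms}

    def find(t: str) -> str:
        # Follow parent links to the cluster representative.
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    # Single-link merging: one sufficiently similar pair joins two clusters.
    for a, b in combinations(terms, 2):
        if jaccard(a, b) >= threshold:
            parent[find(a)] = find(b)

    clusters: dict[str, list[str]] = {}
    for t in terms:
        clusters.setdefault(find(t), []).append(t)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical candidate terms, not taken from the case studies.
candidate_terms = [
    "ground station",
    "ground station interface",
    "telemetry packet",
    "telemetry frame",
    "safety evidence",
]
clusters = cluster_terms(candidate_terms)
for group in clusters:
    print(group)
```

With these sample terms, the two "ground station" terms end up together, the two "telemetry" terms end up together, and "safety evidence" stays alone. An analyst reviewing such groups sees related terms side by side rather than scattered through a flat list.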

