Improving Impact and Dependency Analysis through Software Categorization Methods
Authors: Egbeyong E. Tanjong, D. Carver
Venue: 2021 9th International Conference in Software Engineering Research and Innovation (CONISOFT)
Published: 2021-10-01
DOI: https://doi.org/10.1109/CONISOFT52520.2021.00029
Software requirements specifications serve as instructions for any software development engagement. These instructions are mostly written in natural language for ease of manual analysis and comprehension. Because natural language is inherently ambiguous, software requirements analysis plays a pivotal role in enhancing clarity during the software development life cycle. Of the several methods of software requirements analysis, we focus on those that categorize requirements. We compare the performance of three common techniques for categorizing software requirements documents on three different datasets. The techniques are two bag-of-words models, count vectorization and term frequency-inverse document frequency (TF-IDF), and a word embeddings technique. We report the similarity of the resulting categories, using cosine similarity to measure the similarity between the requirements vectors produced by each method. Syntactic techniques outperformed semantic techniques on some datasets. These results suggest that, for some requirements categorization tasks, syntactic techniques produce categories comparable to those of semantic techniques.
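As a rough illustration of the two syntactic representations named in the abstract, the sketch below builds count vectors and TF-IDF vectors for a few requirement sentences and compares them with cosine similarity. It is a minimal sketch under stated assumptions, not the authors' pipeline: the three requirement sentences, the whitespace tokenizer, the smoothed IDF formula, and all function names are hypothetical choices made here. The word embeddings technique is left out because it depends on a pretrained model that the abstract does not identify.

```python
import math
from collections import Counter

# Hypothetical toy requirements, invented for illustration only;
# they are not taken from the paper's datasets.
REQUIREMENTS = [
    "the system shall encrypt user passwords",
    "the system shall encrypt stored user data",
    "the report page shall load within two seconds",
]


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()


def count_vectors(docs):
    """Return (vocabulary, vectors) where each vector holds raw term counts."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) weighted by term frequency times IDF.

    Assumes a smoothed IDF, ln((1 + N) / (1 + df)) + 1; other variants exist.
    """
    vocab, counts = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for vec in counts if vec[i] > 0) for i in range(len(vocab))]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1.0 for d in df]
    return vocab, [[c * w for c, w in zip(vec, idf)] for vec in counts]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """Pairwise cosine similarities between all requirement vectors."""
    return [[cosine_similarity(a, b) for b in vectors] for a in vectors]


if __name__ == "__main__":
    for name, build in (("count", count_vectors), ("tf-idf", tfidf_vectors)):
        _, vecs = build(REQUIREMENTS)
        print(name)
        for row in similarity_matrix(vecs):
            print("  " + "  ".join(f"{value:.3f}" for value in row))
```

In this toy data the two encryption requirements come out more similar to each other than either is to the performance requirement under both representations. TF-IDF also lowers the similarity that comes only from words shared by every requirement, such as "the" and "shall".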