On the Use of Word Embeddings for Identifying Domain Specific Ambiguities in Requirements
Authors: S. Mishra, Arpit Sharma
Venue: 2019 IEEE 27th International Requirements Engineering Conference Workshops (REW)
Published: 2019-09-01
DOI: 10.1109/REW.2019.00048 (https://doi.org/10.1109/REW.2019.00048)
Citations: 20
Abstract
Software requirements are usually written in natural language. An important quality criterion for each documented requirement is unambiguity: all readers of the requirement must arrive at the same understanding of it. Because the requirements engineer and the other project stakeholders differ in domain expertise, requirements may contain words that allow alternative interpretations. Our objective is to detect domain-specific ambiguous words in natural language text. This paper applies an NLP technique based on word embeddings to detect such words. More specifically, we measure the ambiguity potential of the most frequently used computer science (CS) words when they are used in other application areas or subdomains of engineering, e.g., aerospace, civil, petroleum, biomedical, and environmental engineering. Our experiments with several different subdomains show that word-embedding-based techniques are very effective in identifying domain-specific ambiguities. Our findings also demonstrate that this technique can be applied to documents of varying sizes. Finally, we provide pointers for future research.
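
The abstract does not spell out how the ambiguity potential is computed, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that a word is a candidate for domain-specific ambiguity when its nearest neighbors in a CS corpus differ from its nearest neighbors in another domain's corpus. To stay self-contained, simple co-occurrence count vectors stand in for trained word embeddings; the function names, the Jaccard-based score, and the toy corpora are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Build a sparse context-count vector per word.

    Assumption: these counts are a stand-in for learned word embeddings.
    """
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u if key in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_neighbors(vectors, word, k=3):
    """Return the k words most similar to `word` within one domain."""
    if word not in vectors:
        return set()
    scored = [
        (cosine(vectors[word], vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    # Sort by descending similarity; break ties alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return {other for _, other in scored[:k]}


def ambiguity_score(vectors_a, vectors_b, word, k=3):
    """Hypothetical score: 1 minus the Jaccard overlap of neighbor sets.

    0.0 means the word keeps the same neighbors in both domains;
    1.0 means the neighbor sets are disjoint.
    """
    a = nearest_neighbors(vectors_a, word, k)
    b = nearest_neighbors(vectors_b, word, k)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Toy corpora, invented purely for illustration.
cs_corpus = [
    "the kernel schedules each process on a core",
    "a process owns one or more threads",
]
petroleum_corpus = [
    "the refining process separates crude oil by distillation",
    "the cracking process converts heavy oil into fuel",
]

cs_vectors = cooccurrence_vectors(cs_corpus)
petroleum_vectors = cooccurrence_vectors(petroleum_corpus)
print(ambiguity_score(cs_vectors, petroleum_vectors, "process"))
```

In a realistic setting the count vectors would be replaced by embeddings trained separately on each domain's documents, and the score would be computed for each of the most frequent CS words in turn.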