Mathias Landhäußer, Sven J. Körner, W. Tichy, Jan Keim, J. Krisch
2015 IEEE Second International Workshop on Artificial Intelligence for Requirements Engineering (AIRE), published 2015-08-24. DOI: 10.1109/AIRE.2015.7337623
DeNom: a tool to find problematic nominalizations using NLP
Nominalizations in natural language requirements specifications can lead to imprecision. For example, in the phrase "transportation of pallets" it is unclear who transports the pallets, from where to where, and how. Guidelines for requirements specifications therefore recommend avoiding nominalizations. However, not all nominalizations are problematic. We present an industrial-strength text analysis tool called DeNom, which detects problematic nominalizations and reports them to the user for reformulation. DeNom uses Stanford's parser and the Cyc ontology. It classifies nominalizations as problematic or acceptable by first detecting all nominalizations in the specification and then subtracting those that are sufficiently specified within the sentence through word references, attributes, nominal phrase constructions, etc. All remaining nominalizations are incompletely specified and are therefore prone to conceal complex processes. These nominalizations are deemed problematic. A thorough evaluation used 10 real-world requirements specifications from Daimler AG totaling 60,000 words. DeNom identified over 1,100 nominalizations and classified 129 of them as problematic. Only 45 of these were false positives, resulting in a precision of 66%. Recall was 88%. In contrast, a naive nominalization detector would overload the user with 1,100 warnings, a thousand of which would be false positives.
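The two-stage idea described in the abstract — detect every nominalization, then subtract those that are sufficiently specified in the sentence — can be sketched roughly as follows. This is a minimal illustration only: the suffix list and the "agent named via 'by'" cue are hypothetical stand-ins for DeNom's actual Stanford-parser and Cyc-based analysis.

```python
# Toy two-stage sketch: flag every candidate nominalization, then
# discard those that are sufficiently specified in context.
# The suffix heuristic and context cue are illustrative assumptions,
# NOT DeNom's real implementation (which uses Stanford's parser and Cyc).

NOMINAL_SUFFIXES = ("tion", "ment", "ance", "ence")

def find_nominalizations(sentence: str) -> list[str]:
    """Stage 1: detect candidate nominalizations by suffix."""
    words = [w.strip(".,;").lower() for w in sentence.split()]
    return [w for w in words if w.endswith(NOMINAL_SUFFIXES) and len(w) > 6]

def is_sufficiently_specified(word: str, sentence: str) -> bool:
    """Stage 2: treat a nominalization as acceptable if the sentence
    names an agent for it (a crude stand-in for the word references,
    attributes, and nominal-phrase constructions the paper mentions)."""
    tokens = sentence.lower().split()
    if word not in tokens:
        return False
    i = tokens.index(word)
    # e.g. "transportation of pallets by the driver" names an agent
    return "by" in tokens[i:]

def problematic_nominalizations(sentence: str) -> list[str]:
    """Nominalizations that remain after subtracting specified ones."""
    return [w for w in find_nominalizations(sentence)
            if not is_sufficiently_specified(w, sentence)]

print(problematic_nominalizations(
    "The transportation of pallets must be logged."))
# → ['transportation'] — no agent is named, so the phrase is flagged
```

A sentence such as "The transportation of pallets by the driver must be logged." would pass unflagged in this sketch, mirroring how DeNom suppresses warnings for nominalizations that the sentence already specifies — the source of its precision advantage over a naive detector.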