A multi-level semantic web for hard-to-specify domain concept, Pedestrian, in ML-based software
Authors: Hamed Barzamini, Murtuza Shahzad, Hamed Alhoori, Mona Rahimi
Journal: Requirements Engineering (journal article, published 2022-01-08)
DOI: https://doi.org/10.1007/s00766-021-00366-0
Machine Learning (ML) algorithms are widely used in building software-intensive systems, including safety-critical ones. Unlike traditional software components, Machine-Learned Components (MLCs), software components built using ML algorithms, learn their specifications by generalizing the common features that they find in a limited set of collected examples. While this inductive nature overcomes the limitations of programming hard-to-specify concepts, the same feature becomes problematic for verifying safety in ML-based software systems. One reason is that, due to the data-driven nature of MLCs, there is often no set of explicitly written, pre-defined specifications against which the MLC can be verified. In this regard, we propose to partially specify the hard-to-specify domain concepts that MLCs tend to classify, instead of relying fully on their ability to learn inductively from arbitrarily collected datasets. In this paper, we propose a semi-automated approach to construct a multi-level semantic web that partially outlines the hard-to-specify, yet crucial, domain concept “pedestrian” in the automotive domain. We evaluate the applicability of the generated semantic web in two ways: first, with reference to the web, we augment a pedestrian dataset for a missing feature, wheelchair, to show that training a state-of-the-art ML-based object detector on the augmented dataset improves its accuracy in detecting pedestrians; second, we evaluate the coverage of the generated semantic web based on multiple state-of-the-art pedestrian and human datasets.
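As a rough illustration of the idea, and not the authors' implementation, the sketch below stores a small multi-level web as a mapping from each concept to its sub-concepts and checks which concepts a dataset's label set is missing. The concept entries, the function names `concepts_by_level` and `coverage`, and the definition of coverage as "fraction of web concepts found among the dataset labels" are all assumptions made for this example; the abstract does not define how coverage is computed.

```python
from collections import deque

# Hypothetical multi-level web: each concept maps to its related sub-concepts.
# The entries are illustrative only and are not taken from the paper.
SEMANTIC_WEB = {
    "pedestrian": ["person", "mobility aid", "clothing"],
    "person": ["adult", "child"],
    "mobility aid": ["wheelchair", "crutches"],
    "clothing": ["coat", "hat"],
}


def concepts_by_level(web, root):
    """Breadth-first walk returning {concept: level}, with the root at level 0."""
    levels = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in web.get(node, []):
            if child not in levels:
                levels[child] = levels[node] + 1
                queue.append(child)
    return levels


def coverage(web, root, dataset_labels):
    """Return (fraction of web concepts present in the labels, missing concepts).

    Assumed definition: a concept counts as covered when a dataset label
    matches it exactly, ignoring case.
    """
    concepts = set(concepts_by_level(web, root))
    labels = {label.lower() for label in dataset_labels}
    missing = sorted(concepts - labels)
    return len(concepts & labels) / len(concepts), missing


if __name__ == "__main__":
    # Hypothetical label set standing in for a pedestrian dataset.
    ratio, missing = coverage(SEMANTIC_WEB, "pedestrian", ["Person", "Child", "Hat", "Bicycle"])
    print(f"coverage: {ratio:.2f}")
    print("missing concepts:", missing)
```

Under these assumptions, a concept such as "wheelchair" that appears in the web but not among the labels shows up in the missing list, which is the kind of gap the dataset augmentation described above targets.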
About the journal:
The journal provides a focus for the dissemination of new results about the elicitation, representation and validation of requirements of software-intensive information systems or applications. Theoretical and applied submissions are welcome, but all papers must explicitly address:
- the practical consequences of the ideas for the design of complex systems
- how the ideas should be evaluated by the reflective practitioner
The journal is motivated by a multi-disciplinary view that considers requirements not only in terms of the specification of software components but also in terms of the activities for their elicitation, representation and agreement, carried out within an organisational and social context. To this end, contributions are sought from fields such as software engineering, information systems, occupational sociology, cognitive and organisational psychology, human-computer interaction, computer-supported cooperative work, linguistics and philosophy, for work that specifically addresses requirements engineering issues.