A multi-level semantic web for hard-to-specify domain concept, Pedestrian, in ML-based software

IF 2.1, Zone 3 (Computer Science), Q3 (COMPUTER SCIENCE, INFORMATION SYSTEMS)
Hamed Barzamini, Murtuza Shahzad, Hamed Alhoori, Mona Rahimi
Requirements Engineering. Published 2022-01-08. DOI: 10.1007/s00766-021-00366-0 (https://doi.org/10.1007/s00766-021-00366-0)
Citations: 1

Abstract

Machine Learning (ML) algorithms are widely used in building software-intensive systems, including safety-critical ones. Unlike traditional software components, Machine-Learned Components (MLCs), software components built using ML algorithms, learn their specifications by generalizing the common features that they find in a limited set of collected examples. While this inductive nature overcomes the limitations of programming hard-to-specify concepts, the same feature becomes problematic for verifying safety in ML-based software systems. One reason is that, due to the data-driven nature of MLCs, there is often no set of explicitly written and pre-defined specifications against which the MLC can be verified. In this regard, we propose to partially specify the hard-to-specify domain concepts that MLCs tend to classify, rather than relying fully on their ability to learn inductively from arbitrarily collected datasets. In this paper, we propose a semi-automated approach to construct a multi-level semantic web that partially outlines the hard-to-specify, yet crucial, domain concept “pedestrian” in the automotive domain. We evaluate the applicability of the generated semantic web in two ways: first, with reference to the web, we augment a pedestrian dataset for a missing feature, wheelchair, to show that training a state-of-the-art ML-based object detector on the augmented dataset improves its accuracy in detecting pedestrians; second, we evaluate the coverage of the generated semantic web based on multiple state-of-the-art pedestrian and human datasets.
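
The abstract does not give implementation details, so the following is only a minimal sketch of how a multi-level semantic web and the two evaluations might be modeled. The concept names, the `SEMANTIC_WEB` dictionary, and the helper functions are all hypothetical, and the coverage measure shown (the fraction of dataset labels found in the web) is one possible reading, not necessarily the authors' definition.

```python
from collections import deque

# Hypothetical, hand-written fragment of a multi-level semantic web.
# Each key is a concept; each value lists related concepts one level further out.
SEMANTIC_WEB = {
    "pedestrian": ["person", "wheelchair", "stroller"],
    "person": ["adult", "child"],
    "wheelchair": ["wheelchair user"],
}


def concepts_by_level(web, root):
    """Breadth-first walk returning {concept: level}, with the root at level 0."""
    levels = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in web.get(node, []):
            if child not in levels:
                levels[child] = levels[node] + 1
                queue.append(child)
    return levels


def missing_concepts(web, root, dataset_labels):
    """Concepts in the web that no dataset label mentions (candidates for augmentation)."""
    labels = {label.lower() for label in dataset_labels}
    return sorted(c for c in concepts_by_level(web, root) if c not in labels)


def coverage(web, root, dataset_labels):
    """Fraction of distinct dataset labels that appear somewhere in the web (assumed metric)."""
    concepts = set(concepts_by_level(web, root))
    labels = {label.lower() for label in dataset_labels}
    if not labels:
        return 0.0
    return len(labels & concepts) / len(labels)
```

Under these assumptions, calling `missing_concepts(SEMANTIC_WEB, "pedestrian", labels)` on a dataset whose labels never mention wheelchairs would flag `wheelchair` as a gap, which mirrors the augmentation scenario described in the abstract.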

Source journal
Requirements Engineering (Engineering & Technology, Computer Science: Software Engineering)
CiteScore: 7.10
Self-citation rate: 10.70%
Articles published: 27
Review time: >12 weeks
Journal description: The journal provides a focus for the dissemination of new results about the elicitation, representation and validation of requirements of software-intensive information systems or applications. Theoretical and applied submissions are welcome, but all papers must explicitly address (1) the practical consequences of the ideas for the design of complex systems and (2) how the ideas should be evaluated by the reflective practitioner. The journal is motivated by a multi-disciplinary view that considers requirements not only in terms of software component specification but also in terms of the activities for their elicitation, representation and agreement, carried out within an organisational and social context. To this end, contributions are sought from fields such as software engineering, information systems, occupational sociology, cognitive and organisational psychology, human-computer interaction, computer-supported cooperative work, linguistics and philosophy for work specifically addressing requirements engineering issues.