Deriving semantic validation rules from industrial standards: An OPC UA study

IF 2.9 · Zone 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Semantic Web Pub Date : 2023-06-19 DOI:10.3233/sw-233342

Yashoda Saisree Bareedu, Thomas Frühwirth, C. Niedermeier, M. Sabou, Gernot Steindl, Aparna Saisree Thuluva, Stefani Tsaneva, Nilay Tufek Ozkaya


Citations: 2

Abstract

Industrial standards provide guidelines for data modeling to ensure interoperability between stakeholders of an industry branch (e.g., robotics). Most often, such guidelines are provided in an unstructured format (e.g., PDF documents), which hampers automated validation of whether information objects (e.g., data models) that rely on such standards comply with the modeling constraints the guidelines prescribe. This raises the risk of costly interoperability errors caused by incorrect use of the standards. There is, therefore, increased interest in automatic semantic validation of information objects based on industrial standards. In this paper we focus on an approach to semantic validation that formally represents the modeling constraints from unstructured documents as explicit, machine-actionable rules (to be then used for semantic validation) and (semi-)automatically extracts such rules from PDF documents. While our approach aims to be generically applicable, we exemplify an adaptation of the approach in the concrete context of the OPC UA industrial standard, given its large-scale adoption among important industrial stakeholders and the OPC UA internal efforts towards semantic validation. We conclude that (i) it is feasible to represent modeling constraints from the standard specifications as rules, which can be organized in a taxonomy and represented using Semantic Web technologies such as OWL and SPARQL; (ii) we could automatically identify modeling constraints in the specification documents by inspecting the tables (P = 87%) and text of these documents (F1 up to 94%); (iii) the translation of the modeling constraints into formal rules could be fully automated when constraints were extracted from tables, but required a human-in-the-loop approach for constraints extracted from text.
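To make the idea of a machine-actionable validation rule concrete, the following is a minimal sketch and not the authors' implementation. The paper represents rules with OWL and SPARQL; this sketch swaps in plain Python sets so that it runs without dependencies. The predicate names (`hasTypeDefinition`, `hasComponent`), the `MotorType` example, and the rule shape are all hypothetical, loosely inspired by the kind of "mandatory component" constraint a specification table might state.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement of a toy data model (subject, predicate, object)."""
    subject: str
    predicate: str
    obj: str


class MandatoryComponentRule(NamedTuple):
    """Hypothetical rule: every instance of `type_name` must have `component`."""
    type_name: str
    component: str


def violations(triples: set[Triple], rule: MandatoryComponentRule) -> list[str]:
    """Return the sorted names of instances that break the rule."""
    # Instances declared to be of the constrained type.
    instances = {
        t.subject
        for t in triples
        if t.predicate == "hasTypeDefinition" and t.obj == rule.type_name
    }
    # Subjects that carry the required component.
    compliant = {
        t.subject
        for t in triples
        if t.predicate == "hasComponent" and t.obj == rule.component
    }
    return sorted(instances - compliant)


# Illustrative data model: Motor2 lacks the assumed mandatory "Speed" component.
MODEL = {
    Triple("Motor1", "hasTypeDefinition", "MotorType"),
    Triple("Motor1", "hasComponent", "Speed"),
    Triple("Motor2", "hasTypeDefinition", "MotorType"),
}
RULE = MandatoryComponentRule(type_name="MotorType", component="Speed")

if __name__ == "__main__":
    print(violations(MODEL, RULE))
```

In a SPARQL-based setting, the same check would typically be phrased as a query that selects instances of the type for which no matching component pattern exists; the set difference above plays that role here.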


Source journal

Semantic Web (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, INFORMATION SYSTEMS)

CiteScore

8.30

Self-citation rate

6.70%

Publication volume

Journal description: The journal Semantic Web – Interoperability, Usability, Applicability brings together researchers from various fields which share the vision and need for more effective and meaningful ways to share information across agents and services on the future internet and elsewhere. As such, Semantic Web technologies shall support the seamless integration of data, on-the-fly composition and interoperation of Web services, as well as more intuitive search engines. The semantics – or meaning – of information, however, cannot be defined without a context, which makes personalization, trust, and provenance core topics for Semantic Web research. New retrieval paradigms, user interfaces, and visualization techniques have to unleash the power of the Semantic Web and at the same time hide its complexity from the user. Based on this vision, the journal welcomes contributions ranging from theoretical and foundational research over methods and tools to descriptions of concrete ontologies and applications in all areas. We especially welcome papers which add a social, spatial, and temporal dimension to Semantic Web research, as well as application-oriented papers making use of formal semantics.