Christian Chiarcos, Purificação Silvano, Mariana Damova, Giedre Valunaite Oleškeviciene, Chaya Liebeskind, Dimitar Trajanov, Ciprian-Octavian Truică, Elena-Simona Apostol, Anna Baczkowska
Izrada OWL ontologije za prikaz, povezivanje i pretraživanje SemAF diskursnih oznaka (English: Building an OWL ontology for representing, linking, and querying SemAF discourse annotations)
Rasprave (Journal Article), published 2023-01-01. DOI: https://doi.org/10.31724/rihjj.49.1.6
Abstract
Linguistic Linked Open Data (LLOD) technologies provide a powerful instrument for representing and interpreting language phenomena at web scale. The main objective of this paper is to demonstrate how LLOD technologies can be applied to represent and annotate a corpus of multiword discourse markers, and what the effects of doing so are. In particular, we aim to apply Semantic Web standards such as RDF and OWL to publish and integrate data. We present a novel scheme for discourse annotation that combines the ISO standards for discourse relations and dialogue acts, ISO DR-Core (ISO 24617-8) and ISO-Dialogue Acts (ISO 24617-2), in 9 languages (cf. Silvano and Damova 2022; Silvano et al. 2022). We develop an OWL ontology to formalize that scheme, provide a newly annotated dataset, and link its RDF edition with the ontology. We then describe the joint querying of the ontology and the annotations by means of SPARQL, the standard query language for the web of data. As a result, we are able to run queries over multiple interlinked datasets with complex internal structure. This is a first but essential step in developing novel, powerful, and groundbreaking means for the corpus-based study of multilingual discourse, communication analysis, or attitude discovery.
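To make the idea of querying ontology and annotations together more concrete, the following is a minimal sketch in plain Python rather than the paper's actual SPARQL setup. All `ex:` identifiers, the three marker instances, and the language codes are hypothetical placeholders, not the IRIs or data of the published ontology and corpus. The sketch merges a small class hierarchy with a few typed annotations and retrieves every annotation whose class falls under a given superclass, which is roughly what a SPARQL 1.1 property path such as `?m rdf:type/rdfs:subClassOf* ex:DiscourseRelation` would express.

```python
# Illustrative only: every "ex:" name below is a hypothetical placeholder,
# not an IRI from the published ontology or dataset.
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

# Ontology graph: a tiny class hierarchy (subject, predicate, object).
ontology = {
    ("ex:Concession", SUBCLASS_OF, "ex:DiscourseRelation"),
    ("ex:Cause", SUBCLASS_OF, "ex:DiscourseRelation"),
    ("ex:Inform", SUBCLASS_OF, "ex:DialogueAct"),
}

# Annotation graph: discourse-marker instances typed with ontology classes.
annotations = {
    ("ex:marker1", TYPE, "ex:Concession"),
    ("ex:marker1", "ex:language", "en"),
    ("ex:marker2", TYPE, "ex:Cause"),
    ("ex:marker2", "ex:language", "pt"),
    ("ex:marker3", TYPE, "ex:Inform"),
    ("ex:marker3", "ex:language", "lt"),
}


def subclasses(graph, root):
    """Return root plus every class that reaches it via subClassOf links."""
    found = {root}
    frontier = [root]
    while frontier:
        parent = frontier.pop()
        for s, p, o in graph:
            if p == SUBCLASS_OF and o == parent and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def instances_of(ontology, annotations, root):
    """List (instance, language) pairs whose type is root or a subclass of it."""
    merged = ontology | annotations  # query both graphs as one
    classes = subclasses(merged, root)
    language = {s: o for s, p, o in merged if p == "ex:language"}
    return sorted(
        (s, language.get(s)) for s, p, o in merged if p == TYPE and o in classes
    )


result = instances_of(ontology, annotations, "ex:DiscourseRelation")
print(result)
```

Neither graph alone can answer this query: the annotations only name specific classes, and the hierarchy that groups those classes lives in the ontology, so the two must be merged before the superclass lookup works.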