Elisavet Koutsiana, Gabriel Maia Rocha Amaral, Neal Reeves, Albert Meroño-Peñuela, Elena Simperl
{"title":"An analysis of discussions in collaborative knowledge engineering through the lens of Wikidata","authors":"Elisavet Koutsiana, Gabriel Maia Rocha Amaral, Neal Reeves, Albert Meroño-Peñuela, Elena Simperl","doi":"10.1016/j.websem.2023.100799","DOIUrl":null,"url":null,"abstract":"<div><p>We study <em>discussions</em><span> in Wikidata, the world’s largest open-source collaborative knowledge graph (KG). This is important because it helps KG community managers understand how discussions are used and inform the design of collaborative practices and support tools. We follow a mixed-methods approach with descriptive statistics, thematic analysis, and statistical tests to investigate how much discussions in Wikidata are used, what they are used for, and how they support knowledge engineering (KE) activities. The study covers three core sources of discussion, the talk pages that accompany Wikidata items and properties, and a general-purpose communication page. Our findings show low use of discussion capabilities and a power-law distribution similar to other KE projects such as Schema.org. When discussions are used, they are mostly about KE activities, including activities that span across the entire KE lifecycle from conceptualisation and implementation to maintenance and taxonomy building. We hope that the findings will help Wikidata devise improved practices and capabilities to encourage the use of discussions as a tool to collaborate, improve editor engagement, and engineer better KGs.</span></p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"78 ","pages":"Article 100799"},"PeriodicalIF":2.1000,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Web Semantics","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1570826823000288","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
We study discussions in Wikidata, the world’s largest open-source collaborative knowledge graph (KG). This is important because it helps KG community managers understand how discussions are used and inform the design of collaborative practices and support tools. We follow a mixed-methods approach with descriptive statistics, thematic analysis, and statistical tests to investigate how much discussions in Wikidata are used, what they are used for, and how they support knowledge engineering (KE) activities. The study covers three core sources of discussion, the talk pages that accompany Wikidata items and properties, and a general-purpose communication page. Our findings show low use of discussion capabilities and a power-law distribution similar to other KE projects such as Schema.org. When discussions are used, they are mostly about KE activities, including activities that span across the entire KE lifecycle from conceptualisation and implementation to maintenance and taxonomy building. We hope that the findings will help Wikidata devise improved practices and capabilities to encourage the use of discussions as a tool to collaborate, improve editor engagement, and engineer better KGs.
期刊介绍:
The Journal of Web Semantics is an interdisciplinary journal based on research and applications of various subject areas that contribute to the development of a knowledge-intensive and intelligent service Web. These areas include: knowledge technologies, ontology, agents, databases and the semantic grid, obviously disciplines like information retrieval, language technology, human-computer interaction and knowledge discovery are of major relevance as well. All aspects of the Semantic Web development are covered. The publication of large-scale experiments and their analysis is also encouraged to clearly illustrate scenarios and methods that introduce semantics into existing Web interfaces, contents and services. The journal emphasizes the publication of papers that combine theories, methods and experiments from different subject areas in order to deliver innovative semantic methods and applications.