Heritage connector: A machine learning framework for building linked open data from museum collections

Kalyan Dutia, John Stack

Journal: Applied AI Letters
Published: 2021-01-06
DOI: 10.22541/au.160994838.81187546/v1 (https://doi.org/10.22541/au.160994838.81187546/v1)
Citations: 7

Abstract

As with almost all data, museum collection catalogues are largely unstructured, variable in consistency and overwhelmingly composed of thin records. The form of these catalogues means that the potential for new forms of research, access and scholarly enquiry that range across multiple collections and related datasets remains dormant. In the project Heritage Connector: Transforming text into data to extract meaning and make connections, we are applying a battery of digital techniques to connect similar, identical and related items within and across collections and other publications. In this paper we describe a framework to create a Linked Open Data knowledge graph (KG) from digital museum catalogues, connect entities within this graph to Wikidata, and create new connections in this graph from text. We focus on the use of machine learning to create these links at scale with a small amount of labelled data, on a mid-range laptop or a small cloud virtual machine. We publish open-source software providing tools to perform the tasks of KG creation, entity matching and named entity recognition under these constraints.
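To make the pipeline described in the abstract more concrete, here is a minimal sketch of two of its steps: turning catalogue records into graph triples, and linking a record's maker to an external entity. It is not the Heritage Connector software or its API. The record fields, the predicate names, the `EXT-…` identifiers (stand-ins for Wikidata IDs) and the helper functions are all hypothetical, and plain string similarity from Python's standard library stands in for the machine-learning entity matcher the paper describes.

```python
from difflib import SequenceMatcher

# Hypothetical catalogue records (illustrative only, not the paper's data).
records = [
    {"id": "obj-1", "title": "Difference Engine No. 2", "maker": "Charles Babbage"},
    {"id": "obj-2", "title": "Portrait of a mathematician", "maker": "Babbage, Charles"},
    {"id": "obj-3", "title": "Brass sextant", "maker": "Unknown Maker"},
]

# Hypothetical external candidates; real Wikidata IDs would replace "EXT-...".
candidates = {"EXT-1": "Charles Babbage", "EXT-2": "Charles Darwin"}


def normalise(name):
    """Lowercase, collapse whitespace, and turn 'Surname, Forename' into 'forename surname'."""
    if "," in name:
        surname, forename = [part.strip() for part in name.split(",", 1)]
        name = f"{forename} {surname}"
    return " ".join(name.lower().split())


def best_match(name, candidates, threshold=0.9):
    """Return the candidate ID whose label is most similar to `name`, or None below the threshold."""
    scored = [
        (SequenceMatcher(None, normalise(name), normalise(label)).ratio(), cid)
        for cid, label in candidates.items()
    ]
    score, cid = max(scored)
    return cid if score >= threshold else None


def build_triples(records, candidates):
    """Emit (subject, predicate, object) triples, adding a link when a maker is matched."""
    triples = []
    for rec in records:
        triples.append((rec["id"], "title", rec["title"]))
        triples.append((rec["id"], "maker", rec["maker"]))
        match = best_match(rec["maker"], candidates)
        if match is not None:
            triples.append((rec["id"], "makerSameAs", match))
    return triples


for triple in build_triples(records, candidates):
    print(triple)
```

The threshold of 0.9 is an arbitrary choice for this sketch. A learned matcher, as the abstract describes, would instead be trained on a small set of labelled pairs and could draw on more evidence than the name alone.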