Semantic enrichment on large scanned collections through their “satellite texts”: the paradigm of Migne’s Patrologia Graeca

Impact factor: 2.1. JCR: Q2 (Information Science & Library Science)
E. Varthis, Spyros Tzanavaris, I. Giarenis, S. Papavlasopoulos, Manolis Drakakis, M. Poulos
Journal: Information Discovery and Delivery. Published: 2021-08-16. DOI: 10.1108/idd-03-2021-0021. Citation count: 1.

Abstract

Purpose
This paper presents a methodology for semantically enriching the scanned collection of Migne’s Patrologia Graeca (PG). It has two aims: to make a scanned PG source easy to locate on the Web when a reference to that source is described or commented on in another scanned or textual document, and to enrich PG semantically through related scanned or textual documents, called “satellite texts”, published by third parties. The present enrichment uses Dorotheos Scholarios's Synoptic Index (DSSI) as the satellite text, which acts as metadata for PG.

Design/methodology/approach
The methodology consists of two parts. The first part addresses the transcription of DSSI using a suitable web tool. The second part has two subsections: building a matching function that links the printed column numbers of each scanned PG page to the page's actual filename, and building a web interface for PG based on the Uniform Resource Identifiers (URIs) generated in the first subsection.

Findings
The result of the implemented methodology is a Web portal that provides server-less topic search with direct (single-click) navigation to sources. The system is static, scalable and easy to manage, and it costs little to complete and maintain. The transcribed DSSI data sets and the JavaScript Object Notation (JSON) matching functions are available for personal use by students and scholars under a Creative Commons license (CC-BY-NC-SA).

Social implications
Scholars, or anyone interested in a particular subject, can easily locate topics in PG and reference them using URIs that are easy to remember. This contributes significantly to the related scientific dialogue.

Originality/value
The methodology uses the transcribed satellite texts of DSSI, which act as metadata for PG, to semantically enrich the PG collection. Furthermore, other satellite texts can use the PG Web interface as a reference basis to enrich PG further, because it identifies sources directly. The methodology is general and can be applied to any scanned collection that has its own satellite texts.
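The abstract describes a JSON matching function that links printed column numbers to scanned page filenames, plus a static search over the transcribed index. The following is a minimal sketch of how such a lookup could work, not the authors' implementation. Every concrete detail in it is an assumption made for illustration: the JSON field names (`first_column`, `last_column`, `file`), the filenames, the column numbers, the index entries, the base URI `https://example.org/pg` and the `/volume/column` URI pattern. The gap in the filenames (no `0014`) stands for an unnumbered scanned page, which is the kind of irregularity that makes a lookup table preferable to simple arithmetic.

```python
import json
from bisect import bisect_right

# Hypothetical matching table for one volume. Field names, filenames and
# column numbers are invented for illustration; this is not the published data.
MATCHING_JSON = """
{
  "volume": 36,
  "last_column": 16,
  "pages": [
    {"first_column": 9,  "file": "pg036_0012.jpg"},
    {"first_column": 11, "file": "pg036_0013.jpg"},
    {"first_column": 13, "file": "pg036_0015.jpg"},
    {"first_column": 15, "file": "pg036_0016.jpg"}
  ]
}
"""

# Hypothetical transcribed index entries: a topic and the PG location it cites.
INDEX_ENTRIES = [
    {"topic": "On prayer", "volume": 36, "column": 12},
    {"topic": "On fasting", "volume": 36, "column": 14},
]

BASE_URI = "https://example.org/pg"  # placeholder, not the real portal address


class ColumnMatcher:
    """Maps a printed column number to the scanned page file that contains it."""

    def __init__(self, raw_json: str) -> None:
        data = json.loads(raw_json)
        pages = sorted(data["pages"], key=lambda p: p["first_column"])
        self.volume = data["volume"]
        self.last_column = data["last_column"]
        self.first_columns = [p["first_column"] for p in pages]
        self.files = [p["file"] for p in pages]

    def file_for(self, column: int) -> str:
        if not self.first_columns[0] <= column <= self.last_column:
            raise KeyError(f"column {column} is not in volume {self.volume}")
        # Pick the rightmost page whose first column is <= the requested column.
        return self.files[bisect_right(self.first_columns, column) - 1]

    def uri_for(self, column: int) -> str:
        self.file_for(column)  # validate the column before building a URI
        return f"{BASE_URI}/{self.volume}/{column}"


def search_topics(entries: list, query: str) -> list:
    """Case-insensitive substring search over the static index (no server)."""
    needle = query.casefold()
    return [e for e in entries if needle in e["topic"].casefold()]


if __name__ == "__main__":
    matcher = ColumnMatcher(MATCHING_JSON)
    for entry in search_topics(INDEX_ENTRIES, "fast"):
        column = entry["column"]
        print(entry["topic"], matcher.uri_for(column), matcher.file_for(column))
```

Because both the matching table and the index are plain data files, the same lookup could run entirely in the browser, which is consistent with the static, server-less portal the abstract describes.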
Source journal: Information Discovery and Delivery (Information Science & Library Science)
CiteScore: 5.40
Self-citation rate: 4.80%
Articles published: 21
About the journal: Information Discovery and Delivery covers information discovery and access for digital information researchers. This includes educators, knowledge professionals in education and cultural organisations, knowledge managers in media, health care and government, as well as librarians. The journal publishes research and practice that explore the digital information supply chain, i.e. transport, flows, tracking, exchange and sharing, including within and between libraries. It is also interested in digital information capture, packaging and storage by ‘collectors’ of all kinds. Information is widely defined, including but not limited to: records, documents, learning objects, visual and sound files, data and metadata, and user-generated content.