Large multimedia archive for world languages

SSCS '10 Pub Date : 2010-10-29 DOI:10.1145/1878101.1878113
P. Wittenburg, Paul Trilsbeek, Przemek Lenkiewicz
Citations: 4

Abstract

In this paper, we describe the core pillars of a large archive of language material recorded worldwide, part of which documents highly endangered languages. The documentation of these languages is based on audio/video recordings, which are then annotated at several linguistic layers. The digital age has completely changed the requirements of long-term preservation, and we discuss how the archive has met these new challenges. An extensive data replication solution has been worked out to guarantee bit-stream preservation. Because incoming data is immediately converted to standards-based formats and checked at upload time, lifecycle management of all 50 terabytes of data is greatly simplified. A suitable metadata framework, which allows users not only to describe and discover resources but also to organize them, makes it possible to manage this volume of resources very efficiently. Finally, the Language Archiving Technology software suite allows users to create, manipulate, access and enrich all archived resources, provided that they have access permissions.
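The abstract does not say how replicated copies are verified. As an illustration only, one common way to check bit-stream integrity across replicas is to compare cryptographic checksums. The sketch below is not the archive's actual mechanism: the function names `sha256_of` and `find_divergent_replicas`, the choice of SHA-256, and the majority-vote rule are all assumptions made for this example.

```python
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read fixed-size chunks so large recordings need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_divergent_replicas(replicas: list[Path]) -> list[Path]:
    """Return the replicas whose checksum differs from the most common one.

    Hypothetical helper: treats the most frequent checksum as the reference.
    With an even split, the checksum seen first is taken as the reference.
    """
    checksums = {path: sha256_of(path) for path in replicas}
    reference, _ = Counter(checksums.values()).most_common(1)[0]
    return [path for path, value in checksums.items() if value != reference]
```

Under these assumptions, a periodic job could call `find_divergent_replicas` on the copies of each archived file and restore any divergent copy from one that matches the reference checksum.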