Keynote 1: Big Data and Resource Sharing: A speech corpus and a Virtual Laboratory for facilitating human communication science research

D. Burnham
DOI: 10.1109/ICSDA.2014.7051409 (https://doi.org/10.1109/ICSDA.2014.7051409)
Published in: Oriental COCOSDA ... International Conference on Speech Database and Assessments : [proceedings]. International Conference on Speech Database and Assessments
Publication date: 2014-09-01
Citations: 1

Abstract

Information technology has always been an area of rapid change. Two recent developments, Big Data and Resource Sharing, have changed the nature of research across the spectrum of disciplines and, to a large extent, practice in the commercial and government sectors. The growth in capacity for storing and accessing data has allowed the establishment of very large databases and corpora. In turn, this has prompted developments in methods for collecting large data sets and analysing them. The growing move to open source and open access, in a wide range of settings and meanings, has led to a growing awareness of the benefits of data sharing. Against this background, I describe in this paper two platforms that we have developed over the last 4 years: AusTalk, a 3000-hour auditory-visual corpus of Australian English, and Alveo, an extensible Virtual Laboratory housing corpora and analysis tools glued together by a versatile workflow engine. I will describe the genesis and operation of each in some detail and set out the advantages that they, and resources like them, provide in (i) research in the wide range of disciplines within Human Communication Science, and (ii) facilitating collaboration across disciplines, institutions, languages, and national boundaries, in our region and well beyond.