Linguistics - Cornell University

Kornelia Tancheva
Data Curation Profiles Directory. DOI: 10.5703/1288284315007 (https://doi.org/10.5703/1288284315007)

Abstract

Currently the researcher segments and transcribes her Cheyenne/English audio files herself. The process is far from ideal; she is applying for a grant that would allow her to hire a lab technician, ideally one with a linguistics background or training, to help with this work, even if only for the parts that are in English. Once the files have been transcribed, the data needs to be cleaned up and normalized and metadata applied, after which it should be searchable, so that the text files can be searched for specific linguistic features and the corresponding audio files called up. Ultimately the data will be ingested into a publicly accessible, searchable database that allows segments to be downloaded. The database is searchable in the original language, in English, and by morphological gloss (the closest notion to a data dictionary in linguistics). The data needs to be backed up and preserved indefinitely.
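
The search behaviour described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the researcher's actual system: the `Segment` record, its field names, the `search` function, the file name, and the sample rows are all hypothetical, and the "original" tier holds invented placeholder tokens rather than real Cheyenne data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One transcribed stretch of audio (hypothetical record layout)."""
    audio_file: str   # hypothetical file name
    start_s: float    # segment start, in seconds
    end_s: float      # segment end, in seconds
    original: str     # transcription in the original language
    english: str      # English translation
    gloss: str        # morphological gloss


# The three search tiers named in the abstract.
TIERS = ("original", "english", "gloss")


def search(segments, query, tier):
    """Return the segments whose chosen tier contains the query text."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    needle = query.casefold()
    return [s for s in segments if needle in getattr(s, tier).casefold()]


# Placeholder rows; "aaa bbb" and "ccc ddd" are not real language data.
SEGMENTS = [
    Segment("session01.wav", 0.0, 2.4, "aaa bbb", "the dog runs", "DET dog run-3SG"),
    Segment("session01.wav", 2.4, 5.1, "ccc ddd", "I see it", "1SG see-3SG.OBJ"),
]

# Find a linguistic feature in the gloss tier, then point back at the audio.
for seg in search(SEGMENTS, "3SG", "gloss"):
    print(seg.audio_file, seg.start_s, seg.end_s)
```

Keeping the audio file name and time offsets on every text record is what lets a text hit "call up" the matching audio segment; a real deployment would presumably use a database index rather than a linear scan.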