Linguistics - Cornell University

Data Curation Profiles Directory | Pub Date: 1900-01-01 | DOI: 10.5703/1288284315007

Kornelia Tancheva


Citations: 1

Abstract



Currently the researcher is segmenting and transcribing her Cheyenne/English audio files herself. The process is far from ideal: she is applying for a grant that would allow her to hire a lab technician, ideally one with linguistic background or training, to help with this work, even if only for the parts that are in English. Once the files have been transcribed, the data needs to be cleaned up and normalized and metadata applied. It should then be searchable, so that the text files can be searched for specific linguistic features and the corresponding audio files called up. Ultimately the data will be ingested into a publicly accessible, searchable database that allows segments to be downloaded. The database is searchable in the original language, in English, and by the morphological gloss (the closest notion to a data dictionary in linguistics). The data needs to be backed up and preserved indefinitely.
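As an illustration only, the search requirement described in the abstract (look up text by original language, English, or morphological gloss, then call up the matching audio segment) could be sketched as follows. Everything here is an assumption made for the example: the `Segment` record, its field names, the `search` function, the file names, and the placeholder strings are hypothetical and do not come from the researcher's actual data or database design.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed stretch of audio (hypothetical record layout)."""
    audio_file: str   # path to the source recording (assumed naming)
    start_s: float    # segment start, in seconds
    end_s: float      # segment end, in seconds
    original: str     # transcription in the original language
    english: str      # English translation
    gloss: str        # morphological gloss line


# The three text tiers the abstract says the database can be searched by.
SEARCHABLE_FIELDS = ("original", "english", "gloss")


def search(segments, query, field):
    """Return segments whose chosen tier contains the query (case-insensitive)."""
    if field not in SEARCHABLE_FIELDS:
        raise ValueError(f"field must be one of {SEARCHABLE_FIELDS}")
    needle = query.casefold()
    return [s for s in segments if needle in getattr(s, field).casefold()]


# Placeholder data: the 'original' strings are NOT real Cheyenne.
SAMPLE = [
    Segment("rec01.wav", 0.0, 2.5, "placeholder-one", "I see him", "1SG-see-3SG"),
    Segment("rec01.wav", 2.5, 4.0, "placeholder-two", "She sleeps", "3SG-sleep"),
]

if __name__ == "__main__":
    # Find every segment glossed with a third-person singular marker,
    # then report which slice of audio to call up.
    for seg in search(SAMPLE, "3SG", "gloss"):
        print(seg.audio_file, seg.start_s, seg.end_s, seg.english)
```

Storing start and end times alongside each text tier is one simple way to tie a text hit back to a downloadable audio segment; a production system would more likely use a database with full-text indexing than an in-memory list.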
