Linguistics - Cornell University
Kornelia Tancheva
Data Curation Profiles Directory. DOI: 10.5703/1288284315007 (https://doi.org/10.5703/1288284315007)

Currently the researcher is segmenting and transcribing her Cheyenne/English audio files herself. The process is far from ideal; she is applying for a grant that would allow her to hire a lab technician, ideally one with a linguistic background or training, to help with this work, even if only for the parts that are in English. Once the files have been transcribed, the data needs to be cleaned up and normalized and metadata applied, after which it should be searchable, so that the text files can be searched for specific linguistic features and the corresponding audio files called up. Ultimately the data will be ingested into a publicly accessible, searchable database that allows segments to be downloaded. The database is searchable in the original language, in English, and by the morphological gloss (which is the closest notion to a data dictionary in linguistics). The data needs to be backed up and preserved indefinitely.
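
The search requirement described above can be pictured with a minimal sketch. It is not the researcher's actual system: the `Segment` record, its field names, the `normalize` rule, and the `search` function are all hypothetical, and the sample values are placeholders rather than real data. The sketch assumes each transcribed segment keeps a pointer to its audio file and time span alongside three text tiers (original language, English, morphological gloss), so that a text match can call up the audio.

```python
from dataclasses import dataclass

# Hypothetical record for one transcribed audio segment.
@dataclass
class Segment:
    audio_file: str   # name of the source audio file (placeholder)
    start_s: float    # segment start time in seconds
    end_s: float      # segment end time in seconds
    original: str     # transcription in the original language
    english: str      # English translation
    gloss: str        # morphological gloss

TIERS = ("original", "english", "gloss")

def normalize(text: str) -> str:
    """Assumed normalization: case-fold and collapse runs of whitespace."""
    return " ".join(text.casefold().split())

def search(segments, query, tier):
    """Return the segments whose chosen tier contains the query string."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    needle = normalize(query)
    return [s for s in segments if needle in normalize(getattr(s, tier))]

if __name__ == "__main__":
    # Placeholder segments; the text values are invented stand-ins.
    corpus = [
        Segment("session1.wav", 0.0, 1.5, "word1 word2", "He is sleeping", "3SG-sleep"),
        Segment("session1.wav", 1.5, 3.0, "word3", "She sings", "3SG-sing"),
    ]
    for hit in search(corpus, "3sg-sleep", "gloss"):
        # The match carries enough information to cut the audio segment.
        print(hit.audio_file, hit.start_s, hit.end_s)
```

Keeping the audio file name and time offsets on every record is what would let a text hit be turned into a downloadable audio segment; a real deployment would likely use a database and a proper annotation format rather than an in-memory list.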