Authors: Axel Wisiorek, Zsófia Schön
Journal: Acta Linguistica Hungarica, vol. 64 (2017), pp. 383-396
DOI: 10.1556/2062.2017.64.3.4 (https://doi.org/10.1556/2062.2017.64.3.4)
Ob-Ugric database: Corpus and lexicon databases of Khanty and Mansi dialects
In this paper we describe the data processing procedures and the preliminary results of the Ob-Ugric database (OUDB) project, a web-based framework that aims to develop corpus-based descriptive resources for Khanty and Mansi dialects. Using established language documentation and annotation tools, OUDB provides interlinked corpus and lexicon data from digitized texts as well as recent fieldwork studies in a uniform IPA transcription, together with the corresponding audio recordings. This makes these lesser-described languages of the Ob-Ugric branch of the Finno-Ugric language family accessible to researchers as well as to the language community, and archives the raw data for documentation, linguistic evaluation, and possible future use in building resources for language technology applications.
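To make the idea of interlinked corpus and lexicon data concrete, the following is a minimal sketch of how corpus tokens might point to lexicon entries so that every occurrence of an entry can be looked up. It is not OUDB's actual schema: the class names, field names, identifier scheme, and the `concordance` helper are all assumptions made for illustration, and the sample values are placeholders rather than real Khanty or Mansi data.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class LexiconEntry:
    entry_id: str   # hypothetical identifier scheme
    lemma_ipa: str  # citation form in the uniform IPA transcription
    gloss: str


@dataclass
class Token:
    form_ipa: str   # surface form as transcribed in the text
    entry_id: str   # link from the corpus into the lexicon


@dataclass
class Text:
    text_id: str
    dialect: str
    audio_file: Optional[str]  # corresponding recording, if one exists
    tokens: List[Token] = field(default_factory=list)


def concordance(texts: List[Text], entry_id: str) -> List[Tuple[str, int]]:
    """Return (text_id, token position) for every occurrence of an entry."""
    return [
        (text.text_id, pos)
        for text in texts
        for pos, token in enumerate(text.tokens)
        if token.entry_id == entry_id
    ]


# Placeholder data only; these are not real word forms or glosses.
lexicon = {
    "lex-1": LexiconEntry("lex-1", "LEMMA-1", "placeholder gloss 1"),
    "lex-2": LexiconEntry("lex-2", "LEMMA-2", "placeholder gloss 2"),
}
texts = [
    Text("text-1", "placeholder dialect", "text-1.wav", [
        Token("FORM-A", "lex-1"),
        Token("FORM-B", "lex-2"),
        Token("FORM-C", "lex-1"),
    ]),
    Text("text-2", "placeholder dialect", None, [
        Token("FORM-D", "lex-2"),
    ]),
]

hits = concordance(texts, "lex-1")
print(hits)
```

Storing only an entry identifier on each token keeps the lexicon as the single place where a lemma and its gloss are defined, so a correction to an entry is reflected in every text that links to it.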