Building a Persian-English OMProDat Database Read by Persian Speakers
Mortaza Taheri-Ardali, D. Hirst
Speech Prosody 2022, published 2022-05-23
DOI: 10.21437/speechprosody.2022-90 (https://doi.org/10.21437/speechprosody.2022-90)
OMProDat is an open multilingual prosodic database that aims to collect, archive and distribute recordings and annotations of directly comparable data from different languages. As part of the OMProDat project, this paper describes the creation of a bilingual Persian-English prosodic database read by native speakers of Persian. The collection contains 40 continuous, thematically connected paragraphs of five sentences each, originally created during the European SAM project. It was recorded by 5 male and 5 female speakers of standard Persian, all from monolingual families. The Persian texts were romanised and transcribed phonetically using SAMPA, an ASCII-based phonetic alphabet. The database will include TextGrid annotations, obtained semi-automatically from the sound and the orthographic transcription using the SPPAS alignment software. The Momel and INTSINT algorithms will be used to provide prosodic annotation of the corpus. This body of data will allow us to compare the production of Persian as L1 and English as L2 by the same speakers. In addition, it makes cross-linguistic comparison with the other languages in OMProDat straightforward.