Lemlem Hagos, Million Meshesha, Solomon Atnafu, Solomon Teferra
SINET: Ethiopian Journal of Science. Published 2023-11-22. DOI: https://doi.org/10.4314/sinet.v46i2.5
Identifying Amharic-Tigrigna Shared Features: Towards Optimizing Implementation of Under Resourced Languages
This article reports exploratory research analyzing the statistical overlap between Amharic and Tigrigna at three levels of abstraction: the word level, the CV syllable level, and the phoneme level. Amharic and Tigrigna are among the most widely spoken Ethiosemitic languages in Ethiopia, yet they remain too under-resourced to be fully integrated into text-to-speech (TTS) applications that assist an oral society in its day-to-day activities. TTS research requires linguistic resources, which involve intensive text analysis, and acoustic resources, which involve digital signal analysis. TTS research for Ethiosemitic languages has so far been conducted on a monolingual basis, which fragments research effort on a resource-intensive task. Investigating the level of overlap between Amharic and Tigrigna gives insight into how shared acoustic and linguistic resources can be reused across these languages, reducing duplication of effort when designing higher-level applications such as TTS. According to our statistical analysis, Amharic and Tigrigna show 86.36% overlap at the phonemic level and 85.93% at the CV syllable level, with an encouraging level of overlap at the word level. The extent of this overlap at different levels of abstraction implies an opportunity to reduce duplication of effort in the design and development of bilingual and multilingual TTS for Ethiosemitic polyglots.
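The abstract does not state how the overlap percentages were computed. As a rough illustration only, the sketch below assumes overlap is measured as the number of units common to both inventories divided by the size of their union (a Jaccard-style ratio). That metric, the function name `overlap_percent`, and the toy inventories are all assumptions for illustration; the symbols are placeholders, not real Amharic or Tigrigna data.

```python
def overlap_percent(inventory_a, inventory_b):
    """Percentage of the combined inventory that both languages share.

    Assumption: overlap = |A intersect B| / |A union B| * 100.
    The source abstract does not specify its metric.
    """
    a, b = set(inventory_a), set(inventory_b)
    union = a | b
    if not union:
        # Two empty inventories have nothing to compare.
        return 0.0
    return 100.0 * len(a & b) / len(union)


# Hypothetical placeholder inventories (not real language data).
lang_a_units = ["p", "t", "k", "s", "m"]
lang_b_units = ["t", "k", "s", "m", "n", "x"]

# 4 shared units out of 7 distinct units in total.
print(round(overlap_percent(lang_a_units, lang_b_units), 2))
```

The same function could be applied unchanged to phoneme sets, CV syllable sets, or word lists, since it only relies on set membership. A frequency-weighted measure over corpus tokens would be a different design and might give different figures.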