Sentence-aligned parallel corpus Amazigh-English
Miftah Nina, Ataa Allah Fadoua, Taghbalout Imane
2017 8th International Conference on Information and Communication Systems (ICICS), published 2017-04-01
DOI: 10.1109/IACS.2017.7921946 (https://doi.org/10.1109/IACS.2017.7921946)

In recent years, research in Natural Language Processing has shown growing interest in under-resourced languages. Amazigh is the autochthonous language of North Africa. After years of persecution, it became a constitutionally official language in Morocco only in 2011, and it is still considered an under-resourced language. The question is: "How can the Amazigh language catch up with well-resourced languages?" Motivated by these considerations, we describe our effort to develop an Amazigh-English parallel corpus intended for linguistic research, teaching, and natural language processing applications, primarily machine translation. To the best of our knowledge, this is the first Amazigh-English parallel corpus. The corpus is sentence-aligned and contains 20726 sentences. The alignment was performed automatically and evaluated manually. The experimental results are satisfactory, exceeding 90% in the manual evaluation.
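The abstract does not say how the figure above 90% was computed or which alignment method was used. As a minimal sketch only, assuming the manual evaluation marks each automatically aligned sentence pair as correct or incorrect, the score could be the fraction of pairs judged correct. The names `JudgedPair` and `alignment_accuracy` and the sample verdicts below are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class JudgedPair:
    """One automatically aligned sentence pair with a human verdict (hypothetical type)."""
    pair_id: int      # position of the aligned pair in the corpus
    is_correct: bool  # True if the annotator accepted the alignment


def alignment_accuracy(judgments: Sequence[JudgedPair]) -> float:
    """Return the fraction of judged pairs that the annotator marked correct."""
    if not judgments:
        raise ValueError("no judgments supplied")
    correct = sum(1 for j in judgments if j.is_correct)
    return correct / len(judgments)


# Hypothetical verdicts for ten aligned pairs (illustrative, not data from the paper):
# every pair is accepted except the one with pair_id 3.
sample = [JudgedPair(pair_id=i, is_correct=(i != 3)) for i in range(10)]
print(f"accuracy on the judged sample: {alignment_accuracy(sample):.0%}")
```

In practice a score like this is usually computed on a random sample of the aligned pairs rather than on the whole corpus, but the abstract does not state which was done here.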