Morphological Analysis of Egyptian Children Corpus by KIDEVAL Program
H. Salama, S. Alansary, Amany Elshazly
2022 20th International Conference on Language Engineering (ESOLEC), 2022-10-12
DOI: 10.1109/ESOLEC54569.2022.10009437 (https://doi.org/10.1109/ESOLEC54569.2022.10009437)
Citation count: 0
Abstract
The aim of this study is to provide a morphological analysis of the Egyptian children's corpus, which is morphologically tagged and disambiguated in CHILDES. This allows the KIDEVAL program to be readily used on the corpus to address questions regarding the acquisition of Egyptian Arabic. KIDEVAL is one of the tools in the CLAN program, a toolset that has been particularly useful in the study of language acquisition in many languages. However, corpus-based analyses have not yet been applied to Egyptian children's language. This study describes how to use the KIDEVAL program to analyze Egyptian children's language and to study the development of word-frequency patterns across parts of speech and the order in which grammatical morphemes develop in Egyptian Arabic. The output of the morphological analysis enables researchers to answer many questions about the development of grammatical morphemes in Egyptian Arabic, as well as other questions that can readily be probed with KIDEVAL. The Egyptian Arabic corpus was downloaded from the Arabic section of the CHILDES database. It comprises 10 transcripts from Egyptian-speaking children aged 1;7 to 3;8 (years;months), with a total of 25,645 words. The KIDEVAL analysis profile for this corpus reports the number of occurrences of each part of speech for each child by age, covering 54 categories and subcategories. Using KIDEVAL is efficient because it reduces the time needed to label the corpus manually.
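The kind of per-child part-of-speech tally described above can be illustrated with a minimal sketch. It is not KIDEVAL itself and not the authors' procedure. It assumes only the general CHAT convention that each token on a `%mor` tier has the form `tag|stem`, with `~` and `$` joining clitics. The function name `count_pos`, the sample utterance, and its English tags are hypothetical and do not come from the Egyptian Arabic corpus.

```python
from collections import Counter


def count_pos(chat_lines):
    """Tally part-of-speech tags found on %mor tier lines (hypothetical helper)."""
    counts = Counter()
    for line in chat_lines:
        if not line.startswith("%mor:"):
            continue  # skip speaker tiers and other dependent tiers
        for token in line[len("%mor:"):].split():
            # Assumption: "~" and "$" join clitics, each carrying its own tag.
            for part in token.replace("~", " ").replace("$", " ").split():
                if "|" in part:  # punctuation tokens such as "." have no tag
                    counts[part.split("|", 1)[0]] += 1
    return counts


# Illustrative English sample, not taken from the corpus.
sample = [
    "*CHI:\tI want a cookie .",
    "%mor:\tpro:sub|I v|want det:art|a n|cookie .",
]
pos_counts = count_pos(sample)
for tag, n in sorted(pos_counts.items()):
    print(tag, n)
```

Running such a tally once per transcript and grouping the results by the child's age would give a table similar in spirit to the profile the abstract describes, though KIDEVAL reports many additional measures.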