Morphological Analysis of Egyptian Children Corpus by KIDEVAL Program

2022 20th International Conference on Language Engineering (ESOLEC). Publication date: 2022-10-12. DOI: 10.1109/ESOLEC54569.2022.10009437

H. Salama, S. Alansary, Amany Elshazly

Abstract

This study provides a morphological analysis of the Egyptian children corpus, which is morphologically tagged and disambiguated in CHILDES. This tagging allows the KIDEVAL program to be readily used on the corpus to address questions regarding the acquisition of Egyptian Arabic. KIDEVAL is one of the tools in the CLAN program, a toolset that has been particularly useful in the study of language acquisition in many languages. However, corpus-based analyses have not yet been applied to Egyptian children's language. This study describes how to use the KIDEVAL program to analyze Egyptian children's language and to study the development of word frequency patterns of parts of speech and the order of development of grammatical morphemes in Egyptian Arabic. The output of the morphological analysis enables researchers to investigate many questions regarding the development of grammatical morphemes in Egyptian Arabic, as well as many other questions that can readily be probed with KIDEVAL. The Egyptian Arabic corpus was downloaded from the Arabic part of the CHILDES database. It comprises 10 transcripts from Egyptian-speaking children aged 1;7 to 3;8 (years;months), with a total of 25,645 words. The KIDEVAL analysis profile for this corpus reports the number of occurrences of each part of speech for each child by age, covering 54 categories and subcategories. Using KIDEVAL is efficient because it reduces the time needed to label the corpus manually.
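To make the idea of per-child part-of-speech counts concrete, the following is a minimal Python sketch of tallying tags from the %mor tier of a CHAT transcript. It is not KIDEVAL and does not reproduce its output. The function name `count_pos` and the English sample lines are hypothetical illustrations, not data from the Egyptian corpus. The sketch assumes each %mor token has the form `category|stem`, that clitics are joined with `~` or `$`, that prefixes are marked with `#`, and that each tier fits on one line (continuation lines are ignored).

```python
from collections import Counter


def count_pos(chat_lines, speaker="CHI"):
    """Count part-of-speech tags on %mor tiers that follow the given speaker's utterances.

    Simplified sketch: assumes one-line tiers and 'category|stem' tokens.
    """
    counts = Counter()
    in_speaker_turn = False
    for line in chat_lines:
        if line.startswith("*"):
            # A main tier line such as "*CHI:" starts a new utterance.
            in_speaker_turn = line.startswith(f"*{speaker}:")
        elif line.startswith("%mor:") and in_speaker_turn:
            for token in line[len("%mor:"):].split():
                # Split clitic groups so each word part is counted separately.
                for part in token.replace("$", "~").split("~"):
                    if "|" not in part:
                        continue  # punctuation such as "." or "?"
                    pos = part.split("|", 1)[0]
                    pos = pos.split("#")[-1]  # drop any prefix markers
                    counts[pos] += 1
    return counts


# Hypothetical sample lines for illustration only.
sample = [
    "*CHI:\tI want cookie .",
    "%mor:\tpro:sub|I v|want n|cookie .",
    "*MOT:\tyou want a cookie ?",
    "%mor:\tpro:per|you v|want det:art|a n|cookie ?",
]

if __name__ == "__main__":
    for tag, n in sorted(count_pos(sample).items()):
        print(tag, n)
```

Subcategories such as `pro:sub` are kept as distinct keys here, which loosely mirrors the idea of counting both categories and subcategories; collapsing them to the part before the colon would give coarser counts.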
