Implementation Approach of Indian Language Gujarati Grammar's Concept “sandhi” using the Concepts of Rule-based NLP

N. Patel, Dhiren R. Patel
Published in: 2021 8th International Conference on Computing for Sustainable Global Development (INDIACom), 2021-03-17
DOI: 10.1109/INDIACom51348.2021.00085 (https://doi.org/10.1109/INDIACom51348.2021.00085)

Abstract

In NLP, the term ‘language’ refers to natural languages such as Gujarati, Hindi, and English, which we use in daily life to communicate. Most NLP research has centered on English and other European languages; NLP research on Indian languages such as Gujarati has begun only in the last few years. This paper presents a road map for implementing the Gujarati grammatical concept of “sandhi”. In our words, sandhi is a word segmentation process, and it is present in most South Asian languages, such as Sanskrit, Hindi, and Gujarati (including languages written in the Devanagari script), and even in Chinese and Thai. “Sandhi leads to phonetic transformation at word boundaries of a written chunk (small part), and the sounds at the end of word join together to form a single chunk of the character sequence.” Our main focus is a rule-based implementation of sandhi. Like every Indian script-based language, Gujarati has its own specified rules of composition for combining consonants, vowels, and modifiers. We have identified certain rules with which we accomplish a practical implementation of sandhi. Many sandhi rules are documented in the grammatical tradition of Gujarati, each denoting a unique combination of phonetic transformations. Sandhi makes no syntactic or semantic changes to the words involved. It is an optional operation that depends only on the attentiveness of the writer.
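The abstract does not list the rules the authors identified, so the following is only a minimal sketch of what a rule-based sandhi join might look like, not the paper's implementation. It assumes a small subset of the traditional vowel-sandhi rules (a/ā + a/ā → ā, a/ā + i/ī → e, a/ā + u/ū → o, i/ī + i/ī → ī, u/ū + u/ū → ū) applied to Gujarati Unicode text. The names `RULES`, `split_final_vowel`, and `join_sandhi` are hypothetical.

```python
# Hypothetical sketch: a tiny rule table for vowel sandhi at a word boundary.
# The rule subset is an assumption for illustration, not taken from the paper.

# Gujarati dependent vowel signs (matras): aa, i, ii, u, uu, e, o
AA, I, II, U, UU, E, O = (
    "\u0ABE", "\u0ABF", "\u0AC0", "\u0AC1", "\u0AC2", "\u0AC7", "\u0ACB")
# Gujarati independent vowels: a, aa, i, ii, u, uu
V_A, V_AA, V_I, V_II, V_U, V_UU = (
    "\u0A85", "\u0A86", "\u0A87", "\u0A88", "\u0A89", "\u0A8A")

# (class of left word's final vowel, right word's initial vowel) -> merged sign
RULES = {
    ("a", V_A): AA, ("a", V_AA): AA,   # a/aa + a/aa -> aa
    ("a", V_I): E,  ("a", V_II): E,    # a/aa + i/ii -> e
    ("a", V_U): O,  ("a", V_UU): O,    # a/aa + u/uu -> o
    ("i", V_I): II, ("i", V_II): II,   # i/ii + i/ii -> ii
    ("u", V_U): UU, ("u", V_UU): UU,   # u/uu + u/uu -> uu
}


def is_consonant(ch):
    """True if ch falls in the Gujarati consonant range (ka..ha)."""
    return "\u0A95" <= ch <= "\u0AB9"


def split_final_vowel(word):
    """Return (stem, vowel_class); vowel_class is None if no rule can apply."""
    last = word[-1:]
    if last == AA:
        return word[:-1], "a"
    if last in (I, II):
        return word[:-1], "i"
    if last in (U, UU):
        return word[:-1], "u"
    if last and is_consonant(last):
        return word, "a"  # bare consonant carries the inherent vowel 'a'
    return word, None


def join_sandhi(left, right):
    """Join two words with vowel sandhi; return None when no rule matches."""
    stem, vowel_class = split_final_vowel(left)
    sign = RULES.get((vowel_class, right[:1]))
    if sign is None:
        return None
    return stem + sign + right[1:]


# surya + udaya -> suryodaya
surya = "\u0AB8\u0AC2\u0AB0\u0ACD\u0AAF"
udaya = "\u0A89\u0AA6\u0AAF"
print(join_sandhi(surya, udaya))
```

Returning `None` when no rule matches reflects the abstract's point that sandhi is optional: the caller can fall back to leaving the two words unjoined.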