A system for identification of idioms in Hindi
Priyanka M Tech, R M K Sinha
2014 Seventh International Conference on Contemporary Computing (IC3), 2014-09-15
DOI: 10.1109/IC3.2014.6897218 (https://doi.org/10.1109/IC3.2014.6897218)
Citations: 6
Abstract
Idioms are used extensively in everyday language. They carry a metaphorical sense that makes them difficult to comprehend, because their meaning cannot be deduced from the meanings of their constituent parts. They pose a challenge for natural language processing (NLP) applications such as machine translation, information retrieval and question answering, because their translation and meaning need to be derived logically rather than literally. A great deal of research has been carried out on the automatic extraction of multi-word expressions, but no comprehensive work has been reported on idioms in Hindi. This paper studies the linguistic and morphological variations that are commonly encountered in Hindi idioms. Based on this study, a methodology has been developed for deriving rules that represent idioms and support searching for them. The rules representing the idioms are handcrafted. For idiom identification, the rule base is used to mark the input text for the probable presence of an idiom. Our system is limited to using only intra-sentential context. The experimental results demonstrate the feasibility and scalability of our methodology.
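To make the idea of a handcrafted rule base concrete, here is a minimal Python sketch of rule-based idiom marking within a single sentence. It is an illustration only, not the authors' rule format or algorithm: the `IdiomRule` class, the `max_gap` parameter, the whitespace tokenizer, and the example rule for the idiom "नौ दो ग्यारह होना" (to run away) are all assumptions introduced here. Each rule is an ordered list of slots, and each slot lists the surface forms it accepts, which is one simple way to cover inflected verb forms.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class IdiomRule:
    """Hypothetical rule: ordered slots, each a set of accepted surface forms."""
    name: str
    slots: tuple      # tuple of frozensets of surface forms
    max_gap: int = 1  # assumed: tokens allowed between consecutive slots


def split_sentences(text):
    # Split on the danda and on '?' / '!' so matching stays intra-sentential.
    return [s.strip() for s in re.split(r"[।?!]", text) if s.strip()]


def find_idiom(rule, tokens):
    """Return the token indices that fill the rule's slots in order, or None."""
    def match_from(slot_idx, start, limit):
        if slot_idx == len(rule.slots):
            return []
        for i in range(start, min(limit, len(tokens))):
            if tokens[i] in rule.slots[slot_idx]:
                # The next slot must start within max_gap tokens of this one.
                rest = match_from(slot_idx + 1, i + 1, i + 2 + rule.max_gap)
                if rest is not None:
                    return [i] + rest
        return None

    # The first slot may appear anywhere in the sentence.
    return match_from(0, 0, len(tokens))


def mark_idioms(text, rules):
    """Return (rule name, sentence, token indices) for every probable idiom."""
    hits = []
    for sentence in split_sentences(text):
        tokens = sentence.split()  # naive whitespace tokenization (assumption)
        for rule in rules:
            span = find_idiom(rule, tokens)
            if span is not None:
                hits.append((rule.name, sentence, span))
    return hits


# Illustrative rule; the listed verb forms are a small hand-picked sample.
EXAMPLE_RULE = IdiomRule(
    name="nau do gyarah hona",
    slots=(
        frozenset({"नौ"}),
        frozenset({"दो"}),
        frozenset({"ग्यारह"}),
        frozenset({"होना", "हो", "हुआ", "हुए", "हुई"}),
    ),
)

if __name__ == "__main__":
    sample = "चोर पुलिस को देखकर नौ दो ग्यारह हो गया। मेरे पास नौ किताबें हैं।"
    for name, sentence, span in mark_idioms(sample, [EXAMPLE_RULE]):
        print(name, span, sentence)
```

Listing surface forms explicitly keeps the sketch self-contained; a fuller system would more plausibly generate the variants with a morphological analyzer, and, as the abstract notes, a match only marks a probable idiom, since the same words can also occur with a literal meaning.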