A Sinhala and Tamil Extension to Generic Environment for Context-aware Correction
Lakshikka Sithamparanathan, T. Uthayasanker
2019 National Information Technology Conference (NITC), 2019-10-01
https://doi.org/10.1109/NITC48475.2019.9114399
Several lines of research exist on spell checkers for European and Indian languages. However, low-resource languages such as Tamil and Sinhala have received limited attention in this problem space, possibly because of their highly inflectional and morphologically rich nature. No fully functional context-aware spell-checking system exists for them, particularly as open source. In this paper, a generic environment for context-aware spell correction is extended to two resource-scarce languages, Sinhala and Tamil. Experimental results show that the system detects spelling errors well and provides suitable suggestions for correcting misspelled words, with a minimum accuracy of 85% for Tamil and 70% for Sinhala. This is the first context-aware spell corrector for the Sinhala language. Compared to prior context-aware spell correctors for Tamil, it improves on 1) modularized architecture and 2) coverage and accuracy. Moreover, this study produced a Tamil and Sinhala spell correction benchmark dataset. Both the dataset and the tools are available for public use.
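To make the idea of context-aware correction concrete, the following is a minimal sketch and not the system described in this paper. It assumes a simple scheme: candidates are vocabulary words within a small edit distance of the unknown word, and they are ranked by how often they follow the preceding word in a training corpus. The toy English corpus, the function names (`edit_distance`, `train`, `correct`), and the bigram ranking are all illustrative assumptions. Note also that edit distance over Unicode code points is a crude measure for Sinhala and Tamil, where one visible character can span several code points.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed over Unicode code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def train(sentences):
    """Count unigrams and bigrams from whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def correct(tokens, unigrams, bigrams, max_dist=1):
    """Replace out-of-vocabulary tokens with the candidate that best fits
    the preceding word (bigram count first, then unigram count)."""
    out = []
    for tok in tokens:
        if tok in unigrams:
            out.append(tok)
            continue
        candidates = [w for w in unigrams if edit_distance(tok, w) <= max_dist]
        if not candidates:
            out.append(tok)  # nothing close enough: leave the token unchanged
            continue
        prev_word = out[-1] if out else None
        best = max(candidates,
                   key=lambda w: (bigrams[(prev_word, w)], unigrams[w], w))
        out.append(best)
    return out


# Hypothetical toy corpus, for illustration only.
unigrams, bigrams = train(["he will eat the meal", "she sat on the seat"])

print(correct("he will zeat".split(), unigrams, bigrams))
print(correct("on the zeat".split(), unigrams, bigrams))
```

The same misspelling, `zeat`, has two candidates at edit distance 1 (`eat` and `seat`), and the preceding word decides between them. This is what separates context-aware correction from isolated-word correction.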