{"title":"屈折规则归纳的效率分析","authors":"G. Szabó, L. Kovács","doi":"10.1109/CARPATHIANCC.2015.7145135","DOIUrl":null,"url":null,"abstract":"The world inflection is an important area of computerized linguistics for the agglutinative languages. The presented paper provides an overview of the two main algorithms for learning of inflection rules. The TASR and OSTIA methods are implemented and analyzed with real life data from the Hungarian language. The main novelty of the research work is the development of a robust method to generate training and test data from the documents available on the Internet. The implementation language is Java as Java 8 has great features for parallel and functional programming that could be leveraged in this big data analysis task. The performed tests show that current methods cannot provide both high accuracy and high cost efficiency on the same time.","PeriodicalId":187762,"journal":{"name":"Proceedings of the 2015 16th International Carpathian Control Conference (ICCC)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Efficiency analysis of inflection rule induction\",\"authors\":\"G. Szabó, L. Kovács\",\"doi\":\"10.1109/CARPATHIANCC.2015.7145135\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The world inflection is an important area of computerized linguistics for the agglutinative languages. The presented paper provides an overview of the two main algorithms for learning of inflection rules. The TASR and OSTIA methods are implemented and analyzed with real life data from the Hungarian language. The main novelty of the research work is the development of a robust method to generate training and test data from the documents available on the Internet. 
The implementation language is Java as Java 8 has great features for parallel and functional programming that could be leveraged in this big data analysis task. The performed tests show that current methods cannot provide both high accuracy and high cost efficiency on the same time.\",\"PeriodicalId\":187762,\"journal\":{\"name\":\"Proceedings of the 2015 16th International Carpathian Control Conference (ICCC)\",\"volume\":\"53 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-05-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2015 16th International Carpathian Control Conference (ICCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CARPATHIANCC.2015.7145135\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2015 16th International Carpathian Control Conference (ICCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CARPATHIANCC.2015.7145135","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Word inflection is an important area of computational linguistics for agglutinative languages. This paper provides an overview of the two main algorithms for learning inflection rules. The TASR and OSTIA methods are implemented and analyzed with real-life data from the Hungarian language. The main novelty of the research is the development of a robust method for generating training and test data from documents available on the Internet. The implementation language is Java, as Java 8 offers features for parallel and functional programming that can be leveraged in this big data analysis task. The tests performed show that current methods cannot provide both high accuracy and high cost efficiency at the same time.