Diana Balc, Anamaria Beleiu, R. Potolea, C. Lemnaru
Published in: 2015 IEEE International Conference on Intelligent Computer Communication and Processing (ICCP), 2015-11-02. DOI: 10.1109/ICCP.2015.7312603 (https://doi.org/10.1109/ICCP.2015.7312603)
A learning-based approach for Romanian syllabification and stress assignment
This paper tackles the Romanian syllabification and stress assignment problems and proposes an efficient machine-learning-based solution. We show that, with feature sets designed for each specific problem, learning algorithms achieve satisfactory accuracy on both (~92% for syllabification, ~85% for stress assignment), even with relatively small training sets. We have found that unigram-based features are powerful enough to characterize these problems, so introducing bigram or trigram features (often used in syllabification for other languages) is unnecessary.
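To make the idea of unigram-based features concrete, the following is a minimal sketch of how syllabification can be framed as a per-gap classification task. It is not the authors' feature set, which the abstract does not specify. The assumptions here are ours: each gap between two adjacent letters is one training example, its features are the single characters in a small window around the gap, and gold syllabifications are written with a hyphen separator. The names `boundary_labels`, `unigram_features`, `PAD`, and the `window` parameter are hypothetical.

```python
from typing import Dict, List, Tuple

PAD = "_"  # hypothetical padding symbol for positions outside the word


def boundary_labels(syllabified: str, sep: str = "-") -> Tuple[str, List[int]]:
    """Turn a gold form such as 'car-te' into the plain word and gap labels.

    labels[i] is 1 when a syllable boundary falls between word[i] and
    word[i + 1], and 0 otherwise.
    """
    syllables = syllabified.split(sep)
    word = "".join(syllables)
    labels = [0] * (len(word) - 1)
    pos = 0
    for syllable in syllables[:-1]:
        pos += len(syllable)
        labels[pos - 1] = 1  # boundary right after the last letter of this syllable
    return word, labels


def unigram_features(word: str, gap: int, window: int = 2) -> Dict[str, str]:
    """Character unigrams around the gap between word[gap] and word[gap + 1].

    Each feature is a single character tagged with its offset from the gap;
    no bigram or trigram combinations are built (assumed design, see lead-in).
    """
    features: Dict[str, str] = {}
    for offset in range(-window + 1, window + 1):
        idx = gap + offset
        features[f"c[{offset}]"] = word[idx] if 0 <= idx < len(word) else PAD
    return features


def build_examples(syllabified: str, window: int = 2) -> List[Tuple[Dict[str, str], int]]:
    """Pair every gap's feature dictionary with its 0/1 boundary label."""
    word, labels = boundary_labels(syllabified)
    return [(unigram_features(word, gap, window), label)
            for gap, label in enumerate(labels)]


if __name__ == "__main__":
    for features, label in build_examples("car-te"):
        print(label, features)
```

The resulting feature dictionaries can be passed to any standard classifier after one-hot encoding. Stress assignment could be framed in a similar way, with one example per vowel or syllable instead of per gap, but that framing is likewise an assumption rather than something stated in the abstract.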