{"title":"What affects the difficulty of Chinese syntax?","authors":"Yueming Du, Lijiao Yang","doi":"10.1109/IALP48816.2019.9037724","DOIUrl":null,"url":null,"abstract":"The traditional measurement of sentence difficulty only focuses on lexical features but neglects syntactic features. This paper takes 800 sentences in primary school Chinese textbooks published by People's Education Press as the research object and studies their syntactic features. We use random forest to select the top five important features and then employed SVM to do the classification experiment. The precision rate, recall rate and F-scored for the classification of 5 levels are respectively 50.42%, 50.40% and 50.41%, which indicates that the features we selected has practical value for the related research.","PeriodicalId":208066,"journal":{"name":"2019 International Conference on Asian Language Processing (IALP)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on Asian Language Processing (IALP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IALP48816.2019.9037724","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
The traditional measurement of sentence difficulty only focuses on lexical features but neglects syntactic features. This paper takes 800 sentences in primary school Chinese textbooks published by People's Education Press as the research object and studies their syntactic features. We use random forest to select the top five important features and then employed SVM to do the classification experiment. The precision rate, recall rate and F-scored for the classification of 5 levels are respectively 50.42%, 50.40% and 50.41%, which indicates that the features we selected has practical value for the related research.