A Multi-model SVR Approach to Estimating the CEFR Proficiency Level of Grammar Item Features

Brendan Flanagan, S. Hirokawa, Emiko Kaneko, Emi Izumi, H. Ogata
Published in: 2017 6th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI)
Publication date: 2017-07-09
DOI: 10.1109/IIAI-AAI.2017.169 (https://doi.org/10.1109/IIAI-AAI.2017.169)
Citations: 0

Abstract

Analysis of publicly available language learning corpora can be useful for extracting characteristic features of learners at different proficiency levels. These features can then be used to support language learning research and the creation of educational resources. In this paper, we classify the words and parts of speech of transcripts from different speaking proficiency levels found in the NICT-JLE corpus. The characteristic features of learners whose spoken proficiency is equivalent to CEFR levels A1 through B2 were extracted by analyzing the data with a support vector machine (SVM). In particular, we apply feature selection to find a set of characteristic features that achieves optimal classification performance and can be used to predict spoken learner proficiency.
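The feature-selection idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy transcripts and the `POS:` tags are invented (they are not NICT-JLE data), the linear SVM is trained with a plain subgradient method rather than any particular library, and ranking features by absolute weight and keeping the top `k` is one common selection scheme, not necessarily the procedure the authors used.

```python
from collections import Counter

# Invented toy "transcripts": words plus hypothetical POS tags ("POS:MD" etc.).
# Label -1 = lower proficiency, +1 = higher proficiency. Not NICT-JLE data.
docs = [
    (["i", "like", "dog", "POS:VBP"], -1),
    (["i", "go", "school", "POS:VBP"], -1),
    (["i", "like", "school", "POS:VBP"], -1),
    (["i", "would", "have", "gone", "POS:MD"], +1),
    (["she", "would", "like", "it", "POS:MD"], +1),
    (["i", "have", "gone", "there", "POS:VBN"], +1),
]


def vectorize(tokens, vocab):
    """Turn a token list into a count vector over the given vocabulary."""
    counts = Counter(tokens)
    return [float(counts[f]) for f in vocab]


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Subgradient descent on the L2-regularized hinge loss; labels in {-1, +1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            for j in range(len(w)):
                # Hinge-loss term only contributes when the margin is violated.
                grad = lam * w[j] - (yi * xi[j] if margin < 1 else 0.0)
                w[j] -= lr * grad
            if margin < 1:
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def accuracy(w, b, X, y):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(1 for xi, yi in zip(X, y) if predict(w, b, xi) == yi) / len(y)


# 1. Train on all word and POS features.
vocab = sorted({tok for tokens, _ in docs for tok in tokens})
X = [vectorize(tokens, vocab) for tokens, _ in docs]
y = [label for _, label in docs]
w, b = train_linear_svm(X, y)
weights = dict(zip(vocab, w))

# 2. Rank features by absolute weight and keep the top k (k is an arbitrary choice).
k = 4
selected = sorted(vocab, key=lambda f: abs(weights[f]), reverse=True)[:k]

# 3. Retrain using only the selected features and measure training accuracy.
X_sel = [vectorize(tokens, selected) for tokens, _ in docs]
w_sel, b_sel = train_linear_svm(X_sel, y)
acc_sel = accuracy(w_sel, b_sel, X_sel, y)

print("selected features:", selected)
print("training accuracy with selected features:", acc_sel)
```

In this sketch a positive weight marks a feature associated with the higher-proficiency label and a negative weight one associated with the lower-proficiency label. A realistic experiment would sweep `k` and evaluate on held-out transcripts rather than on the training data.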