{"title":"Improving Chinese Grammatical Error Correction with Corpus Augmentation and Hierarchical Phrase-based Statistical Machine Translation","authors":"Yinchen Zhao, Mamoru Komachi, H. Ishikawa","doi":"10.18653/v1/W15-4417","DOIUrl":"https://doi.org/10.18653/v1/W15-4417","url":null,"abstract":"In this study, we describe our system submitted to the 2nd Workshop on Natural Language Processing Techniques for Educational Applications (NLP-TEA-2) shared task on Chinese grammatical error diagnosis (CGED). We use a statistical machine translation method already applied to several similar tasks (Brockett et al., 2006; Chiu et al., 2013; Zhao et al., 2014). In this research, we examine corpus-augmentation and explore alternative translation models including syntax-based and hierarchical phrase-based models. Finally, we show variations using different combinations of these factors.","PeriodicalId":316430,"journal":{"name":"NLP-TEA@ACL/IJCNLP","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126293639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Salinlahi III: An Intelligent Tutoring System for Filipino Heritage Language Learners","authors":"Ralph Vincent J. Regalado, Michael Louie Boñon, Nadine Chua, Rene Rose Piñera, Shannen Rose, Dr. Geraldin B. Dela Cruz","doi":"10.18653/v1/W15-4413","DOIUrl":"https://doi.org/10.18653/v1/W15-4413","url":null,"abstract":"Heritage language learners are learners of the primary language of their parents, which they may have been exposed to but have not learned well enough to communicate fluently with other people. Salinlahi, an Interactive Learning Environment, was developed to teach young Filipino heritage learners basic Filipino vocabulary, while Salinlahi II added support for collaborative learning. To teach learners who already have basic knowledge of Filipino, we developed Salinlahi III, which offers higher-level lessons focusing on Filipino grammar and sentence construction. An internal evaluation of the system showed that the user interface and the tutor’s feedback were appropriate. Moreover, in an external evaluation of the system, experimental and controlled field tests were conducted, and the results showed a positive learning gain after using the system.","PeriodicalId":316430,"journal":{"name":"NLP-TEA@ACL/IJCNLP","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114637450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A System for Generating Multiple Choice Questions: With a Novel Approach for Sentence Selection","authors":"Mukta Majumder, S. Saha","doi":"10.18653/v1/W15-4410","DOIUrl":"https://doi.org/10.18653/v1/W15-4410","url":null,"abstract":"Multiple choice questions (MCQs) play a major role in educational assessment as well as in active learning. In this paper we present a system that generates MCQs automatically using sports-domain text as input. Not all sentences in a text are capable of generating MCQs, so the first step of the system is to select the informative sentences. We propose a novel technique to select informative sentences by using topic modeling and parse structure similarity. The parse structure similarity is computed between the parse structure of an input sentence and a set of reference parse structures. To compile the reference set, we use a number of existing MCQs collected from the web. Keywords are selected based on the occurrence of domain-specific words and named entities in the sentence. Distractors are generated using a set of rules and a name dictionary. Experimental results demonstrate that the proposed technique is quite accurate.","PeriodicalId":316430,"journal":{"name":"NLP-TEA@ACL/IJCNLP","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127730034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Collocation Assistant for Learners of Japanese as a Second Language","authors":"L. Pereira, Yuji Matsumoto","doi":"10.18653/v1/W15-4404","DOIUrl":"https://doi.org/10.18653/v1/W15-4404","url":null,"abstract":"We present Collocation Assistant, a prototype of a collocational aid designed to promote the collocational competence of learners of Japanese as a second language (JSL). Focusing on noun-verb constructions, the tool automatically flags possible collocation errors and suggests better collocations by using corrections extracted from a large annotated Japanese language learner corpus. Each suggestion includes several usage examples to help learners choose the best candidate. In a preliminary user study with JSL learners, Collocation Assistant received positive feedback, and the results indicate that the system is helpful to assist learners in choosing correct word combinations in Japanese.","PeriodicalId":316430,"journal":{"name":"NLP-TEA@ACL/IJCNLP","volume":"238 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123360209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The “News Web Easy” news service as a resource for teaching and learning Japanese: An assessment of the comprehension difficulty of Japanese sentence-end expressions","authors":"Hideki Tanaka, T. Kumano, Isao Goto","doi":"10.18653/v1/W15-4411","DOIUrl":"https://doi.org/10.18653/v1/W15-4411","url":null,"abstract":"Japan’s public broadcasting corporation, NHK, launched “News Web Easy” in April 2012. This web service provides users with five simplified news scripts (“easy” Japanese news) on a daily basis. Since its inception, this service has been favorably received both in Japan and overseas. Users particularly appreciate its value as a Japanese learning and teaching resource. In this paper, we discuss this service and its possible contribution to language education. We focus on difficulty levels of sentence-end expressions, compiled from the news, that create ambiguity and problems when rewriting news items. These are analyzed and compared within regular news and News Web Easy, and their difficulty is assessed based on Japanese learners’ reading comprehension levels. Our results revealed that current rewriting of sentence-end expressions in News Web Easy is appropriate. We further identified features of these expressions that contribute to difficulty in comprehension.","PeriodicalId":316430,"journal":{"name":"NLP-TEA@ACL/IJCNLP","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130081857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semi-automatic Generation of Multiple-Choice Tests from Mentions of Semantic Relations","authors":"Renlong Ai, Sebastian Krause, W. Kasper, Feiyu Xu, H. Uszkoreit","doi":"10.18653/v1/W15-4405","DOIUrl":"https://doi.org/10.18653/v1/W15-4405","url":null,"abstract":"We propose a strategy for the semi-automatic generation of learning material for reading-comprehension tests, guided by semantic relations embedded in expository texts. Our approach combines methods from the areas of information extraction and paraphrasing in order to present a language teacher with a set of candidate multiple-choice questions and answers that can be used for verifying a language learner’s reading capabilities. We implemented a web-based prototype showing the feasibility of our approach and carried out a pilot user evaluation that resulted in encouraging feedback but also pointed out aspects of the strategy and prototype implementation that need improvement.","PeriodicalId":316430,"journal":{"name":"NLP-TEA@ACL/IJCNLP","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121755948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chinese Grammatical Error Diagnosis by Conditional Random Fields","authors":"Shih-Hung Wu, Po-Lin Chen, Liang-Pu Chen, Ping-Che Yang, Ren-Dar Yang","doi":"10.18653/v1/W15-4402","DOIUrl":"https://doi.org/10.18653/v1/W15-4402","url":null,"abstract":"This paper reports how to build a Chinese Grammatical Error Diagnosis system based on conditional random fields (CRF). The system can find four types of grammatical errors in learners’ essays. The four types of errors are redundant words, missing words, bad word selection, and disordered words. Our system achieved the best false positive rate in the 2015 NLP-TEA-2 CGED shared task, and also the best precision rate at the three diagnosis levels.","PeriodicalId":316430,"journal":{"name":"NLP-TEA@ACL/IJCNLP","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122969104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Finite State Transducers for Helping Foreign Language Learning","authors":"H. Kaya, Gülşen Eryiğit","doi":"10.18653/v1/W15-4414","DOIUrl":"https://doi.org/10.18653/v1/W15-4414","url":null,"abstract":"Interest in and demand for foreign language learning have increased tremendously along with globalization and freedom of movement in the world. Today, technological developments allow the creation of supportive materials for foreign language learners. However, language acquisition between languages with large typological differences still poses challenges for this area and for the learning task itself. This paper introduces our preliminary study for building an educational application to help foreign language learning between Turkish and English. The paper presents the use of finite state technology for building a Turkish word synthesis system (which allows users to choose word-related features among predefined grammatical affix categories such as tense, modality, and polarity) and a word-level translation system between the languages in focus. The developed system is observed to outperform popular online translation systems for word-level translation in terms of grammatically correct outputs.","PeriodicalId":316430,"journal":{"name":"NLP-TEA@ACL/IJCNLP","volume":"124 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124199233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Automated Scoring Tool for Korean Supply-type Items Based on Semi-Supervised Learning","authors":"Min-Ah Cheon, Hyeong-Won Seo, Jae-Hoon Kim, Eun-Hee Noh, Kyung-Hee Sung, EunYong Lim","doi":"10.18653/v1/W15-4409","DOIUrl":"https://doi.org/10.18653/v1/W15-4409","url":null,"abstract":"Scoring short-answer questions has the disadvantages that grading may take a long time and that scoring may be inconsistent. To alleviate these disadvantages, automated scoring systems are widely used in America and Europe, and in Korea there have been studies on automated scoring. In this paper, we propose an automated scoring tool for Korean short-answer questions using a semi-supervised learning method. The students’ answers are analyzed and processed through natural language processing, and unmarked answers are automatically scored by machine learning methods. Scored answers with high reliability are then added to the training corpus iteratively and incrementally. In a pilot experiment, the proposed system is evaluated on the Korean and social studies subjects of the Programme for National Student Assessment. We have shown that the processing time and the consistency of grades are promisingly improved. Various assessment methods using the proposed tool still have to be developed before it is applied to school testing.","PeriodicalId":316430,"journal":{"name":"NLP-TEA@ACL/IJCNLP","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127593583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chinese Grammatical Error Diagnosis Using Ensemble Learning","authors":"Yang Xiang, Xiaolong Wang, Wenying Han, Qinghua Hong","doi":"10.18653/v1/W15-4415","DOIUrl":"https://doi.org/10.18653/v1/W15-4415","url":null,"abstract":"Automatic grammatical error detection for Chinese has been a big challenge for NLP researchers for a long time, mostly due to the flexible and irregular ways of expression in this language. Strictly speaking, there is no evidence of a series of formal and strict grammar rules for Chinese, especially for spoken Chinese, making it hard for foreigners to master this language. The CFL shared task provides a platform for researchers to develop automatic engines to detect grammatical errors based on a number of manually annotated Chinese spoken sentences. This paper introduces HITSZ’s system for this year’s Chinese grammatical error diagnosis (CGED) task. As in last year’s task, we put our emphasis mostly on the error detection level and error type identification level but did little for the position level. For all our models, we simply use supervised machine learning methods constrained to the given training corpus, with neither heuristic rules nor other reference materials (except for last year’s data). Among the three runs of results we submitted, the one using the ensemble classifier Random Feature Subspace (HITSZ_Run1) gained the best performance, with an optimal F1 of 0.6648 for the detection level and 0.2675 for the identification level.","PeriodicalId":316430,"journal":{"name":"NLP-TEA@ACL/IJCNLP","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129904437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}