Unsupervised Off-Topic Essay Detection Based on Target and Reference Prompts
Authors: Xia Li, Qifan Wen, Kongxin Pan
Published in: 2017 13th International Conference on Computational Intelligence and Security (CIS), 2017-12-01
DOI: 10.1109/CIS.2017.00108 (https://doi.org/10.1109/CIS.2017.00108)
Citations: 4
Abstract
Off-topic essay detection is an important part of automatic essay scoring systems. Prior work has mainly focused on the semantic similarity between the essay and the target prompt, without considering the similarity between the essay and the reference prompts, although the latter can provide additional semantic information for detecting off-topic essays. In this paper, we propose an improved method for calculating on-topic scores to increase the accuracy of off-topic essay detection. Our approach computes the on-topic score from the difference between the essay's similarity to the target prompt and its similarity to the reference prompts, which better distinguishes on-topic essays from off-topic ones. Based on this new on-topic score, we build an unsupervised off-topic essay detection system that does not require large-scale training data. Experiments on six datasets from a Kaggle competition show that our method can effectively detect off-topic essays.
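The scoring idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, how similarities to several reference prompts are combined, or the decision threshold, so everything concrete here is an assumption for illustration only: bag-of-words cosine similarity stands in for the semantic similarity, the reference similarities are averaged, the threshold defaults to 0, and the names `on_topic_score` and `is_off_topic` are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a simple stand-in for a semantic representation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def on_topic_score(essay, target_prompt, reference_prompts):
    """Similarity to the target prompt minus mean similarity to reference prompts.

    Averaging over the reference prompts is an assumption of this sketch.
    """
    essay_vec = bag_of_words(essay)
    target_sim = cosine(essay_vec, bag_of_words(target_prompt))
    ref_sims = [cosine(essay_vec, bag_of_words(p)) for p in reference_prompts]
    ref_mean = sum(ref_sims) / len(ref_sims) if ref_sims else 0.0
    return target_sim - ref_mean


def is_off_topic(essay, target_prompt, reference_prompts, threshold=0.0):
    """Flag the essay when its on-topic score does not exceed the (assumed) threshold."""
    return on_topic_score(essay, target_prompt, reference_prompts) <= threshold


if __name__ == "__main__":
    # Made-up prompts and essays, not taken from the paper's datasets.
    target = "Write about the effects computers have on people"
    references = [
        "Describe the mood created by the author in the memoir",
        "Write about censorship in libraries",
    ]
    essay = "Computers have positive effects on people because they help people learn"
    print(on_topic_score(essay, target, references))
    print(is_off_topic(essay, target, references))
```

Because the score is relative to the reference prompts, an essay that happens to share generic vocabulary with every prompt is not rewarded for it; only similarity that is specific to the target prompt raises the score. No labeled training essays are needed, which matches the unsupervised setting the abstract describes.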