Unsupervised Off-Topic Essay Detection Based on Target and Reference Prompts

Xia Li, Qifan Wen, Kongxin Pan
Published in: 2017 13th International Conference on Computational Intelligence and Security (CIS)
Publication date: 2017-12-01
DOI: 10.1109/CIS.2017.00108 (https://doi.org/10.1109/CIS.2017.00108)
Citations: 4

Abstract

Off-topic essay detection is an important part of automatic essay scoring systems. Prior work has focused mainly on the semantic similarity between an essay and its target prompt, without considering the similarities between the essay and reference prompts, although the latter provide additional semantic information for detecting off-topic essays. In this paper, we propose an improved method for calculating on-topic scores to increase the accuracy of off-topic essay detection. Our approach computes the on-topic score from the difference between the essay's similarity to the target prompt and its similarity to the reference prompts, which better separates on-topic essays from off-topic ones. Building on this score, we implement an unsupervised off-topic essay detection system that does not require large-scale training data. Experiments on six datasets from a Kaggle competition show that the method can effectively detect off-topic essays.
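
The abstract does not state the scoring formula, so the following is only a minimal sketch of the idea under loudly stated assumptions: similarity is taken to be cosine similarity over bag-of-words counts, and the reference-prompt similarities are aggregated by their mean. The function names (`bag_of_words`, `cosine_similarity`, `on_topic_score`) and the example texts are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def on_topic_score(essay, target_prompt, reference_prompts):
    """Similarity to the target prompt minus mean similarity to reference prompts.

    Assumption: the mean aggregates the reference similarities; the paper
    may use a different similarity measure or aggregation.
    """
    essay_vec = bag_of_words(essay)
    target_sim = cosine_similarity(essay_vec, bag_of_words(target_prompt))
    if not reference_prompts:
        return target_sim
    ref_sims = [
        cosine_similarity(essay_vec, bag_of_words(prompt))
        for prompt in reference_prompts
    ]
    return target_sim - sum(ref_sims) / len(ref_sims)


# Hypothetical prompts and essays, for illustration only.
target = "Write about the effects computers have on people"
references = [
    "Describe a time when you were patient",
    "Explain why libraries should not censor books",
]
on_essay = "Computers have positive effects on people because computers help people learn"
off_essay = "I was patient when I waited a long time at the library for books"

print(on_topic_score(on_essay, target, references))
print(on_topic_score(off_essay, target, references))
```

In this sketch an essay that resembles a reference prompt more than the target prompt receives a negative score, so a fixed cutoff (for example, zero) could flag off-topic essays without labeled training data. That cutoff is likewise an assumption, not a value reported by the authors.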