A New Handwritten Essay Dataset for Automatic Essay Scoring with A New Benchmark
Shiyu Hu, Qichuan Yang, Yibing Yang
Proceedings of the 2022 5th International Conference on Algorithms, Computing and Artificial Intelligence, 2022-12-23
https://doi.org/10.1145/3579654.3579684
Research on algorithms for Automatic Essay Scoring (AES) is currently driven by textual essay-scoring datasets constructed by anonymous school teachers. We propose VisEssay, the first essay-scoring dataset containing handwriting images. VisEssay consists of over 13,000 visual essays originating from more than 25 professional in-service teachers, each of whose scoring accuracy is recorded from his or her scoring history, together with a crowdsourced OCR result for each handwriting image. VisEssay differs from existing AES datasets in three ways: 1) the handwriting images are captured from non-native speakers and cover essay types that complement those in existing datasets, 2) the teachers who score the essays have personal profiles and recorded scoring accuracy, and 3) the corresponding text is checked for consistency. An evaluation of modern AES and text classification algorithms shows that VisEssay is a challenging dataset. To encourage a larger community to develop more generalized educational algorithms, we introduce three novel AES systems together with VisEssay and analyze the results as a new benchmark.