CNN-BLSTM-CRF网络对学生在线手写作业的语义标注

Amirali Darvishzadeh, T. Stahovich, Amir H. Feghahati, Negin Entezari, Shaghayegh Gharghabi, Reed Kanemaru, C. Shelton
{"title":"CNN-BLSTM-CRF网络对学生在线手写作业的语义标注","authors":"Amirali Darvishzadeh, T. Stahovich, Amir H. Feghahati, Negin Entezari, Shaghayegh Gharghabi, Reed Kanemaru, C. Shelton","doi":"10.1109/ICDAR.2019.00169","DOIUrl":null,"url":null,"abstract":"Automatic semantic labeling of strokes in online handwritten documents is a crucial task for many applications such as diagram interpretation, text recognition, and search. We formulate this task as a stroke classification problem in which each stroke is classified as a cross-out, free body diagram, or text. Separating free body diagram and text in this work is different than the traditional text/non-text separation problem because these two classes contain both text and graphics. The text class includes textual notes, mathematical symbols/equations, and graphics such as arrows that connect other elements. The free body diagram class also contains graphics and various alphanumeric characters and symbols that mark or explain the graphical objects. In this work, we present a novel deep neural network model for classification of strokes in online handwritten documents. There are two input sequences to the network. The first sequence contains the trajectories of the pen strokes while the second contains features of the strokes. Each of these sequences is fed to its own CNN-BLSTM channel to extract features and encode relationships between nearby strokes. The output of the two channels is concatenated and used as the input to a CRF layer that predicts the best sequence of labels for given input sequences. We evaluated our model on a dataset of 1,060 pages written by 132 students in an undergraduate statics course. Our model achieved an overall classification accuracy of 94.70% on this dataset.","PeriodicalId":325437,"journal":{"name":"2019 International Conference on Document Analysis and Recognition (ICDAR)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"CNN-BLSTM-CRF Network for Semantic Labeling of Students' Online Handwritten Assignments\",\"authors\":\"Amirali Darvishzadeh, T. Stahovich, Amir H. Feghahati, Negin Entezari, Shaghayegh Gharghabi, Reed Kanemaru, C. Shelton\",\"doi\":\"10.1109/ICDAR.2019.00169\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Automatic semantic labeling of strokes in online handwritten documents is a crucial task for many applications such as diagram interpretation, text recognition, and search. We formulate this task as a stroke classification problem in which each stroke is classified as a cross-out, free body diagram, or text. Separating free body diagram and text in this work is different than the traditional text/non-text separation problem because these two classes contain both text and graphics. The text class includes textual notes, mathematical symbols/equations, and graphics such as arrows that connect other elements. The free body diagram class also contains graphics and various alphanumeric characters and symbols that mark or explain the graphical objects. In this work, we present a novel deep neural network model for classification of strokes in online handwritten documents. There are two input sequences to the network. The first sequence contains the trajectories of the pen strokes while the second contains features of the strokes. Each of these sequences is fed to its own CNN-BLSTM channel to extract features and encode relationships between nearby strokes. The output of the two channels is concatenated and used as the input to a CRF layer that predicts the best sequence of labels for given input sequences. We evaluated our model on a dataset of 1,060 pages written by 132 students in an undergraduate statics course. Our model achieved an overall classification accuracy of 94.70% on this dataset.\",\"PeriodicalId\":325437,\"journal\":{\"name\":\"2019 International Conference on Document Analysis and Recognition (ICDAR)\",\"volume\":\"26 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Conference on Document Analysis and Recognition (ICDAR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDAR.2019.00169\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on Document Analysis and Recognition (ICDAR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDAR.2019.00169","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

摘要

在线手写文档笔画的自动语义标注是许多应用程序(如图表解释、文本识别和搜索)的关键任务。我们将此任务表述为笔画分类问题,其中每个笔画被分类为划线、自由体图或文本。在这项工作中,分离自由体图和文本与传统的文本/非文本分离问题不同,因为这两个类既包含文本又包含图形。text类包括文本注释、数学符号/方程和图形(如连接其他元素的箭头)。自由体图类还包含图形和各种字母数字字符以及标记或解释图形对象的符号。在这项工作中,我们提出了一种新的深度神经网络模型,用于在线手写文档的笔画分类。网络有两个输入序列。第一个序列包含笔画的轨迹,而第二个序列包含笔画的特征。每个序列都被送入自己的CNN-BLSTM通道,以提取特征并编码附近笔画之间的关系。两个通道的输出被连接起来并用作CRF层的输入,该层预测给定输入序列的最佳标签序列。我们在一个1060页的数据集上评估了我们的模型,该数据集由132名本科生在统计学课程中编写。我们的模型在该数据集上实现了94.70%的总体分类准确率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
CNN-BLSTM-CRF Network for Semantic Labeling of Students' Online Handwritten Assignments
Automatic semantic labeling of strokes in online handwritten documents is a crucial task for many applications such as diagram interpretation, text recognition, and search. We formulate this task as a stroke classification problem in which each stroke is classified as a cross-out, free body diagram, or text. Separating free body diagram and text in this work is different than the traditional text/non-text separation problem because these two classes contain both text and graphics. The text class includes textual notes, mathematical symbols/equations, and graphics such as arrows that connect other elements. The free body diagram class also contains graphics and various alphanumeric characters and symbols that mark or explain the graphical objects. In this work, we present a novel deep neural network model for classification of strokes in online handwritten documents. There are two input sequences to the network. The first sequence contains the trajectories of the pen strokes while the second contains features of the strokes. Each of these sequences is fed to its own CNN-BLSTM channel to extract features and encode relationships between nearby strokes. The output of the two channels is concatenated and used as the input to a CRF layer that predicts the best sequence of labels for given input sequences. We evaluated our model on a dataset of 1,060 pages written by 132 students in an undergraduate statics course. Our model achieved an overall classification accuracy of 94.70% on this dataset.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信