Modeling intra-label dynamics in connectionist temporal classification

Ashkan Sadeghi Lotfabadi, Kamaledin Ghiasi-Shirazi, A. Harati
{"title":"连接主义时态分类中标签内动态建模","authors":"Ashkan Sadeghi Lotfabadi, Kamaledin Ghiasi-Shirazi, A. Harati","doi":"10.1109/ICCKE.2017.8167906","DOIUrl":null,"url":null,"abstract":"Most sequence processing tasks can be cast as a problem of mapping a sequence of observations into a sequence of labels. This is a very difficult problem since the association between input data sequences and output label sequences is not given at the frame level. Recurrent neural networks (RNNs) equipped with connectionist temporal classification (CTC) are among the best tools devised to handle this problem and have been used to achieve state of the art results in many handwritten and speech recognition tasks. The reason that RNNs are used instead of feedforward networks in combination with CTC is that CTC does not model the dynamics of sequences. Specifically, the long short term memory (LSTM) RNN, which is excellent at memorizing information for a long time, is used in combination with CTC to overcome the limitations of CTC in modeling the dynamics of sequences. In this paper, we propose to model each label with a sequence of hidden sub-labels at the CTC level. The proposed framework allows CTC to learn the intra-label relations which transfers part of the load of learning dynamical sequences from RNN to CTC. Our experiments on handwriting recognition tasks show that the proposed method outperforms standard CTC in terms of accuracy.","PeriodicalId":151934,"journal":{"name":"2017 7th International Conference on Computer and Knowledge Engineering (ICCKE)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Modeling intra-label dynamics in connectionist temporal classification\",\"authors\":\"Ashkan Sadeghi Lotfabadi, Kamaledin Ghiasi-Shirazi, A. Harati\",\"doi\":\"10.1109/ICCKE.2017.8167906\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Most sequence processing tasks can be cast as a problem of mapping a sequence of observations into a sequence of labels. This is a very difficult problem since the association between input data sequences and output label sequences is not given at the frame level. Recurrent neural networks (RNNs) equipped with connectionist temporal classification (CTC) are among the best tools devised to handle this problem and have been used to achieve state of the art results in many handwritten and speech recognition tasks. The reason that RNNs are used instead of feedforward networks in combination with CTC is that CTC does not model the dynamics of sequences. Specifically, the long short term memory (LSTM) RNN, which is excellent at memorizing information for a long time, is used in combination with CTC to overcome the limitations of CTC in modeling the dynamics of sequences. In this paper, we propose to model each label with a sequence of hidden sub-labels at the CTC level. The proposed framework allows CTC to learn the intra-label relations which transfers part of the load of learning dynamical sequences from RNN to CTC. 
Our experiments on handwriting recognition tasks show that the proposed method outperforms standard CTC in terms of accuracy.\",\"PeriodicalId\":151934,\"journal\":{\"name\":\"2017 7th International Conference on Computer and Knowledge Engineering (ICCKE)\",\"volume\":\"9 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 7th International Conference on Computer and Knowledge Engineering (ICCKE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCKE.2017.8167906\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 7th International Conference on Computer and Knowledge Engineering (ICCKE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCKE.2017.8167906","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

Most sequence processing tasks can be cast as a problem of mapping a sequence of observations into a sequence of labels. This is a very difficult problem since the association between input data sequences and output label sequences is not given at the frame level. Recurrent neural networks (RNNs) equipped with connectionist temporal classification (CTC) are among the best tools devised to handle this problem and have been used to achieve state of the art results in many handwritten and speech recognition tasks. The reason that RNNs are used instead of feedforward networks in combination with CTC is that CTC does not model the dynamics of sequences. Specifically, the long short term memory (LSTM) RNN, which is excellent at memorizing information for a long time, is used in combination with CTC to overcome the limitations of CTC in modeling the dynamics of sequences. In this paper, we propose to model each label with a sequence of hidden sub-labels at the CTC level. The proposed framework allows CTC to learn the intra-label relations which transfers part of the load of learning dynamical sequences from RNN to CTC. Our experiments on handwriting recognition tasks show that the proposed method outperforms standard CTC in terms of accuracy.
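The abstract does not give implementation details, but the core idea can be illustrated with a minimal sketch: expand each target label into K ordered hidden sub-labels and train a recurrent network with a CTC loss over the expanded alphabet. The code below is an assumption-based approximation of that idea using PyTorch's standard nn.CTCLoss, not the authors' actual formulation; all names and sizes (NUM_LABELS, K, expand_targets, the toy dimensions) are hypothetical.

```python
# Illustrative sketch only: expand each label c into K sub-labels
# (c,1), ..., (c,K) and apply a standard CTC loss over the expanded
# alphabet. This approximates the "hidden sub-labels" idea; the paper's
# method works at the CTC level itself and is not reproduced here.
import torch
import torch.nn as nn

NUM_LABELS = 26      # e.g. characters for handwriting recognition (assumption)
K = 3                # sub-labels per label (hypothetical hyperparameter)
BLANK = 0            # CTC blank index

# Expanded alphabet: blank + NUM_LABELS * K sub-labels.
NUM_CLASSES = 1 + NUM_LABELS * K

def expand_targets(labels):
    """Map each 1-based label c to its K sub-labels, in order."""
    expanded = []
    for c in labels:
        for k in range(K):
            expanded.append(1 + (c - 1) * K + k)
    return expanded

# Toy setup: a bidirectional LSTM over per-frame features, projected onto
# the expanded sub-label alphabet, trained with CTC.
T, N, FEAT = 50, 4, 32                    # frames, batch size, feature dim (toy sizes)
lstm = nn.LSTM(FEAT, 64, bidirectional=True)
proj = nn.Linear(2 * 64, NUM_CLASSES)
ctc = nn.CTCLoss(blank=BLANK, zero_infinity=True)

x = torch.randn(T, N, FEAT)               # dummy input features
targets = [[3, 1, 20], [5, 5], [7, 2, 2, 9], [1]]   # dummy label sequences
expanded = [expand_targets(t) for t in targets]

target_lengths = torch.tensor([len(t) for t in expanded])
flat_targets = torch.tensor([s for t in expanded for s in t])
input_lengths = torch.full((N,), T, dtype=torch.long)

h, _ = lstm(x)                            # (T, N, 128)
log_probs = proj(h).log_softmax(dim=-1)   # (T, N, NUM_CLASSES)
loss = ctc(log_probs, flat_targets, input_lengths, target_lengths)
loss.backward()
```

At decoding time one would collapse each run of sub-labels back to its parent label. Note that this sketch only expands the alphabet; the proposed framework additionally learns the intra-label relations within CTC itself, which a plain alphabet expansion does not capture.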