Contrastive representation enhancement and learning for handwritten mathematical expression recognition
Zihao Lin, Jinrong Li, Gang Dai, Tianshui Chen, Shuangping Huang, Jianmin Lin
Pattern Recognition Letters, volume 186, pages 14-20, published 2024-08-30
DOI: 10.1016/j.patrec.2024.08.021
URL: https://www.sciencedirect.com/science/article/pii/S0167865524002538
Impact factor: 3.9; JCR: Q2 (Computer Science, Artificial Intelligence); region 3 (Computer Science)
Citations: 0
Abstract
Handwritten mathematical expression recognition (HMER) is an appealing task because of its wide applications and research challenges. Previous deep learning-based methods used a string decoder to emphasize expression symbol awareness and achieved considerable recognition performance. However, these methods still struggle to recognize handwritten symbols of varying appearance, where large appearance variations make symbol representations ambiguous. To address this, our intuition is to use printed expressions, which have a uniform appearance, as templates for handwritten expressions, alleviating the effects of varying symbol appearance. In this paper, we propose a contrastive learning method in which handwritten symbols with identical semantics are clustered together under the guidance of printed symbols, leading the model to learn more robust symbol semantic representations. Specifically, we propose an anchor generation scheme to obtain printed expression images corresponding to the handwritten expressions. We then propose a contrastive learning objective, termed Semantic-NCE Loss, that pulls together printed and handwritten symbols with identical semantics. Moreover, we employ a string decoder to parse the calibrated semantic representations and output the recognized expression symbols. Experimental results on the benchmark datasets CROHME 14/16/19 demonstrate that our method noticeably improves the recognition accuracy of handwritten expressions and outperforms standard string decoder methods.
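The abstract does not give the formula of the Semantic-NCE Loss, so the following is only an illustrative sketch of a generic InfoNCE-style objective in the same spirit: each handwritten symbol feature is pulled toward printed symbol features that share its label and pushed away from the rest. The function names, the use of cosine similarity, the temperature value, and the toy 2-D features are all assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(hw_feats, hw_labels, pr_feats, pr_labels, temperature=0.1):
    """Generic InfoNCE-style loss (hypothetical stand-in, not the paper's loss).

    For every handwritten feature, printed features with the same symbol
    label act as positives and all other printed features as negatives.
    """
    losses = []
    for h, y in zip(hw_feats, hw_labels):
        # Temperature-scaled similarity to every printed symbol feature.
        scores = [math.exp(cosine(h, p) / temperature) for p in pr_feats]
        positives = [s for s, label in zip(scores, pr_labels) if label == y]
        if not positives:
            continue  # no printed symbol with this label: skip this sample
        losses.append(-math.log(sum(positives) / sum(scores)))
    # Average over the handwritten symbols that had at least one positive.
    return sum(losses) / len(losses) if losses else 0.0


# Toy 2-D features: handwritten symbols lie close to their printed templates.
handwritten = [[1.0, 0.1], [0.1, 1.0]]
printed = [[1.0, 0.0], [0.0, 1.0]]

aligned = contrastive_loss(handwritten, ["x", "y"], printed, ["x", "y"])
swapped = contrastive_loss(handwritten, ["x", "y"], printed, ["y", "x"])
print(aligned < swapped)
```

In this toy setup the loss is small when each handwritten feature already sits next to the printed feature of the same symbol, and large when the printed labels are swapped, which is the behavior a pull-together objective of this kind is meant to reward.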
About the journal:
Pattern Recognition Letters aims at rapid publication of concise articles of broad interest in pattern recognition.
Subject areas include all the current fields of interest represented by the Technical Committees of the International Association for Pattern Recognition, and other developing themes involving learning and recognition.