{"title":"MPOCSR: optical chemical structure recognition based on multi-path Vision Transformer","authors":"Fan Lin, Jianhua Li","doi":"10.1007/s40747-024-01561-6","DOIUrl":null,"url":null,"abstract":"<p>Optical chemical structure recognition (OCSR) is a fundamental and crucial task in the field of chemistry, which aims at transforming intricate chemical structure images into machine-readable formats. Current deep learning-based OCSR methods typically use image feature extractors to extract visual features and employ encoder-decoder architectures for chemical structure recognition. However, the performance of these methods is limited by their image feature extractors and the class imbalance of elements in chemical structure representation. This paper proposes MPOCSR (multi-path optical chemical structure recognition), which introduces the multi-path Vision Transformer (MPViT) and the class-balanced (CB) loss function to address these two challenges. MPOCSR uses MPViT as an image feature extractor, combining the advantages of convolutional neural networks and Vision Transformers. This strategy enables the provision of richer visual information for subsequent decoding processes. Furthermore, MPOCSR incorporates CB loss function to rebalance the loss weights among different categories. For training and validation of our method, we constructed a dataset that includes both Markush and non-Markush structures. Experimental results show that MPOCSR achieves an accuracy of 90.95% on the test set, surpassing other existing methods.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":null,"pages":null},"PeriodicalIF":5.0000,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Complex & Intelligent Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s40747-024-01561-6","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Optical chemical structure recognition (OCSR) is a fundamental and crucial task in the field of chemistry, which aims at transforming intricate chemical structure images into machine-readable formats. Current deep learning-based OCSR methods typically use image feature extractors to extract visual features and employ encoder-decoder architectures for chemical structure recognition. However, the performance of these methods is limited by their image feature extractors and the class imbalance of elements in chemical structure representation. This paper proposes MPOCSR (multi-path optical chemical structure recognition), which introduces the multi-path Vision Transformer (MPViT) and the class-balanced (CB) loss function to address these two challenges. MPOCSR uses MPViT as an image feature extractor, combining the advantages of convolutional neural networks and Vision Transformers. This strategy enables the provision of richer visual information for subsequent decoding processes. Furthermore, MPOCSR incorporates CB loss function to rebalance the loss weights among different categories. For training and validation of our method, we constructed a dataset that includes both Markush and non-Markush structures. Experimental results show that MPOCSR achieves an accuracy of 90.95% on the test set, surpassing other existing methods.
期刊介绍:
Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.