Building an Optimal Convolutional Neural Network Model for Solving Complex Character Recognition Problems

Q4 Materials Science

Radioelektronika, Nanosistemy, Informacionnye Tehnologii. Pub Date: 2023-02-20. DOI: 10.17587/it.29.84-90

A. E. Trubin, A. V. Batishchev, A. N. Aleksahin, A. E. Zubanova, A. Morozov


Citations: 0

Abstract



The study aims to develop a lighter convolutional neural network architecture that handles the narrowly focused task of recognizing complex characters better than large, well-known models. The source data are Japanese characters from the two syllabaries, hiragana and katakana, which are among the most complex: their written forms have many distinguishing features and many characters resemble one another, which greatly complicates classification and recognition. The article presents the authors' own convolutional neural network model, consisting of four convolutional layers, three subsampling (pooling) layers, and three dropout layers. The model was compared with EfficientNetB0, one of the most popular neural network models, in terms of architecture and results on the same data. The authors' model was implemented with the classic Keras + TensorFlow stack, since these libraries provide convenient tools for machine learning work. The result of the research is a technology for fast and accurate recognition of complex symbols based on a convolutional neural network, which can serve as the basis for a computer vision software product.
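To make the "lighter architecture" claim concrete, the sketch below counts parameters for one possible layout with four convolutional, three pooling, and three dropout layers. The abstract does not give the input size, filter counts, kernel size, dropout rates, or number of classes, so every number here is an assumption chosen for illustration (64x64 grayscale input, 3x3 kernels with "same" padding, 92 classes for 46 basic hiragana plus 46 basic katakana). It is not the authors' configuration. In Keras these layers would typically correspond to `Conv2D`, `MaxPooling2D`, and `Dropout`; the sketch uses plain Python so it runs without any libraries.

```python
# All sizes below are illustrative assumptions, not values from the paper.
INPUT_SHAPE = (64, 64, 1)  # height, width, channels (assumed grayscale input)
NUM_CLASSES = 92           # assumed: 46 basic hiragana + 46 basic katakana

# (kind, argument): conv -> number of filters, pool -> window size, dropout -> rate
LAYERS = [
    ("conv", 32), ("pool", 2), ("dropout", 0.25),
    ("conv", 64), ("pool", 2), ("dropout", 0.25),
    ("conv", 128), ("conv", 128), ("pool", 2), ("dropout", 0.5),
]


def summarize(input_shape, layers, num_classes, kernel=3):
    """Return (rows, total_params) for a plain CNN.

    Convolutions are assumed to use 'same' padding and stride 1, so they
    change only the channel count. The head is a single dense layer applied
    to the flattened feature map.
    """
    h, w, c = input_shape
    rows, total = [], 0
    for kind, arg in layers:
        params = 0
        if kind == "conv":
            params = (kernel * kernel * c + 1) * arg  # weights plus one bias per filter
            c = arg
        elif kind == "pool":
            h, w = h // arg, w // arg  # non-overlapping pooling window
        elif kind != "dropout":        # dropout has no parameters and keeps the shape
            raise ValueError(f"unknown layer kind: {kind}")
        rows.append((kind, (h, w, c), params))
        total += params
    dense = (h * w * c + 1) * num_classes  # flatten + dense classifier
    rows.append(("dense", (num_classes,), dense))
    return rows, total + dense


rows, total = summarize(INPUT_SHAPE, LAYERS, NUM_CLASSES)
for kind, shape, params in rows:
    print(f"{kind:8s} {str(shape):16s} {params:>9,d}")
print(f"total parameters: {total:,d}")  # 994,012 under these assumptions
```

Under these assumed sizes the network has just under one million trainable parameters, compared with the roughly 5.3 million commonly cited for EfficientNetB0 with its ImageNet classification head. The actual figures for the authors' model depend on choices the abstract does not state.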


Source journal

Radioelektronika, Nanosistemy, Informacionnye Tehnologii (Materials Science: Materials Science (miscellaneous))

CiteScore

0.60

Self-citation rate

0.00%

Number of publications

About the journal: The journal "Radioelectronics. Nanosystems. Information Technologies" (abbreviated RENSIT) publishes original, previously unpublished articles, reviews, and brief reports on topical problems in radioelectronics (including biomedical) and the fundamentals of information, nano- and biotechnologies and adjacent areas of physics and mathematics. The authors of the journal are academicians, corresponding members, and foreign members of the Russian Academy of Natural Sciences (RANS) and their colleagues, as well as other Russian and foreign authors on the recommendation of RANS members. The author can obtain this recommendation before submitting the article, or after its arrival from a member of the editorial board or another RANS member who gave an opinion on the article at the editor's request. The editors accept articles in both Russian and English. Articles are internally peer reviewed (double-blind) by members of the Editorial Board; some articles undergo external review if necessary. The journal is intended for researchers, graduate students, senior physics students, and teachers. It is published twice a year (two issues).