Pranav P Nair, Ajay James, Philomina Simon, Bhagyasree P V
Indonesian Journal of Electrical Engineering and Informatics, published 2023-09-24. DOI: https://doi.org/10.52549/ijeei.v11i3.4829
Malayalam Handwritten Character Recognition using CNN Architecture
Optical character recognition (OCR) is the process of encoding an input text image into a machine-readable format. Because each language has its own characteristics, it is difficult to develop a universal method that achieves high accuracy for all languages: a method that produces good results for one language may not produce the same results for another. OCR for printed characters is easier than for handwritten characters because printed characters are uniform. While conventional methods have struggled to improve on existing results, Convolutional Neural Networks (CNNs) have shown drastic improvement in the classification and recognition of characters in other languages. However, there is no OCR model using a CNN for Malayalam characters. Our proposed system uses a new CNN architecture for feature extraction and a softmax layer for character classification. This eliminates the manual feature design used in conventional methods. The P-ARTS Kayyezhuthu dataset is used to train the CNN; an accuracy of 99.75% is obtained on the testing dataset, while a collection of 40 real-time input images yielded an accuracy of 95%.
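The abstract does not describe the layers of the proposed architecture, so the following is only a toy sketch of the general pattern it names: convolutional feature extraction followed by a softmax classifier. The 4x4 image, the two 2x2 kernels, the dense weights, and all function names are hypothetical values chosen for illustration; in a real CNN the kernels and weights are learned from training data such as the P-ARTS Kayyezhuthu dataset.

```python
import math


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Zero out negative activations."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def global_average(feature_map):
    """Reduce a feature map to a single number (global average pooling)."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(image, kernels, weights, biases):
    """Extract one feature per kernel, apply a dense layer, then softmax."""
    features = [global_average(relu(conv2d_valid(image, k))) for k in kernels]
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(logits)
    return probs.index(max(probs)), probs


def accuracy(correct, total):
    """Fraction of correctly classified samples."""
    return correct / total


# Hypothetical toy input: a 4x4 image containing one vertical stroke.
image = [[0, 1, 0, 0],
         [0, 1, 0, 0],
         [0, 1, 0, 0],
         [0, 1, 0, 0]]
kernels = [
    [[-1, 1], [-1, 1]],   # responds to vertical edges
    [[-1, -1], [1, 1]],   # responds to horizontal edges
]
weights = [[3.0, 0.0],    # class 0: "vertical stroke" (hypothetical)
           [0.0, 3.0]]    # class 1: "horizontal stroke" (hypothetical)
biases = [0.0, 0.0]

label, probs = classify(image, kernels, weights, biases)
print(label, [round(p, 3) for p in probs])
```

Because the features are computed by the convolution kernels instead of being designed by hand, only the kernels and weights need to change to recognize a different character set. As a check on the reported figures, `accuracy(38, 40)` evaluates to 0.95, so 95% on 40 real-time images corresponds to 38 correct predictions.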
About the journal:
The journal publishes original papers in the field of electrical, computer and informatics engineering, covering, but not limited to, the following scope: Electronics: Electronic Materials, Microelectronic System, Design and Implementation of Application Specific Integrated Circuits (ASIC), VLSI Design, System-on-a-Chip (SoC) and Electronic Instrumentation Using CAD Tools, Digital Signal & Data Processing, Biomedical Transducers and Instrumentation. Electrical: Electrical Engineering Materials, Electric Power Generation, Transmission and Distribution, Power Electronics, Power Quality, Power Economic, FACTS, Renewable Energy, Electric Traction. Telecommunication: Modulation and Signal Processing for Telecommunication, Information Theory and Coding, Antenna and Wave Propagation, Wireless and Mobile Communications, Radio Communication, Communication Electronics and Microwave, Radar Imaging. Control: Optimal, Robust and Adaptive Controls, Non Linear and Stochastic Controls, Modeling and Identification, Robotics, Image Based Control, Hybrid and Switching Control, Process Optimization and Scheduling, Control and Intelligent Systems. Computer and Informatics: Computer Architecture, Parallel and Distributed Computer, Pervasive Computing, Computer Network, Embedded System, Human-Computer Interaction, Virtual/Augmented Reality, Computer Security, Software Engineering (Software: Lifecycle, Management, Engineering Process, Engineering Tools and Methods), Programming (Programming Methodology and Paradigm), Data Engineering (Data and Knowledge level Modeling, Information Management (DB) practices, Knowledge Based Management System, Knowledge Discovery in Data).