Multi-language Handwritten Numeral Recognition with Convolutional Neural Network
Author: Yihan Wang
Venue: 2021 2nd International Conference on Computing and Data Science (CDS)
Publication date: 2021-01-01
DOI: https://doi.org/10.1109/CDS52072.2021.00082
Compared with handcrafted features, convolutional neural networks (CNNs) are a more effective approach to handwritten numeral recognition. Many handwritten numeral datasets have appeared in recent years, but no collection combines them across languages, and CNNs have not been evaluated on multi-language handwritten numeral recognition. In this paper, we collect and present the largest dataset to date for multi-language handwritten numeral recognition, covering 15 different languages. We also contribute two baseline CNNs and evaluate them on this newly combined dataset. We found that LeNet is more effective than a more complex CNN. We also found that Devanagari and Telugu numerals are the most difficult to distinguish when mixed with numerals from other similar languages.
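
The abstract names LeNet as one of the two baselines but does not give its configuration. As a rough illustration of how small such a network is, the sketch below works out the layer shapes and parameter counts of a LeNet-5-style model. It assumes the classic layout (two 5x5 convolutions with 6 and 16 feature maps, each followed by 2x2 pooling, then dense layers of 120 and 84 units), full connectivity in the second convolution, a 32x32 grayscale input, and 10 output classes. The function names `conv_out` and `lenet5_summary` are hypothetical, and the paper's actual input size and number of classes may differ.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def lenet5_summary(input_size=32, in_channels=1, num_classes=10):
    """Return (name, output_shape, parameter_count) for each layer.

    Assumed layout: classic LeNet-5 with a fully connected second
    convolution. This is not taken from the paper.
    """
    layers = []
    size, channels = input_size, in_channels

    # Two conv + pool stages: 5x5 kernels, no padding, 2x2 pooling.
    for name, out_channels in (("conv1", 6), ("conv2", 16)):
        size = conv_out(size, kernel=5)
        params = (5 * 5 * channels + 1) * out_channels  # weights + biases
        layers.append((name, (out_channels, size, size), params))
        channels = out_channels

        size = conv_out(size, kernel=2, stride=2)
        layers.append((name.replace("conv", "pool"), (channels, size, size), 0))

    # Dense head: flatten, then 120 -> 84 -> num_classes.
    features = channels * size * size
    for name, width in (("fc1", 120), ("fc2", 84), ("out", num_classes)):
        layers.append((name, (width,), (features + 1) * width))
        features = width

    return layers


if __name__ == "__main__":
    summary = lenet5_summary()
    for name, shape, params in summary:
        print(f"{name:6s} {str(shape):14s} {params:6d}")
    print("total parameters:", sum(p for _, _, p in summary))
```

Under these assumptions the model has roughly 62 thousand trainable parameters. Passing a different `num_classes` changes only the final layer, which is where a multi-language label space would enter if digits from different languages were treated as separate classes.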