On performance analysis of end-to-end OCR systems of Indic scripts
P. P. Kumar, C. Bhagvati, A. Agarwal. DAR '12, 2012-12-16. DOI: 10.1145/2432553.2432577
Performance evaluation of end-to-end OCR systems for Indic scripts requires matching the Unicode sequences of the OCR output against the ground truth. In the literature, Levenshtein edit distance has been used to compute the error rates of OCR systems, but accuracies are not explicitly reported. In the present work, we propose an accuracy measure based on edit distance and use it in conjunction with error rate to report the performance of an OCR system. We analyze the relationship between accuracy and error rate quantitatively. Our analysis shows that accuracy and error rate are independent of each other, so both are needed to report the complete performance of an OCR system. The proposed approach is applicable to all Indic scripts, and experimental results on several scripts, including Devanagari, Telugu, and Kannada, are shown.
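
The abstract does not give the formulas, so the following is only a minimal sketch of how an edit-distance-based error rate and accuracy might be computed over Unicode code points. The definitions used here are assumptions for illustration, not necessarily the measures proposed in the paper: error rate is taken as the Levenshtein distance divided by the ground-truth length, and accuracy as the number of matched code points in an optimal alignment divided by the ground-truth length. The function names `align_counts`, `error_rate`, and `accuracy` are hypothetical.

```python
def align_counts(truth: str, output: str) -> tuple:
    """Return (edit_distance, matches) for an optimal Levenshtein alignment.

    Strings are compared code point by code point (Python str semantics).
    Among all minimum-cost alignments, the one with the most matches is used.
    """
    # Each cell holds (cost, -matches): min() prefers lower cost, then more matches.
    prev = [(j, 0) for j in range(len(output) + 1)]
    for i, t in enumerate(truth, start=1):
        curr = [(i, 0)]
        for j, o in enumerate(output, start=1):
            if t == o:
                diag = (prev[j - 1][0], prev[j - 1][1] - 1)   # match
            else:
                diag = (prev[j - 1][0] + 1, prev[j - 1][1])   # substitution
            delete = (prev[j][0] + 1, prev[j][1])             # truth symbol missing in output
            insert = (curr[j - 1][0] + 1, curr[j - 1][1])     # extra symbol in output
            curr.append(min(diag, delete, insert))
        prev = curr
    cost, neg_matches = prev[-1]
    return cost, -neg_matches


def error_rate(truth: str, output: str) -> float:
    """Assumed definition: edit distance / ground-truth length."""
    if not truth:
        raise ValueError("ground truth must be non-empty")
    return align_counts(truth, output)[0] / len(truth)


def accuracy(truth: str, output: str) -> float:
    """Assumed definition: matched code points / ground-truth length."""
    if not truth:
        raise ValueError("ground truth must be non-empty")
    return align_counts(truth, output)[1] / len(truth)


# One inserted symbol vs. one substituted symbol: same error rate, different accuracy.
print(error_rate("abcd", "abxcd"), accuracy("abcd", "abxcd"))  # 0.25 1.0
print(error_rate("abcd", "abxd"), accuracy("abcd", "abxd"))    # 0.25 0.75
```

Under these assumed definitions, an insertion raises the error rate without lowering accuracy, while a substitution affects both, which illustrates why one figure cannot simply be derived from the other. A real evaluation would probably also need to fix a Unicode normalization form before comparison, which this sketch does not do.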