Authors: R. P, Shamjiith, R. K
Published in: 2020 International Conference for Emerging Technology (INCET), 2020-06-01
DOI: 10.1109/incet49848.2020.9154142
Multi-Oriented Text Recognition and Classification in Natural Images using MSER
Text recognition is a broad area of research and experimentation within the image processing domain. It is the process by which a system locates the regions of an image in which any kind of text is present and extracts that text. The extracted text must then be converted to a human-readable form and classified into meaningful classes based on its content. The platform used here is MATLAB R2018a. First, the ICDAR 2017 dataset is pre-processed to remove noise. Segmentation is then performed to get a rough idea of the textual content present. The required features are extracted using MSER (maximally stable extremal regions), and the result is processed with the stroke width transform. Geometric features of text are matched against the candidate regions. Finally, all of the processed regions are merged to obtain the exact text, which is extracted with OCR (optical character recognition). Classifying the extracted text into meaningful attributes makes it more useful.
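The geometric matching and region merging steps can be illustrated with a minimal sketch. It is written in Python rather than the MATLAB used in the paper, and it assumes that candidate regions (for example from an MSER detector) are already available as bounding boxes with pixel counts. The names `Region`, `filter_regions`, and `merge_boxes`, and all threshold values, are hypothetical choices for illustration and are not taken from the paper. MSER detection, the stroke width transform, and OCR are not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A candidate region: bounding box plus the number of pixels it covers."""
    x: int       # left edge
    y: int       # top edge
    w: int       # box width
    h: int       # box height
    pixels: int  # pixels belonging to the region inside the box


def filter_regions(regions, max_aspect=3.0, min_aspect=0.1,
                   min_extent=0.2, max_extent=0.9):
    """Keep regions whose geometry is plausible for a character.

    Thresholds are illustrative assumptions, not values from the paper.
    """
    kept = []
    for r in regions:
        aspect = r.w / r.h                # width-to-height ratio
        extent = r.pixels / (r.w * r.h)   # fill ratio of the bounding box
        if min_aspect <= aspect <= max_aspect and min_extent <= extent <= max_extent:
            kept.append(r)
    return kept


def _near(a, b, gap):
    """True if boxes (x0, y0, x1, y1) overlap vertically and lie within
    `gap` pixels of each other horizontally."""
    return (a[0] - gap <= b[2] and b[0] - gap <= a[2]
            and a[1] <= b[3] and b[1] <= a[3])


def _union(a, b):
    """Smallest box containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_boxes(regions, gap=4):
    """Merge neighbouring character boxes into word or line boxes."""
    boxes = [(r.x, r.y, r.x + r.w, r.y + r.h) for r in regions]
    changed = True
    while changed:  # repeat until a full pass performs no merge
        changed = False
        merged = []
        for box in boxes:
            for i, other in enumerate(merged):
                if _near(box, other, gap):
                    merged[i] = _union(box, other)
                    changed = True
                    break
            else:
                merged.append(box)
        boxes = merged
    return sorted(boxes)
```

In a full pipeline, each merged box would be cropped from the image and passed to an OCR engine. The `gap` parameter controls how far apart two characters may be while still being grouped into the same word or line.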