C. Strouthopoulos, N. Papamarkos, A. Atsalakis, C. Chamzas
Locating text in color documents
Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205), 2001-10-07
DOI: 10.1109/ICIP.2001.959233
In complex color documents, text, drawings, and graphics appear in millions of different colors, and text regions are often overlaid on drawings or graphics. A new method is proposed for the automatic detection and extraction of text in mixed-type color documents. The method combines an adaptive color reduction (ACR) technique with a page layout analysis (PLA) approach. The ACR technique is used to obtain the optimal number of colors. The image is then split into separate binary images, one for each principal color. The PLA technique is applied independently to each of these color planes to identify the text regions. In the final stage, a merging procedure combines the text regions derived from the color planes to produce the final document.
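
The data flow described in the abstract (reduce colors, split into one binary plane per color, analyze each plane independently, merge the results) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how ACR or PLA work, so `quantize` is a hypothetical stand-in that uses plain uniform quantization in place of ACR, and `component_boxes` is a hypothetical stand-in that only finds connected-component bounding boxes in place of PLA (it does not decide whether a region is text). All function names and the `levels` parameter are assumptions made for illustration.

```python
from collections import deque


def quantize(pixel, levels=2):
    """Stand-in for ACR (assumption): uniform quantization of each RGB channel."""
    step = 256 // levels
    return tuple(min(c // step, levels - 1) for c in pixel)


def split_into_planes(image, levels=2):
    """Map each reduced color to a binary plane: 1 where the pixel has that color."""
    h, w = len(image), len(image[0])
    planes = {}
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            color = quantize(pixel, levels)
            plane = planes.setdefault(color, [[0] * w for _ in range(h)])
            plane[y][x] = 1
    return planes


def component_boxes(plane):
    """Stand-in for PLA (assumption): bounding boxes (x0, y0, x1, y1) of
    4-connected components in one binary plane. No text classification is done."""
    h, w = len(plane), len(plane[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not plane[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:  # breadth-first flood fill over the component
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and plane[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def locate_regions(image, levels=2):
    """Analyze every color plane independently, then merge the per-plane results."""
    merged = []
    for color, plane in split_into_planes(image, levels).items():
        merged.extend((color, box) for box in component_boxes(plane))
    return sorted(merged)
```

Here `image` is assumed to be a list of rows of `(r, g, b)` tuples. Keeping each plane binary means the per-plane analysis never has to reason about color, which is the point of splitting before analysis; the merge step here is a simple concatenation, whereas the paper's merging procedure is not described in the abstract.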