Page segmentation and classification
Theo Pavlidis, Jiangying Zhou
CVGIP: Graphical Models and Image Processing, vol. 54, no. 6, pp. 484-496, November 1992
DOI: 10.1016/1049-9652(92)90068-9
https://www.sciencedirect.com/science/article/pii/1049965292900689
Citations: 291
Abstract
Page segmentation is the process by which a scanned page is divided into columns and blocks, which are then classified as halftones, graphics, or text. Past techniques have used the fact that such parts form right rectangles for most printed material. This property does not hold when the page is tilted, and the heuristics based on it fail in such cases unless a rather expensive tilt angle estimation is performed. We describe a class of techniques based on smeared run length codes that divide a page into gray and nearly white parts. Segmentation is then performed by finding connected components of either the gray elements or the white ones, the latter forming white streams that partition a page into blocks of printed material. Such techniques appear quite robust in the presence of severe tilt (even greater than 10°) and are also quite fast (about a second a page on a SPARCstation for gray element aggregation). Further classification into text or halftones is based mostly on properties of the correlation across scanlines. For text, the correlation of adjacent scanlines tends to be quite high, but it drops rapidly with distance. For halftones, the correlation of adjacent scanlines is usually well below that for text, but it does not change much with distance.
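The two ideas named in the abstract, run-length smearing and scanline correlation as a function of distance, can be illustrated with a minimal Python sketch. The abstract does not specify the exact smearing rule or correlation measure, so everything below is an assumption: `smear_row` uses the classic rule of filling short white runs between black pixels, `scanline_correlation` averages the Pearson correlation of scanline pairs, and the names `smear_row`, `pearson`, `scanline_correlation`, and `max_gap` are hypothetical.

```python
from math import sqrt


def smear_row(row, max_gap):
    """Fill white runs (0s) of length <= max_gap lying between black pixels (1s).

    Assumed smearing rule; the paper's own variant may differ.
    """
    out = list(row)
    last_black = None
    for i, px in enumerate(row):
        if px:
            if last_black is not None and 0 < i - last_black - 1 <= max_gap:
                for j in range(last_black + 1, i):
                    out[j] = 1
            last_black = i
    return out


def pearson(a, b):
    """Pearson correlation of two equal-length rows, or None if a row is constant."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


def scanline_correlation(image, distance):
    """Mean correlation of all scanline pairs that are `distance` rows apart."""
    values = []
    for top, bottom in zip(image, image[distance:]):
        r = pearson(top, bottom)
        if r is not None:  # skip pairs with an all-white or all-black row
            values.append(r)
    return sum(values) / len(values) if values else 0.0


# Toy "text-like" block: each pattern repeats over three adjacent scanlines.
A = [1, 1, 0, 0, 1, 1, 0, 0]
B = [1, 0, 1, 0, 1, 0, 1, 0]
text_like = [A, A, A, B, B, B]

print(smear_row([1, 0, 0, 1, 0, 0, 0, 0, 1], max_gap=2))
print(scanline_correlation(text_like, 1))  # adjacent scanlines
print(scanline_correlation(text_like, 3))  # scanlines three rows apart
```

On this toy block the mean correlation is high for adjacent scanlines and falls to zero at a distance of three rows, which mirrors the qualitative behavior the abstract reports for text. A halftone region would instead be expected to give a lower but flatter curve. The toy data is purely illustrative and says nothing about the thresholds used in the paper.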