Text Line Segmentation Based on Matched Filtering and Top-Down Grouping for Handwritten Documents
Authors: Youbao Tang, Xiangqian Wu, Wei Bu
Venue: 2014 11th IAPR International Workshop on Document Analysis Systems
Published: 2014-04-07
DOI: 10.1109/DAS.2014.14 (https://doi.org/10.1109/DAS.2014.14)
This paper presents a novel text line segmentation method for handwritten documents based on matched filtering and top-down grouping. The proposed method consists of three steps. First, the foreground pixel density (FPD) of the handwritten document image (HDI) is estimated and used to set the size of the filter, which is the convolution of a band-shaped filter and an isotropic Laplacian of Gaussian (LoG) filter. Second, the centers of the text lines (CTLs) are extracted by applying filtering, binarization, thinning, and top-down grouping to the HDI. Finally, the overlapping connected components (OCCs), which span multiple text lines, are separated, and all OCCs are then assigned the label of a CTL by the nearest-neighbor principle. The method is tested on two public databases, and the experimental results show that it outperforms state-of-the-art text line segmentation approaches on both.
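The first and third steps can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the band-shaped filter is taken to be an all-ones rectangle, the rule that maps FPD to filter size is left to the caller, and the nearest-neighbor distance is measured from a component's centroid to the closest pixel of each center line. All function names are hypothetical.

```python
from math import exp, hypot, pi


def foreground_pixel_density(image):
    """Fraction of foreground pixels (value 1) in a binary image (list of rows)."""
    total = sum(len(row) for row in image)
    return sum(map(sum, image)) / total if total else 0.0


def log_kernel(sigma, radius):
    """Isotropic Laplacian of Gaussian sampled on a (2*radius+1)^2 grid."""
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            q = (x * x + y * y) / (2.0 * sigma * sigma)
            row.append(-(1.0 - q) * exp(-q) / (pi * sigma ** 4))
        kernel.append(row)
    return kernel


def convolve_full(a, b):
    """Full 2-D convolution of two kernels given as lists of rows."""
    ha, wa, hb, wb = len(a), len(a[0]), len(b), len(b[0])
    out = [[0.0] * (wa + wb - 1) for _ in range(ha + hb - 1)]
    for i in range(ha):
        for j in range(wa):
            for u in range(hb):
                for v in range(wb):
                    out[i + u][j + v] += a[i][j] * b[u][v]
    return out


def build_filter(band_height, band_width, sigma, radius):
    """Band filter (ASSUMED: all-ones rectangle) convolved with a LoG kernel.

    The sizes are plain parameters here; how FPD determines them is not
    stated in the abstract.
    """
    band = [[1.0] * band_width for _ in range(band_height)]
    return convolve_full(band, log_kernel(sigma, radius))


def assign_to_nearest_line(components, center_lines):
    """Label each non-empty component with the index of its nearest center line.

    components and center_lines are lists of pixel lists [(row, col), ...].
    ASSUMED distance: component centroid to the closest center-line pixel.
    """
    labels = []
    for pixels in components:
        cy = sum(r for r, _ in pixels) / len(pixels)
        cx = sum(c for _, c in pixels) / len(pixels)
        labels.append(min(
            range(len(center_lines)),
            key=lambda i: min(hypot(cy - r, cx - c) for r, c in center_lines[i]),
        ))
    return labels
```

A caller would estimate the density with `foreground_pixel_density`, choose the band size and `sigma` from it by some rule of its own, build the filter, and, once center lines and components are available, call `assign_to_nearest_line`. The filtering, binarization, thinning, grouping, and OCC separation stages are omitted here.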