BrailleSegNet: A novel methodology for Braille dataset generation and character segmentation

IF 3.4 · CAS Zone 2 (Engineering & Technology) · JCR Q1: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Displays Pub Date : 2025-07-08 DOI:10.1016/j.displa.2025.103145

Shana Sherin M., Shyna A., Jini Raju, Reena Mary George

Displays, volume 90, Article 103145 (journal article, not open access). Source: https://www.sciencedirect.com/science/article/pii/S0141938225001829

Citations: 0

Abstract

Recent research on Braille learning has highlighted the vital role of accurately segmenting Braille letters from Braille documents in improving accessibility and educational opportunities for visually impaired children. A novel methodology, BrailleSegNet, is proposed for Braille dataset generation and Braille character segmentation. It is structured into six phases: Image Acquisition, Image Preprocessing, Fixed-Sized Square Conversion, Rows Extraction, Zonal Operations, and Braille Character Extraction. The first phase acquires images from the Braille-TextStory dataset. Preprocessing then applies grayscale conversion, binary conversion, Gaussian filtering for noise removal, and image inversion. Next, the method standardizes the varying sizes and shapes of Braille dots into fixed-sized squares, extracts the rows that contain Braille characters, and performs zonal operations (vertical dilation, zone identification, full zone conversion, and space zone addition) to segment and recognize Braille characters accurately. The final phase extracts Braille characters from the designated full zones. To address the scarcity of datasets with ground truth that reflects real-world Braille documents, a new dataset, Braille-TextStory, was created as part of this work. It consists of short stories in English, generated with the Braille-PageMap algorithm, and is intended for evaluating Braille character segmentation techniques. The dataset maps English letters to their corresponding Braille images and places them on plain pages while controlling letter spacing, word spacing, and line spacing, so that the integrity and readability of Braille documents are preserved. The proposed segmentation methodology was tested on this dataset and demonstrated a high level of effectiveness and accuracy compared with state-of-the-art methods.
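
The abstract does not give the internals of Braille-PageMap or of the segmentation phases, so the following is only a minimal sketch of the general idea, not the authors' implementation. It renders lowercase text as square Braille dots on a blank binary page with configurable spacing, then recovers text rows and dot-column zones with simple projection profiles. The function names (`render_page`, `extract_rows`, `column_zones`) and all pixel sizes are hypothetical choices made for illustration; only the letter-to-dot table is standard English Braille.

```python
# Standard English Braille dot numbers: dots 1-3 form the left column
# (top to bottom), dots 4-6 form the right column.
DOTS = {
    "a": "1", "b": "12", "c": "14", "d": "145", "e": "15", "f": "124",
    "g": "1245", "h": "125", "i": "24", "j": "245", "k": "13", "l": "123",
    "m": "134", "n": "1345", "o": "135", "p": "1234", "q": "12345",
    "r": "1235", "s": "234", "t": "2345", "u": "136", "v": "1236",
    "w": "2456", "x": "1346", "y": "13456", "z": "1356",
}

# Assumed page geometry in pixels (illustrative values, not from the paper).
DOT = 3          # side length of one square dot
GAP = 2          # gap between neighbouring dots inside a cell
LETTER_SP = 4    # horizontal space between cells
LINE_SP = 10     # vertical space between text lines
CELL_W = 2 * DOT + GAP
CELL_H = 3 * DOT + 2 * GAP


def render_page(lines):
    """Return a binary page (list of pixel rows, dot = 1) for lowercase text."""
    cols = max(len(text) for text in lines)
    width = cols * (CELL_W + LETTER_SP)
    height = len(lines) * (CELL_H + LINE_SP)
    page = [[0] * width for _ in range(height)]
    for r, text in enumerate(lines):
        for c, ch in enumerate(text):
            x0 = c * (CELL_W + LETTER_SP)
            y0 = r * (CELL_H + LINE_SP)
            for d in DOTS.get(ch, ""):          # unknown chars and spaces stay blank
                n = int(d) - 1
                dx = (n // 3) * (DOT + GAP)     # 0 = left column, 1 = right column
                dy = (n % 3) * (DOT + GAP)      # dot row 0, 1 or 2
                for y in range(y0 + dy, y0 + dy + DOT):
                    for x in range(x0 + dx, x0 + dx + DOT):
                        page[y][x] = 1
    return page


def extract_rows(page, max_gap):
    """Group pixel rows containing dots into bands, bridging gaps <= max_gap."""
    bands = []
    for y, row in enumerate(page):
        if not any(row):
            continue
        if bands and y - bands[-1][1] <= max_gap:
            bands[-1][1] = y + 1                # extend the current band
        else:
            bands.append([y, y + 1])            # start a new band
    return [tuple(b) for b in bands]


def column_zones(page, band):
    """Return (start, end) x-ranges of columns that contain a dot within a band."""
    top, bottom = band
    occupied = [any(page[y][x] for y in range(top, bottom))
                for x in range(len(page[0]))]
    zones, start = [], None
    for x, on in enumerate(occupied + [False]):  # sentinel closes the last zone
        if on and start is None:
            start = x
        elif not on and start is not None:
            zones.append((start, x))
            start = None
    return zones


page = render_page(["abc", "ka"])
rows = extract_rows(page, max_gap=DOT + 2 * GAP)
zones = column_zones(page, rows[0])
```

Here `max_gap` is set to `DOT + 2 * GAP`, the largest empty stretch that can occur inside one line (a cell using only the top and bottom dot rows), which works only because `LINE_SP` is chosen larger than that. Turning the dot-column zones into full character cells, including blank cells for spaces, is the part the abstract attributes to full zone conversion and space zone addition, and it is not attempted in this sketch.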


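Source journal

Displays (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 4.60

Self-citation rate: 25.60%

Articles published: 138

Review time: 92 days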

About the journal: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including the display-human interface. Technical papers on practical developments in display technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the displays community. Original research papers solving ergonomics issues at the display-human interface advance the effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human factors engineers new to the field, will also occasionally be featured.