Enhancing Chinese–Braille translation: A two-part approach with token prediction and segmentation labeling

Impact Factor 3.7 · CAS Region 2 (Engineering & Technology) · JCR Q1, Computer Science, Hardware & Architecture
Journal: Displays
DOI: 10.1016/j.displa.2024.102819
Publication date: 2024-09-01 (Journal Article)
Open access: No
URL: https://www.sciencedirect.com/science/article/pii/S0141938224001835
Citations: 0

Abstract

Visually assistive systems play a pivotal role in enhancing the quality of life of visually impaired people. Assistive technologies for the visually impaired have undergone a remarkable transformation with the advent of deep learning and sophisticated assistive devices. In particular, this paper utilizes the latest machine translation models and techniques to accomplish the Chinese–Braille translation task, providing convenience for visually impaired individuals. The traditional end-to-end Chinese–Braille translation approach incorporates Braille dots and Braille word segmentation symbols as tokens within the model’s vocabulary. However, our findings reveal that Braille word segmentation is significantly more complex than Braille dot prediction. The paper proposes a novel Two-Part Loss (TPL) method that treats these tasks distinctly, leading to significant accuracy improvements. To enhance translation performance further, we introduce a BERT-Enhanced Segmentation Transformer (BEST) method. BEST leverages knowledge distillation techniques to transfer knowledge from a pre-trained BERT model to the translation model, mitigating its limitations in word segmentation. Additionally, soft label distillation is employed to improve overall efficacy further. The TPL approach achieves an average BLEU score improvement of 1.16 and 5.42 for Transformer and GPT models on four datasets, respectively. In addition, the work presents a two-stage deep learning-based translation approach that outperforms traditional multi-step and end-to-end methods. The proposed two-stage translation method achieves an average BLEU score improvement of 0.85 across four datasets.
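The abstract describes two loss ideas: a Two-Part Loss that scores Braille dot prediction and word segmentation labeling as separate tasks, and soft-label distillation from a pre-trained BERT teacher. The paper's exact formulation is not reproduced here; the following NumPy sketch is only a hypothetical illustration of how such a combined objective might be composed — the weighting `alpha`, the temperature `T`, and all function names are assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def two_part_loss(dot_logits, dot_targets, seg_logits, seg_targets, alpha=0.5):
    """Hypothetical two-part loss: cross-entropy over Braille dot tokens
    plus a separate cross-entropy over segmentation labels, rather than
    folding segmentation symbols into a single output vocabulary."""
    # dot_logits: (T, V) per-position scores over the Braille dot vocabulary
    dot_probs = softmax(dot_logits)
    dot_ce = -np.log(dot_probs[np.arange(len(dot_targets)), dot_targets]).mean()
    # seg_logits: (T, 2) per-position segment / no-segment decision
    seg_probs = softmax(seg_logits)
    seg_ce = -np.log(seg_probs[np.arange(len(seg_targets)), seg_targets]).mean()
    # alpha is an assumed mixing weight between the two task losses
    return alpha * dot_ce + (1.0 - alpha) * seg_ce

def soft_label_kd(student_logits, teacher_logits, T=2.0):
    """Soft-label distillation term: temperature-scaled KL divergence from
    the teacher's (e.g. BERT's) distribution to the student's."""
    p = softmax(teacher_logits / T)
    q = softmax(student_logits / T)
    return (T * T) * (p * (np.log(p) - np.log(q))).sum(axis=-1).mean()
```

Splitting the objective this way lets the harder segmentation task receive its own gradient signal and its own weight, instead of competing with dot prediction inside one token-level cross-entropy.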

Source journal
Displays (Engineering & Technology — Electronic & Electrical Engineering)
CiteScore: 4.60
Self-citation rate: 25.60%
Articles per year: 138
Review time: 92 days
Aims and scope: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including the display–human interface. Technical papers on practical developments in display technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display–human interface advance effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human factors engineers new to the field, will also occasionally be featured.