Displays: Latest Articles

Nighttime large-field video image change detection based on adaptive superpixel reconstruction and multi-scale singular value decomposition fusion
IF 3.7 · CAS Q2 · Engineering & Technology
Displays Pub Date: 2024-09-17 DOI: 10.1016/j.displa.2024.102840
Tianyu Ren, Jia He, Zhenhong Jia, Xiaohui Huang, Sensen Song, Jiajia Wang, Gang Zhou, Fei Shi, Ming Lv
{"title":"Nighttime large-field video image change detection based on adaptive superpixel reconstruction and multi-scale singular value decomposition fusion","authors":"Tianyu Ren ,&nbsp;Jia He ,&nbsp;Zhenhong Jia ,&nbsp;Xiaohui Huang ,&nbsp;Sensen Song ,&nbsp;Jiajia Wang ,&nbsp;Gang Zhou ,&nbsp;Fei Shi ,&nbsp;Ming Lv","doi":"10.1016/j.displa.2024.102840","DOIUrl":"10.1016/j.displa.2024.102840","url":null,"abstract":"<div><p>With the development of technology and the needs of social governance, surveillance equipment has been widely used. It is very mature to detect the change of surveillance video images in conventional scenes through video image change detection algorithms. However, in the large field of view environment at night, there are complex random noise and low signal-to-noise ratio in surveillance video images, which makes it difficult for people to find small moving targets. To this end, we propose a new method for nighttime large-field surveillance video image change detection based on adaptive superpixel reconstruction and multi-scale singular value decomposition fusion. The proposed method consists of two parts. On the one hand, an adaptive superpixel reconstruction method is used to reconstruct the two denoised difference images by selecting different segmentation parameters, and the edge information of the two reconstructed difference images is significantly enhanced. On the other hand, a multi-scale singular value decomposition fusion method is used to fuse the two difference images. The multi-scale singular value decomposition fusion obtains a robust difference image by selecting fusion rules at different scales and using the complementary information of different difference images, and the Fuzzy c-means (FCM) clustering algorithm is used to obtain the final changed image. Experimental results on a self-built nighttime large-field video image dataset with two resolutions show that the proposed method is superior to other algorithms in terms of detection accuracy and robustness.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102840"},"PeriodicalIF":3.7,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142239423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
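The fusion-and-clustering stage lends itself to a compact illustration. Here is a minimal numpy sketch, assuming a single 2x2 block-SVD level, a shared singular basis taken from the first difference image, a max-absolute-coefficient fusion rule, and a tiny two-class fuzzy c-means; the paper's actual decomposition depth, fusion rules, and parameters are not given in the abstract.

```python
import numpy as np

def patch_matrix(img, block=2):
    """Gather non-overlapping block x block patches as columns."""
    h = img.shape[0] - img.shape[0] % block
    w = img.shape[1] - img.shape[1] % block
    X = (img[:h, :w]
         .reshape(h // block, block, w // block, block)
         .transpose(1, 3, 0, 2)
         .reshape(block * block, -1))        # each column is one patch
    return X, (h // block, w // block)

def fuse_difference_images(d1, d2, block=2):
    """One SVD level: project both patch matrices onto the singular basis
    of the first image, fuse coefficients by the max-absolute rule, invert."""
    X1, grid = patch_matrix(d1, block)
    X2, _ = patch_matrix(d2, block)
    U, _, _ = np.linalg.svd(X1, full_matrices=False)
    c1, c2 = U.T @ X1, U.T @ X2
    X = U @ np.where(np.abs(c1) >= np.abs(c2), c1, c2)
    return (X.reshape(block, block, *grid).transpose(2, 0, 3, 1)
             .reshape(grid[0] * block, grid[1] * block))

def fcm_two_class(img, m=2.0, iters=50):
    """Tiny fuzzy c-means on pixel intensities, two clusters (changed/unchanged)."""
    v = img.ravel().astype(float)
    centers = np.array([v.min(), v.max()])
    for _ in range(iters):
        dist = np.abs(v[:, None] - centers[None, :]) + 1e-12
        u = 1.0 / dist ** (2.0 / (m - 1.0))   # membership weights
        u /= u.sum(axis=1, keepdims=True)
        centers = (u ** m * v[:, None]).sum(0) / (u ** m).sum(0)
    labels = u.argmax(axis=1)
    if centers[0] > centers[1]:               # make cluster 1 the "changed" one
        labels = 1 - labels
    return labels.reshape(img.shape)

d1, d2 = np.random.rand(64, 64), np.random.rand(64, 64)  # stand-in difference images
change_map = fcm_two_class(fuse_difference_images(d1, d2))
```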
Statistical techniques for digital pre-processing of computed tomography medical images: A current review
IF 3.7 · CAS Q2 · Engineering & Technology
Displays Pub Date: 2024-09-14 DOI: 10.1016/j.displa.2024.102835
Oscar Valbuena Prada, Miguel Ángel Vera, Guillermo Ramirez, Ricardo Barrientos Rojel, David Mojica Maldonado
{"title":"Statistical techniques for digital pre-processing of computed tomography medical images: A current review","authors":"Oscar Valbuena Prada ,&nbsp;Miguel Ángel Vera ,&nbsp;Guillermo Ramirez ,&nbsp;Ricardo Barrientos Rojel ,&nbsp;David Mojica Maldonado","doi":"10.1016/j.displa.2024.102835","DOIUrl":"10.1016/j.displa.2024.102835","url":null,"abstract":"<div><p>Digital pre-processing is a vital stage in the processing of the information contained in multilayer computed tomography images. The purpose of digital pre-processing is the minimization of the effect of image imperfections, which are associated with the noise and artifacts that affect the quality of the images during acquisition, storage, and/or transmission processes. Likewise, there is a wide variety of techniques in specialized literature that address the problem of imperfections, noise, and artifacts present in images. In this study, a comprehensive review of specialized literature on statistical techniques used in the pre-processing of digital images was conducted. The review summarizes updated information from 56 studies conducted over the last 5 years (2018–2022) on the main statistical techniques used for the digital processing of medical images obtained under different modalities, with a special focus on computed tomography. Additionally, the most often used statistical metrics for measuring the performance of pre-processing techniques in the field of medical imaging are described. The most often used pre-processing techniques in the field of medical imaging were found to be statistical filters based on median, neural networks, Gaussian filters based on deep learning, mean, and machine learning applied to multilayer computed tomography images and magnetic resonance images of the brain, abdomen, lungs, and heart, among other organs of the body.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102835"},"PeriodicalIF":3.7,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0141938224001999/pdfft?md5=e9830193894fc73d2f7b5d4f1a47d064&pid=1-s2.0-S0141938224001999-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142232852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
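The classical filters the review names most often (median, Gaussian, mean) are available directly in SciPy. A minimal sketch on a synthetic stand-in for a CT slice; the filter sizes and the PSNR quality metric are illustrative choices, not recommendations drawn from the review:

```python
import numpy as np
from scipy import ndimage

def preprocess_ct_slice(slice_hu, median_size=3, gaussian_sigma=1.0):
    """Median filter (good against impulsive noise) followed by a mild
    Gaussian filter (suppresses residual high-frequency noise)."""
    denoised = ndimage.median_filter(slice_hu, size=median_size)
    return ndimage.gaussian_filter(denoised, sigma=gaussian_sigma)

def psnr(reference, test):
    """Peak signal-to-noise ratio, one of the standard evaluation metrics."""
    mse = np.mean((reference.astype(float) - test.astype(float)) ** 2)
    peak = float(reference.max() - reference.min())   # dynamic range as peak
    return 10 * np.log10(peak ** 2 / mse)

noisy = 100 + np.random.normal(0, 20, (512, 512))  # stand-in slice in HU
filtered = preprocess_ct_slice(noisy)
```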
Human body features recognition based adaptive user interface for extra-large touch screens
IF 3.7 · CAS Q2 · Engineering & Technology
Displays Pub Date: 2024-09-13 DOI: 10.1016/j.displa.2024.102838
Junfeng Wang, Jialin Li
{"title":"Human body features recognition based adaptive user interface for extra-large touch screens","authors":"Junfeng Wang,&nbsp;Jialin Li","doi":"10.1016/j.displa.2024.102838","DOIUrl":"10.1016/j.displa.2024.102838","url":null,"abstract":"<div><p>With the widespread use of extra-large touch screens (eLTS) in various settings such as work and education, interaction efficiency and user experience have garnered increased attention. The current user interface (UI) layouts of eLTS are primarily categorized into two modes: fixed position and manual adjustment. The fixed UI layout fails to accommodate users of different heights and sizes, while the manual adjustment mode involves cumbersome steps and lacks sufficient flexibility. This study proposes an adaptive UI for eLTS. The optimal operational area on the eLTS is determined based on users’ height, eye level, arm length, face orientation, and distance from the screen. The eLTS menu is then positioned and displayed within this optimal area. Simulations involving users of various heights (P1 female, P50 male and female, and P99 male) were conducted to evaluate fatigue using the rapid upper limb assessment (RULA) method. The results indicate that the proposed adaptive UI significantly reduces user fatigue.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102838"},"PeriodicalIF":3.7,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142239514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
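As a rough illustration of how such an adaptive layout could be driven by body parameters, the sketch below places a menu anchor from height, arm length, and screen distance. The 0.82 x height shoulder approximation and the reach geometry are generic anthropometric assumptions for illustration, not the model from the paper:

```python
from dataclasses import dataclass
import math

@dataclass
class User:
    height_cm: float      # standing height
    arm_length_cm: float  # shoulder-to-fingertip reach
    distance_cm: float    # horizontal distance to the screen
    x_position_cm: float  # lateral position along the screen

def optimal_menu_anchor(user: User, screen_height_cm: float):
    """Return an (x, y) anchor, in screen coordinates, for the adaptive menu.
    Shoulder height is approximated as 0.82 * body height (a common
    anthropometric rule of thumb, assumed here); the vertical reach left
    after closing the gap to the screen bounds the anchor height."""
    shoulder_h = 0.82 * user.height_cm
    remaining = max(user.arm_length_cm ** 2 - user.distance_cm ** 2, 0.0)
    reach = math.sqrt(remaining)
    y = min(shoulder_h + 0.5 * reach, screen_height_cm)  # clamp to screen top
    return user.x_position_cm, y

print(optimal_menu_anchor(User(163, 66, 30, 120), screen_height_cm=200))
```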
Underwater target detection network based on differential routing assistance and bilateral attention synergy
IF 3.7 · CAS Q2 · Engineering & Technology
Displays Pub Date: 2024-09-13 DOI: 10.1016/j.displa.2024.102836
Zhiwei Chen, Suting Chen
{"title":"Underwater target detection network based on differential routing assistance and bilateral attention synergy","authors":"Zhiwei Chen,&nbsp;Suting Chen","doi":"10.1016/j.displa.2024.102836","DOIUrl":"10.1016/j.displa.2024.102836","url":null,"abstract":"<div><p>Underwater target detection technology holds significant importance in both military and civilian applications of ocean exploration. However, due to the complex underwater environment, most targets are small and often obscured, leading to low detection accuracy and missed detections in existing target detection algorithms. To address these issues, we propose an underwater target detection algorithm that balances accuracy and speed. Specifically, we first propose the Differentiable Routing Assistance Sampling Network named (DRASN), where differentiable routing participates in training the sampling network but not in the inference process. It replaces the down-sampling network composed of Maxpool and convolution fusion in the backbone network, reducing the feature loss of small and occluded targets. Secondly, we proposed the Bilateral Attention Synergistic Network (BASN), which establishes connections between the backbone and neck with fine-grained information from both channel and spatial perspectives, thereby further enhancing the detection capability of targets in complex backgrounds. Finally, considering the characteristics of real frames, we proposed a scale approximation auxiliary loss function named (Aux-Loss) and modify the allocation strategy of positive and negative samples to enable the network to selectively learn high-quality anchors, thereby improving the convergence capability of the network. Compared with mainstream algorithms, our detection network achieves 82.9% in [email protected] on the URPC2021 dataset, which is 9.5%, 5.7%, and 2.8% higher than YOLOv8s, RT-DETR, and SDBB respectively. The speed reaches 75 FPS and meets the requirements for real-time performance.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102836"},"PeriodicalIF":3.7,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142239516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
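The abstract does not specify the internals of BASN's channel and spatial branches. As a generic stand-in for channel-plus-spatial attention of this kind, here is a CBAM-style block in PyTorch; it shows the flavor of the mechanism, not the paper's actual design:

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """Channel gating followed by spatial gating (CBAM-style)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid())

    def forward(self, x):
        x = x * self.channel_mlp(x)                 # reweight channels
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial_conv(pooled)        # reweight positions

out = ChannelSpatialAttention(64)(torch.randn(1, 64, 80, 80))
```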
Evaluation of SeeColors filters for color vision correction and comparative analysis with EnChroma glasses
IF 3.7 · CAS Q2 · Engineering & Technology
Displays Pub Date: 2024-09-10 DOI: 10.1016/j.displa.2024.102831
Fangli Fan, Yifeng Wu, Danyan Tang, Yujie Shu, Zhen Deng, Hai Xin, Xiqiang Liu
{"title":"Evaluation of SeeColors filters for color vision correction and comparative analysis with EnChroma glasses","authors":"Fangli Fan ,&nbsp;Yifeng Wu ,&nbsp;Danyan Tang ,&nbsp;Yujie Shu ,&nbsp;Zhen Deng ,&nbsp;Hai Xin ,&nbsp;Xiqiang Liu","doi":"10.1016/j.displa.2024.102831","DOIUrl":"10.1016/j.displa.2024.102831","url":null,"abstract":"<div><p>The purpose of this study was to evaluate the ability of SeeColors filters to enhance color vision test results using the electronic version of the Farnsworth D-15 (E-D15) and Ishihara tests on a Samsung TV, and to compare their effectiveness with EnChroma glasses. Sixty-six subjects with congenital red–green color vision deficiency were tested. For both protan and deutan groups, the confusion angle shifted to negative values, with SeeColors filters exhibiting a greater effect than EnChroma glasses. In the deutan group, the TES, S-index, and C-index of the E-D15 test decreased, with the SeeColors D30 filter having a more pronounced effect than EnChroma deutan glasses. In the protan group, while EnChroma protan glasses tended to decrease the TES, S-index, and C-index, the SeeColors P30 filter increased them. For both groups, TS and TN of the Ishihara tests improved, with the SeeColors D30 filter demonstrating a stronger effect than EnChroma deutan glasses. The study concluded that both the SeeColors D30 filter and EnChroma deutan glasses were beneficial for deutans, albeit the SeeColors D30 filter was superior. In protans, neither the SeeColors P30 filter nor EnChroma protan glasses showed significant effectiveness, but the SeeColors P30 filter did improve performance in the pseudoisochromatic task.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102831"},"PeriodicalIF":3.7,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0141938224001951/pdfft?md5=ac0d10063902d0bded9a2e66cc83aeea&pid=1-s2.0-S0141938224001951-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142162122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
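The confusion angle reported here is the orientation of the dominant axis of a subject's cap-to-cap color-difference vectors on the D-15. Below is a minimal numpy sketch in the spirit of the standard moment-of-inertia scoring; the cap color-coordinate table is the tester's lookup data (not included), and the published scoring constants for TES, S-index, and C-index are not reproduced here:

```python
import numpy as np

def confusion_angle(cap_uv):
    """cap_uv: (n, 2) CIELUV (u*, v*) coordinates of the caps, in the order
    the subject arranged them. Returns the angle, in degrees, of the
    principal axis of the cap-to-cap difference vectors."""
    diffs = np.diff(np.asarray(cap_uv, float), axis=0)
    second_moment = diffs.T @ diffs           # 2x2 moment matrix of the vectors
    eigvals, eigvecs = np.linalg.eigh(second_moment)
    major = eigvecs[:, np.argmax(eigvals)]    # dominant confusion direction
    angle = np.degrees(np.arctan2(major[1], major[0]))
    if angle <= -90:                          # fold into (-90, 90]:
        angle += 180                          # an axis has no sign
    elif angle > 90:
        angle -= 180
    return angle
```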
Vehicle trajectory extraction and integration from multi-direction video on urban intersection
IF 3.7 · CAS Q2 · Engineering & Technology
Displays Pub Date: 2024-09-07 DOI: 10.1016/j.displa.2024.102834
Jinjun Tang, Weihe Wang
{"title":"Vehicle trajectory extraction and integration from multi-direction video on urban intersection","authors":"Jinjun Tang,&nbsp;Weihe Wang","doi":"10.1016/j.displa.2024.102834","DOIUrl":"10.1016/j.displa.2024.102834","url":null,"abstract":"<div><p>With the gradual maturity of computer vision technology, using intersection surveillance videos for vehicle trajectory extraction has become a popular method to analyze vehicle conflicts and safety in urban intersection. However, many intersection surveillance videos have blind spots, failing to fully cover the entire intersection. Vehicles may also obstruct each other, resulting in incomplete vehicle trajectories. The angle of surveillance videos can also lead to inaccurate trajectory extraction. In response to these challenges, this study proposes an vehicle trajectory extraction and integration framework using surveillance videos collected from four entrance of urban intersection. The framework first employs the improved YOLOv5s model to detect the positions of vehicles. Then, we proposed an object tracking model MS-SORT to extract the trajectories in each surveillance video. Subsequently, the trajectories of each surveillance video are mapped into the same coordinate system. Then the integration of trajectories is achieved using space–time information and re-identification (ReID) methods. The framework extracts and integrates trajectories from four intersection surveillance videos, obtaining trajectories with significantly broader temporal and spatial coverage compared to those obtained from any single direction of surveillance video. Our detection model improved mAP by 1.3 percentage points compared to the basic YOLOv5s, and our object tracking model improved MOTA and IDF1 by 2.6 and 2.1 percentage points compared to DeepSORT. The trojectory integration method achieved 94.7 % of F1-Score and RMSE of 0.51 m. The average length and number of the extracted trajectories has increased by at least 47.6 % and 24.2 % respectively compared to trajectories extracted from a single video.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102834"},"PeriodicalIF":3.7,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142229967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
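Mapping per-camera pixel tracks into one shared ground-plane frame, the step that precedes the space-time and ReID association, is typically done with a planar homography per camera. A minimal OpenCV sketch with made-up calibration points; the paper's calibration procedure is not described in the abstract:

```python
import numpy as np
import cv2

# Four reference points in one camera view (pixels) and their surveyed
# ground-plane coordinates (meters). Illustrative values only.
image_pts = np.float32([[412, 640], [1230, 655], [1180, 240], [470, 250]])
world_pts = np.float32([[0, 0], [12.5, 0], [12.5, 20.0], [0, 20.0]])
H = cv2.getPerspectiveTransform(image_pts, world_pts)

def to_world(track_pixels):
    """Map an (n, 2) pixel trajectory into the shared intersection frame."""
    pts = np.float32(track_pixels).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(pts, H).reshape(-1, 2)

# One homography per camera puts all four videos' trajectories into the
# same coordinate system before integration.
trajectory_world = to_world([[500, 600], [520, 580], [545, 560]])
```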
A novel noiselayer-decoder driven blind watermarking network
IF 3.7 · CAS Q2 · Engineering & Technology
Displays Pub Date: 2024-09-04 DOI: 10.1016/j.displa.2024.102823
Xiaorui Zhang, Rui Jiang, Wei Sun, Sunil Kr. Jha
{"title":"A novel noiselayer-decoder driven blind watermarking network","authors":"Xiaorui Zhang ,&nbsp;Rui Jiang ,&nbsp;Wei Sun ,&nbsp;Sunil Kr. Jha","doi":"10.1016/j.displa.2024.102823","DOIUrl":"10.1016/j.displa.2024.102823","url":null,"abstract":"<div><p>Most blind watermarking methods adopt the Encode-Noiselayer-Decoder network architecture, called END. However, there are issues that impact the imperceptibility and robustness of the watermarking, such as the encoder blindly embedding redundant features, adversarial training failing to simulate unknown noise effectively, and the limited capability of single-scale feature extraction. To address these challenges, we propose a new Noiselayer-Decoder-driven blind watermarking network, called ND-END, which leverages prior knowledge of the noise layer and features extracted by the decoder to guide the encoder for generating images with fewer redundant modifications, enhancing the imperceptibility. To effectively simulate the unknown noise caused during adversarial training, we introduce an unknown noise layer based on the guided denoising diffusion probabilistic model, which gradually modifies the mean value of the predicted noise during the image generation process. It produces unknown noise images that closely resemble the encoded images but can mislead the decoder. Moreover, we propose a multi-scale spatial-channel feature extraction method for extracting multi-scale message features from the noised image, which aids in message extraction. Experimental results demonstrate the effectiveness of our model, ND-END achieves a lower bit error rate while improving the peak signal-to-noise ratio by approximately 6 dB (from about 33.5 dB to 39.5 dB).</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102823"},"PeriodicalIF":3.7,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142169129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
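For readers unfamiliar with the END template the paper builds on, here is a toy PyTorch skeleton of one training step: the encoder embeds the message as a residual, a noise layer perturbs the result, and the decoder is trained to recover the bits while the encoder is penalized for visible changes. The layer sizes, residual embedding, and Gaussian stand-in noise are all illustrative assumptions, not the paper's design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Embeds an L-bit message into an image as a learned residual."""
    def __init__(self, msg_len=30):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + msg_len, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1))
    def forward(self, img, msg):
        b, _, h, w = img.shape
        msg_map = msg[:, :, None, None].expand(b, msg.shape[1], h, w)
        return img + self.net(torch.cat([img, msg_map], dim=1))

class Decoder(nn.Module):
    """Recovers the message bits from a (possibly noised) image."""
    def __init__(self, msg_len=30):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, msg_len))
    def forward(self, img):
        return self.net(img)

enc, dec = Encoder(), Decoder()
img = torch.rand(4, 3, 128, 128)
msg = torch.randint(0, 2, (4, 30)).float()
encoded = enc(img, msg)
noised = encoded + 0.02 * torch.randn_like(encoded)   # stand-in noise layer
loss = (F.binary_cross_entropy_with_logits(dec(noised), msg)  # message loss
        + F.mse_loss(encoded, img))                   # imperceptibility loss
loss.backward()
```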
Automatic identification of breech face impressions based on deep local features
IF 3.7 · CAS Q2 · Engineering & Technology
Displays Pub Date: 2024-09-02 DOI: 10.1016/j.displa.2024.102822
Baohong Li, Hao Zhang, Ashraf Uz Zaman Robin, Qianqian Yu
{"title":"Automatic identification of breech face impressions based on deep local features","authors":"Baohong Li,&nbsp;Hao Zhang,&nbsp;Ashraf Uz Zaman Robin,&nbsp;Qianqian Yu","doi":"10.1016/j.displa.2024.102822","DOIUrl":"10.1016/j.displa.2024.102822","url":null,"abstract":"<div><p>Breech face impressions are an essential type of physical evidence in forensic investigations. However, their surface morphology is complex and varies based on the machining method used on the gun’s breech face, making traditional handcrafted local feature-based methods exhibit high false rates and are unsuitable for striated impressions. We proposed a deep local feature-based method for firearm identification utilizing Detector-Free Local Feature Matching with Transformers (LoFTR). This method removes the module of feature point detection and directly utilizes self and cross-attention layers in the Transformer to transform the convolved coarse-level feature maps into a series of dense feature descriptors. Subsequently, matches with high confidence scores are filtered based on the score matrix calculated from the dense descriptors. Finally, the screened initial matches are refined into the convolved fine-level features, and a correlation-based approach is used to obtain the exact location of the match. Validation tests were conducted using three authoritative sets of the breech face impressions datasets provided by the National Institute of Standards and Technology (NIST). The validation results show that, compared with the traditional handcrafted local-feature based methods, the proposed method in this paper yields a lower identification error rate. Notably, the method can not only deal with granular impressions, but can also be applied to the striated impressions. The results indicate that the method proposed in this paper can be utilized for comparative analysis of breech face impressions, and provide a new automatic identification method for forensic investigations.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102822"},"PeriodicalIF":3.7,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142148482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
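An off-the-shelf LoFTR matcher ships with kornia, which is enough to reproduce the match-then-filter step in outline. A minimal sketch assuming kornia's pretrained outdoor weights as a stand-in; the paper presumably uses weights suited to toolmark imagery, which are not publicly available:

```python
import torch
from kornia.feature import LoFTR

matcher = LoFTR(pretrained="outdoor").eval()

def match_impressions(img0, img1, min_conf=0.8):
    """img0/img1: (1, 1, H, W) grayscale tensors in [0, 1]. Returns
    confidence-filtered correspondences between the two impressions."""
    with torch.no_grad():
        out = matcher({"image0": img0, "image1": img1})
    keep = out["confidence"] > min_conf
    return out["keypoints0"][keep], out["keypoints1"][keep]

img0 = torch.rand(1, 1, 480, 640)  # stand-in scans of two cartridge cases
img1 = torch.rand(1, 1, 480, 640)
k0, k1 = match_impressions(img0, img1)
similarity = len(k0)  # more high-confidence matches suggests the same firearm
```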
Spatial awareness enhancement based single-stage anchor-free 3D object detection for autonomous driving
IF 3.7 · CAS Q2 · Engineering & Technology
Displays Pub Date: 2024-09-02 DOI: 10.1016/j.displa.2024.102821
Xinyu Sun, Lisheng Jin, Huanhuan Wang, Zhen Huo, Yang He, Guangqi Wang
{"title":"Spatial awareness enhancement based single-stage anchor-free 3D object detection for autonomous driving","authors":"Xinyu Sun ,&nbsp;Lisheng Jin ,&nbsp;Huanhuan Wang ,&nbsp;Zhen Huo ,&nbsp;Yang He ,&nbsp;Guangqi Wang","doi":"10.1016/j.displa.2024.102821","DOIUrl":"10.1016/j.displa.2024.102821","url":null,"abstract":"<div><p>The real-time and accurate detection of three-dimensional (3D) objects based on LiDAR is a focal problem in the field of autonomous driving environment perception. Compared to two-stage and anchor-based 3D object detection methods that suffer from inference latency challenges, single-stage anchor-free 3D object detection approaches are more suitable for deployment in autonomous driving vehicles with the strict real-time requirement. However, they face the issue of insufficient spatial awareness, which can result in detection errors such as false positives and false negatives, thereby increasing the potential risks of autonomous driving. In response to this, we focus on enhancing the spatial awareness of CenterPoint, a widely used single-stage anchor-free 3D object detector in the industry. Considering the limited allocation of computational resources and the performance bottleneck caused by pillar encoder, we propose an efficient SSDCM backbone to strengthen feature representation and extraction. Furthermore, a simple BGC neck is devised to weight and exchange contextual information in order to deeply fuse multi-scale features. Combining improved backbone and neck networks, we construct a single-stage anchor-free 3D object detection model with spatial awareness enhancement, named CenterPoint-Spatial Awareness Enhancement (CenterPoint-SAE). We evaluate CenterPoint-SAE on two large-scale and challenging autonomous driving datasets, nuScenes and Waymo. It achieves 53.3% mAP and 62.5% NDS on nuScenes detection benchmark, and runs inference at a speed of 11.1 FPS. Compared to the baseline, the upgraded networks deliver a performance improvement of 1.6% mAP and 1.2% NDS at minor cost. Notably, on Waymo dataset, our method achieves competitive detection performance compared to two-stage and point-based methods.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102821"},"PeriodicalIF":3.7,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142136773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
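The abstract describes the BGC neck only as weighting and exchanging contextual information across scales. As a generic illustration of that idea (explicitly not the paper's design), here is a small gated fusion of two BEV feature scales in PyTorch:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedFusion(nn.Module):
    """Per-pixel gate that mixes a fine feature map with an upsampled
    coarse one; the gate is predicted from both inputs."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(2 * channels, channels, 1),
                                  nn.Sigmoid())

    def forward(self, fine, coarse):
        coarse_up = F.interpolate(coarse, size=fine.shape[-2:], mode="nearest")
        g = self.gate(torch.cat([fine, coarse_up], dim=1))  # mixing weights
        return g * fine + (1 - g) * coarse_up

fine = torch.randn(1, 128, 100, 100)   # high-resolution BEV features
coarse = torch.randn(1, 128, 50, 50)   # low-resolution, semantically stronger
fused = GatedFusion(128)(fine, coarse)
```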
Enhancing Chinese–Braille translation: A two-part approach with token prediction and segmentation labeling
IF 3.7 · CAS Q2 · Engineering & Technology
Displays Pub Date: 2024-09-01 DOI: 10.1016/j.displa.2024.102819
Hailong Yu, Wei Su, Lei Liu, Jing Zhang, Chuan Cai, Cunlu Xu, Huajiu Quan, Yingchun Xie
{"title":"Enhancing Chinese–Braille translation: A two-part approach with token prediction and segmentation labeling","authors":"Hailong Yu ,&nbsp;Wei Su ,&nbsp;Lei Liu ,&nbsp;Jing Zhang ,&nbsp;Chuan Cai ,&nbsp;Cunlu Xu ,&nbsp;Huajiu Quan ,&nbsp;Yingchun Xie","doi":"10.1016/j.displa.2024.102819","DOIUrl":"10.1016/j.displa.2024.102819","url":null,"abstract":"<div><p>Visually assistive systems for the visually impaired play a pivotal role in enhancing the quality of life for the visually impaired. Assistive technologies for the visually impaired have undergone a remarkable transformation with the advent of deep learning and sophisticated assistive devices. In particular, the paper utilizes the latest machine translation models and techniques to accomplish the Chinese–Braille translation task, providing convenience for visually impaired individuals. The Traditional end-to-end Chinese–Braille translation approach incorporates Braille dots and Braille word segmentation symbols as tokens within the model’s vocabulary. However, our findings reveal that Braille word segmentation is significantly more complex than Braille dot prediction. The paper proposes a novel Two-Part Loss (TPL) method that treats these tasks distinctly, leading to significant accuracy improvements. To enhance translation performance further, we introduce a BERT-Enhanced Segmentation Transformer (BEST) method. BEST leverages knowledge distillation techniques to transfer knowledge from a pre-trained BERT model to the translate model, mitigating its limitations in word segmentation. Additionally, soft label distillation is employed to improve overall efficacy further. The TPL approach achieves an average BLEU score improvement of 1.16 and 5.42 for Transformer and GPT models on four datasets, respectively. In addition, The work presents a two-stage deep learning-based translation approach that outperforms traditional multi-step and end-to-end methods. The proposed two-stage translation method achieves an average BLEU score improvement of 0.85 across four datasets.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"84 ","pages":"Article 102819"},"PeriodicalIF":3.7,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142121957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
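The core of TPL, scoring dot prediction and segmentation labeling as separate objectives, can be written down compactly. A hedged PyTorch sketch; the weighting `alpha` and the exact form of each term are assumptions, since the abstract gives only the idea of treating the two tasks distinctly:

```python
import torch
import torch.nn.functional as F

def two_part_loss(dot_logits, dot_targets, seg_logits, seg_targets, alpha=1.0):
    """dot_logits: (batch, seq, vocab) Braille-dot token scores.
    dot_targets: (batch, seq) integer token ids.
    seg_logits:  (batch, seq) per-position word-boundary scores.
    seg_targets: (batch, seq) 0/1 boundary labels (floats).
    alpha weights the (empirically harder) segmentation part."""
    dot_loss = F.cross_entropy(dot_logits.flatten(0, 1), dot_targets.flatten())
    seg_loss = F.binary_cross_entropy_with_logits(seg_logits, seg_targets)
    return dot_loss + alpha * seg_loss

dot_logits = torch.randn(2, 16, 64, requires_grad=True)
seg_logits = torch.randn(2, 16, requires_grad=True)
loss = two_part_loss(dot_logits, torch.randint(0, 64, (2, 16)),
                     seg_logits, torch.randint(0, 2, (2, 16)).float())
loss.backward()
```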