CTT-Net: A Multi-view Cross-token Transformer for Cataract Postoperative Visual Acuity Prediction

Jinhong Wang, Jingwen Wang, Tingting Chen, Wenhao Zheng, Zhe Xu, Xingdi Wu, Wendeng Xu, Haochao Ying, D. Chen, Jian Wu
DOI: 10.1109/BIBM55620.2022.9995392
Published in: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 6 December 2022

Abstract

Surgery is the only viable treatment for cataract patients with visual acuity (VA) impairment. Clinically, to assess the necessity of cataract surgery, accurately predicting postoperative VA before surgery by analyzing multi-view optical coherence tomography (OCT) images is crucial. Unfortunately, due to complicated fundus conditions, determining postoperative VA remains difficult for medical experts. Deep learning methods for this problem have been developed in recent years. Although effective, these methods still face several issues, such as not efficiently exploring the potential relations between multi-view OCT images, neglecting the key role of clinical prior knowledge (e.g., the preoperative VA value), and using only regression-based metrics, which lack a clinical reference. In this paper, we propose a novel Cross-token Transformer Network (CTT-Net) for postoperative VA prediction that analyzes both multi-view OCT images and the preoperative VA. To effectively fuse multi-view features of OCT images, we develop cross-token attention, which restricts redundant/unnecessary attention flow. Further, we utilize the preoperative VA value to provide more information for postoperative VA prediction and to facilitate fusion between views. Moreover, we design an auxiliary classification loss to improve model performance and assess VA recovery more thoroughly, avoiding the limitation of using only regression metrics. To evaluate CTT-Net, we build a multi-view OCT image dataset collected from our collaborating hospital. Extensive experiments validate the effectiveness of our model compared to existing methods on various metrics. Code is available at: https://github.com/wjh892521292/Cataract-OCT.
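The two ideas the abstract highlights — restricted cross-view attention routed through a small set of tokens (including a preoperative-VA token), and a regression objective augmented with an auxiliary classification loss — can be illustrated with a minimal numpy sketch. The exact token routing and loss weighting in CTT-Net are not specified here, so the choice of query tokens, the discretisation of VA levels, and the weight `lam` below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_token_fusion(view_a, view_b, va_token):
    """Fuse two OCT views by letting only a small query set (the
    preoperative-VA token plus view A's summary token) attend to the
    other view's tokens, restricting redundant cross-view attention.
    Shapes: view_* is (n_tokens, d); va_token is (d,)."""
    d = view_a.shape[1]
    q = np.stack([va_token, view_a[0]])         # (2, d) restricted queries
    attn = softmax(q @ view_b.T / np.sqrt(d))   # (2, n_b) attention weights
    fused = attn @ view_b                       # (2, d) attended features
    return fused.mean(axis=0)                   # (d,) pooled representation

def combined_loss(pred_va, true_va, class_logits, true_class, lam=0.5):
    """Regression loss on the predicted VA value plus an auxiliary
    cross-entropy loss over discretised VA-recovery levels."""
    mse = (pred_va - true_va) ** 2
    ce = -np.log(softmax(class_logits)[true_class] + 1e-12)
    return mse + lam * ce

# Toy usage with random stand-in features (8 tokens per view, dim 16).
view_a = rng.normal(size=(8, 16))
view_b = rng.normal(size=(8, 16))
va_token = rng.normal(size=16)
rep = cross_token_fusion(view_a, view_b, va_token)
loss = combined_loss(pred_va=0.3, true_va=0.2,
                     class_logits=rng.normal(size=4), true_class=1)
```

Because only two query tokens attend across views, the attention matrix is 2 × n rather than n × n, which is the sense in which cross-token attention limits unnecessary attention flow between views.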