{"title":"一种利用掩码自动编码器和分形维度预训练的新型变换器方法,用于糖尿病视网膜病变分类","authors":"YAOMING YANG, ZHAO ZHA, CHENNAN ZHOU, LIDA ZHANG, SHUXIA QIU, PENG XU","doi":"10.1142/s0218348x24500609","DOIUrl":null,"url":null,"abstract":"<p>Diabetic retinopathy (DR) is one of the leading causes of blindness in a significant portion of the working population, and its damage on vision is irreversible. Therefore, rapid diagnosis on DR is crucial for saving the patient’s eyesight. Since Transformer shows superior performance in the field of computer vision compared with Convolutional Neural Networks (CNNs), it has been proposed and applied in computer aided diagnosis of DR. However, a large number of images should be used for training due to the lack of inductive bias in Transformers. It has been demonstrated that the retinal vessels follow self-similar fractal scaling law, and the fractal dimension of DR patients shows an evident difference from that of normal people. Based on this, the fractal dimension is introduced as a prior into Transformers to mitigate the adverse influence of lack of inductive bias on model performance. A new Transformer method pretrained with Masked Autoencoders and fractal dimension (MAEFD) is developed and proposed in this paper. The experiments on the APTOS dataset show that the classification performance for DR by the proposed MAEFD can be substantially improved. Additionally, the present model pretrained with 100,000 retinal images outperforms that pretrained with 1 million natural images in terms of DR classification performance.</p>","PeriodicalId":501262,"journal":{"name":"Fractals","volume":"46 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A NOVEL TRANSFORMER METHOD PRETRAINED WITH MASKED AUTOENCODERS AND FRACTAL DIMENSION FOR DIABETIC RETINOPATHY CLASSIFICATION\",\"authors\":\"YAOMING YANG, ZHAO ZHA, CHENNAN ZHOU, LIDA ZHANG, SHUXIA QIU, PENG XU\",\"doi\":\"10.1142/s0218348x24500609\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Diabetic retinopathy (DR) is one of the leading causes of blindness in a significant portion of the working population, and its damage on vision is irreversible. Therefore, rapid diagnosis on DR is crucial for saving the patient’s eyesight. Since Transformer shows superior performance in the field of computer vision compared with Convolutional Neural Networks (CNNs), it has been proposed and applied in computer aided diagnosis of DR. However, a large number of images should be used for training due to the lack of inductive bias in Transformers. It has been demonstrated that the retinal vessels follow self-similar fractal scaling law, and the fractal dimension of DR patients shows an evident difference from that of normal people. Based on this, the fractal dimension is introduced as a prior into Transformers to mitigate the adverse influence of lack of inductive bias on model performance. A new Transformer method pretrained with Masked Autoencoders and fractal dimension (MAEFD) is developed and proposed in this paper. The experiments on the APTOS dataset show that the classification performance for DR by the proposed MAEFD can be substantially improved. 
Additionally, the present model pretrained with 100,000 retinal images outperforms that pretrained with 1 million natural images in terms of DR classification performance.</p>\",\"PeriodicalId\":501262,\"journal\":{\"name\":\"Fractals\",\"volume\":\"46 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-03-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Fractals\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1142/s0218348x24500609\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Fractals","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1142/s0218348x24500609","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Diabetic retinopathy (DR) is one of the leading causes of blindness in the working-age population, and its damage to vision is irreversible. Rapid diagnosis of DR is therefore crucial for preserving patients' eyesight. Since Transformers show superior performance in computer vision compared with Convolutional Neural Networks (CNNs), they have been proposed and applied for computer-aided diagnosis of DR. However, because Transformers lack inductive bias, a large number of images is required for training. It has been demonstrated that retinal vessels follow a self-similar fractal scaling law, and that the fractal dimension of DR patients differs markedly from that of healthy subjects. Based on this, the fractal dimension is introduced as a prior into the Transformer to mitigate the adverse effect of the missing inductive bias on model performance. This paper develops a new Transformer method pretrained with Masked Autoencoders and fractal dimension (MAEFD). Experiments on the APTOS dataset show that the proposed MAEFD substantially improves DR classification performance. Moreover, the model pretrained with 100,000 retinal images outperforms the one pretrained with 1 million natural images in terms of DR classification performance.
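The abstract does not state how the fractal dimension of the retinal vasculature is estimated; a standard choice for binary vessel maps is the box-counting dimension. The sketch below is a minimal, illustrative NumPy implementation of that estimation step, assuming a pre-segmented binary vessel mask is available; the function name, padding scheme, and scale range are illustrative choices, not taken from the paper.

```python
import numpy as np

def box_counting_dimension(mask: np.ndarray) -> float:
    """Estimate the box-counting (fractal) dimension of a binary vessel mask.

    mask: 2D array (0/1 or bool), e.g. a segmented retinal-vessel map.
    Returns the slope D from the fit log N(s) = -D * log s + c, where N(s)
    is the number of boxes of side s containing at least one vessel pixel.
    """
    mask = mask.astype(bool)

    # Pad to a square power-of-two grid so boxes tile the image exactly.
    size = 1 << int(np.ceil(np.log2(max(mask.shape))))
    padded = np.zeros((size, size), dtype=bool)
    padded[: mask.shape[0], : mask.shape[1]] = mask

    sizes, counts = [], []
    s = size
    while s >= 1:
        # Partition the grid into (size/s) x (size/s) blocks of side s and
        # count the blocks that contain at least one foreground pixel.
        blocks = padded.reshape(size // s, s, size // s, s)
        occupied = blocks.any(axis=(1, 3)).sum()
        if occupied > 0:
            sizes.append(s)
            counts.append(occupied)
        s //= 2

    # Linear fit of log N(s) against log s; the dimension is minus the slope.
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), 1)
    return -slope
```

In the MAEFD framework described above, the resulting dimension reportedly serves as a prior during Masked Autoencoder pretraining; the abstract does not specify the integration mechanism (for example, whether it acts as an auxiliary target alongside the reconstruction loss), so the sketch covers only the dimension-estimation step.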