DS-ViT: Dual-Stream Vision Transformer for Cross-Task Distillation in Alzheimer's Early Diagnosis
Ke Chen, Yifeng Wang, Yufei Zhou, Haohan Wang
arXiv:2409.07584, arXiv - EE - Image and Video Processing, published 2024-09-11
In the field of Alzheimer's disease diagnosis, segmentation and
classification tasks are inherently interconnected. Sharing knowledge between
models for these tasks can significantly improve training efficiency,
particularly when training data is scarce. However, traditional knowledge
distillation techniques often struggle to bridge the gap between segmentation
and classification due to the distinct nature of tasks and different model
architectures. To address this challenge, we propose a dual-stream pipeline
that facilitates cross-task and cross-architecture knowledge sharing. Our
approach introduces a dual-stream embedding module that unifies feature
representations from segmentation and classification models, enabling
dimensional integration of these features to guide the classification model. We
validated our method on multiple 3D datasets for Alzheimer's disease diagnosis,
demonstrating significant improvements in classification performance,
especially on small datasets. Furthermore, we extended our pipeline with a
residual temporal attention mechanism for early diagnosis, utilizing images
taken before the atrophy of patients' brain mass. This advancement shows
promise in enabling diagnosis approximately six months earlier in mild and
asymptomatic stages, offering critical time for intervention.
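
The abstract does not specify how the dual-stream embedding module is built, so the following is only an illustrative sketch of the general idea of placing image features and segmentation features in one shared token space. Everything in it is an assumption made for illustration and not the authors' implementation: the non-overlapping cubic patches, the linear projections `w_img` and `w_seg`, the mean-squared-error alignment term, and all function names and sizes.

```python
import numpy as np


def patchify(volume: np.ndarray, patch: int) -> np.ndarray:
    """Split a 3D volume (D, H, W) into flattened, non-overlapping cubic patches."""
    d, h, w = volume.shape
    if d % patch or h % patch or w % patch:
        raise ValueError("volume dimensions must be divisible by the patch size")
    blocks = volume.reshape(d // patch, patch, h // patch, patch, w // patch, patch)
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5)  # group the three in-patch axes last
    return blocks.reshape(-1, patch ** 3)        # (num_patches, patch^3)


def dual_stream_embed(image, seg_labels, w_img, w_seg, patch):
    """Hypothetical dual-stream embedding: one linear projection per stream.

    Both streams use the same patch grid and the same output width, so the
    resulting token matrices can be compared or combined position by position.
    """
    img_tokens = patchify(image, patch) @ w_img       # classification stream
    seg_tokens = patchify(seg_labels, patch) @ w_seg  # segmentation stream
    return img_tokens, seg_tokens


def alignment_loss(student_tokens, teacher_tokens) -> float:
    """Assumed distillation term: mean squared error between token matrices."""
    return float(np.mean((student_tokens - teacher_tokens) ** 2))


# Toy data standing in for an MRI volume and a teacher's segmentation labels.
rng = np.random.default_rng(0)
patch, dim = 4, 16
image = rng.normal(size=(8, 8, 8))
seg_labels = rng.integers(0, 3, size=(8, 8, 8)).astype(np.float64)

# Random projections; in a real model these would be learned parameters.
w_img = rng.normal(scale=0.1, size=(patch ** 3, dim))
w_seg = rng.normal(scale=0.1, size=(patch ** 3, dim))

img_tokens, seg_tokens = dual_stream_embed(image, seg_labels, w_img, w_seg, patch)
loss = alignment_loss(img_tokens, seg_tokens)
print(img_tokens.shape, seg_tokens.shape)  # (8, 16) (8, 16)
```

In a trained system the alignment term would typically be added to the classification loss with a weighting factor, so that the segmentation stream guides the classifier without replacing its own objective. The weighting and the choice of layers to align are not stated in the abstract.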