Evaluation of Complexity Measures for Deep Learning Generalization in Medical Image Analysis
Authors: Aleksandar Vakanski, Min Xian
Venue: 2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP)
Publication date: 2021-03-04
DOI: https://doi.org/10.1109/MLSP52302.2021.9596501
The generalization error of deep learning models for medical image analysis often increases on images collected with different acquisition devices, device settings, or patient populations. A better understanding of how well a model generalizes to new images is crucial for clinicians' trust in these models. Although significant effort has recently been directed toward establishing generalization bounds and complexity measures, there is still a significant discrepancy between predicted and actual generalization performance. Moreover, related large-scale empirical studies have relied primarily on validation with general-purpose image datasets. This paper presents an empirical study of the correlation between 25 complexity measures and the generalization ability of deep learning classifiers for breast ultrasound images. The results indicate that PAC-Bayes flatness and path norm measures produce the most consistent explanation for the combination of models and data. We also report that a multi-task approach combining classification and segmentation of breast images is conducive to improved generalization.
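As a rough illustration of what "correlation between a complexity measure and generalization" can mean in practice, the sketch below computes a Kendall rank correlation between one complexity measure and the generalization gap across several trained models. The abstract does not say which correlation statistic the authors used, so the choice of Kendall's tau is an assumption, and the function name `kendall_tau` and all numeric values are hypothetical.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a rank correlation between two equal-length sequences.

    Counts model pairs ranked the same way by both quantities (concordant)
    minus pairs ranked in opposite ways (discordant), divided by the total
    number of pairs. Tied pairs count as neither.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical values for five trained models (NOT taken from the paper):
# one complexity measure per model, and the matching generalization gap
# (test error minus training error).
complexity = [0.8, 1.1, 1.5, 2.0, 2.4]
gen_gap = [0.02, 0.03, 0.05, 0.04, 0.09]

tau = kendall_tau(complexity, gen_gap)
print(f"Kendall tau = {tau:.2f}")
```

A value near 1 would mean the measure ranks the models almost the same way as their generalization gaps do, a value near 0 would mean it carries little ranking information, and a negative value would mean it ranks them in reverse.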