{"title":"通过生成数据打破可靠预测的限制","authors":"Zhen Cheng, Fei Zhu, Xu-Yao Zhang, Cheng-Lin Liu","doi":"10.1007/s11263-024-02221-5","DOIUrl":null,"url":null,"abstract":"<p>In open-world recognition of safety-critical applications, providing reliable prediction for deep neural networks has become a critical requirement. Many methods have been proposed for reliable prediction related tasks such as confidence calibration, misclassification detection, and out-of-distribution detection. Recently, pre-training has been shown to be one of the most effective methods for improving reliable prediction, particularly for modern networks like ViT, which require a large amount of training data. However, collecting data manually is time-consuming. In this paper, taking advantage of the breakthrough of generative models, we investigate whether and how expanding the training set using generated data can improve reliable prediction. Our experiments reveal that training with a large quantity of generated data can eliminate overfitting in reliable prediction, leading to significantly improved performance. Surprisingly, classical networks like ResNet-18, when trained on a notably extensive volume of generated data, can sometimes exhibit performance competitive to pre-training ViT with a substantial real dataset.</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"18 1","pages":""},"PeriodicalIF":11.6000,"publicationDate":"2024-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Breaking the Limits of Reliable Prediction via Generated Data\",\"authors\":\"Zhen Cheng, Fei Zhu, Xu-Yao Zhang, Cheng-Lin Liu\",\"doi\":\"10.1007/s11263-024-02221-5\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>In open-world recognition of safety-critical applications, providing reliable prediction for deep neural networks has become a critical requirement. Many methods have been proposed for reliable prediction related tasks such as confidence calibration, misclassification detection, and out-of-distribution detection. Recently, pre-training has been shown to be one of the most effective methods for improving reliable prediction, particularly for modern networks like ViT, which require a large amount of training data. However, collecting data manually is time-consuming. In this paper, taking advantage of the breakthrough of generative models, we investigate whether and how expanding the training set using generated data can improve reliable prediction. Our experiments reveal that training with a large quantity of generated data can eliminate overfitting in reliable prediction, leading to significantly improved performance. Surprisingly, classical networks like ResNet-18, when trained on a notably extensive volume of generated data, can sometimes exhibit performance competitive to pre-training ViT with a substantial real dataset.</p>\",\"PeriodicalId\":13752,\"journal\":{\"name\":\"International Journal of Computer Vision\",\"volume\":\"18 1\",\"pages\":\"\"},\"PeriodicalIF\":11.6000,\"publicationDate\":\"2024-09-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Computer Vision\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s11263-024-02221-5\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Computer Vision","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11263-024-02221-5","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
摘要
在安全关键应用的开放世界识别中,为深度神经网络提供可靠的预测已成为一项关键要求。针对可信度校准、误分类检测和分布外检测等与可靠预测相关的任务,已经提出了许多方法。最近,预训练被证明是提高可靠预测的最有效方法之一,特别是对于像 ViT 这样需要大量训练数据的现代网络。然而,手动收集数据非常耗时。在本文中,我们利用生成模型的突破,研究使用生成数据扩展训练集是否以及如何提高预测的可靠性。我们的实验表明,使用大量生成数据进行训练可以消除可靠预测中的过拟合,从而显著提高性能。令人惊讶的是,像 ResNet-18 这样的经典网络,在使用大量生成数据进行训练时,有时会表现出与使用大量真实数据集对 ViT 进行预训练时相当的性能。
Breaking the Limits of Reliable Prediction via Generated Data
In open-world recognition of safety-critical applications, providing reliable prediction for deep neural networks has become a critical requirement. Many methods have been proposed for reliable prediction related tasks such as confidence calibration, misclassification detection, and out-of-distribution detection. Recently, pre-training has been shown to be one of the most effective methods for improving reliable prediction, particularly for modern networks like ViT, which require a large amount of training data. However, collecting data manually is time-consuming. In this paper, taking advantage of the breakthrough of generative models, we investigate whether and how expanding the training set using generated data can improve reliable prediction. Our experiments reveal that training with a large quantity of generated data can eliminate overfitting in reliable prediction, leading to significantly improved performance. Surprisingly, classical networks like ResNet-18, when trained on a notably extensive volume of generated data, can sometimes exhibit performance competitive to pre-training ViT with a substantial real dataset.
期刊介绍:
The International Journal of Computer Vision (IJCV) serves as a platform for sharing new research findings in the rapidly growing field of computer vision. It publishes 12 issues annually and presents high-quality, original contributions to the science and engineering of computer vision. The journal encompasses various types of articles to cater to different research outputs.
Regular articles, which span up to 25 journal pages, focus on significant technical advancements that are of broad interest to the field. These articles showcase substantial progress in computer vision.
Short articles, limited to 10 pages, offer a swift publication path for novel research outcomes. They provide a quicker means for sharing new findings with the computer vision community.
Survey articles, comprising up to 30 pages, offer critical evaluations of the current state of the art in computer vision or offer tutorial presentations of relevant topics. These articles provide comprehensive and insightful overviews of specific subject areas.
In addition to technical articles, the journal also includes book reviews, position papers, and editorials by prominent scientific figures. These contributions serve to complement the technical content and provide valuable perspectives.
The journal encourages authors to include supplementary material online, such as images, video sequences, data sets, and software. This additional material enhances the understanding and reproducibility of the published research.
Overall, the International Journal of Computer Vision is a comprehensive publication that caters to researchers in this rapidly growing field. It covers a range of article types, offers additional online resources, and facilitates the dissemination of impactful research.