一种计算超大图像cnn的新方法

Proceedings of the 2017 ACM on Conference on Information and Knowledge Management Pub Date : 2017-11-06 DOI:10.1145/3132847.3132872

Sai Wu, Mengdan Zhang, Gang Chen, Ke Chen

{"title":"一种计算超大图像cnn的新方法","authors":"Sai Wu, Mengdan Zhang, Gang Chen, Ke Chen","doi":"10.1145/3132847.3132872","DOIUrl":null,"url":null,"abstract":"CNN (Convolution Neural Network) is widely used in visual analysis and achieves exceptionally high performances in image classification, face detection, object recognition, image recoloring, and other learning jobs. Using deep learning frameworks, such as Torch and Tensorflow, CNN can be efficiently computed by leveraging the power of GPU. However, one drawback of GPU is its limited memory which prohibits us from handling large images. Passing a 4K resolution image to the VGG network will result in an exception of out-of-memory for Titan-X GPU. In this paper, we propose a new approach that adopts the BSP (bulk synchronization parallel) model to compute CNNs for images of any size. Before fed to a specific CNN layer, the image is split into smaller pieces which go through the neural network separately. Then, a specific padding and normalization technique is adopted to merge sub-images back into one image. Our approach can be easily extended to support distributed multi-GPUs. In this paper, we use neural style network as our example to illustrate the effectiveness of our approach. We show that using one Titan-X GPU, we can transfer the style of an image with 10,000×10,000 pixels within 1 minute.","PeriodicalId":20449,"journal":{"name":"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management","volume":"7 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2017-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"17","resultStr":"{\"title\":\"A New Approach to Compute CNNs for Extremely Large Images\",\"authors\":\"Sai Wu, Mengdan Zhang, Gang Chen, Ke Chen\",\"doi\":\"10.1145/3132847.3132872\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"CNN (Convolution Neural Network) is widely used in visual analysis and achieves exceptionally high performances in image classification, face detection, object recognition, image recoloring, and other learning jobs. Using deep learning frameworks, such as Torch and Tensorflow, CNN can be efficiently computed by leveraging the power of GPU. However, one drawback of GPU is its limited memory which prohibits us from handling large images. Passing a 4K resolution image to the VGG network will result in an exception of out-of-memory for Titan-X GPU. In this paper, we propose a new approach that adopts the BSP (bulk synchronization parallel) model to compute CNNs for images of any size. Before fed to a specific CNN layer, the image is split into smaller pieces which go through the neural network separately. Then, a specific padding and normalization technique is adopted to merge sub-images back into one image. Our approach can be easily extended to support distributed multi-GPUs. In this paper, we use neural style network as our example to illustrate the effectiveness of our approach. We show that using one Titan-X GPU, we can transfer the style of an image with 10,000×10,000 pixels within 1 minute.\",\"PeriodicalId\":20449,\"journal\":{\"name\":\"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management\",\"volume\":\"7 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-11-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"17\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3132847.3132872\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3132847.3132872","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 17

摘要

卷积神经网络(convolutional Neural Network, CNN)广泛应用于视觉分析领域，在图像分类、人脸检测、物体识别、图像重着色等学习工作中都取得了非常高的性能。使用Torch和Tensorflow等深度学习框架，可以利用GPU的强大功能高效地计算CNN。然而，GPU的一个缺点是它有限的内存，这使我们无法处理大图像。通过4K分辨率的图像到VGG网络将导致Titan-X GPU内存不足的异常。在本文中，我们提出了一种采用BSP(批量同步并行)模型计算任意大小图像cnn的新方法。在输入到特定的CNN层之前，图像被分成更小的部分，分别通过神经网络。然后，采用特定的填充和归一化技术将子图像合并回一幅图像。我们的方法可以很容易地扩展到支持分布式多gpu。在本文中，我们以神经风格网络为例来说明我们的方法的有效性。我们表明，使用一个Titan-X GPU，我们可以在1分钟内传输具有10,000×10,000像素的图像的样式。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

A New Approach to Compute CNNs for Extremely Large Images

CNN (Convolution Neural Network) is widely used in visual analysis and achieves exceptionally high performances in image classification, face detection, object recognition, image recoloring, and other learning jobs. Using deep learning frameworks, such as Torch and Tensorflow, CNN can be efficiently computed by leveraging the power of GPU. However, one drawback of GPU is its limited memory which prohibits us from handling large images. Passing a 4K resolution image to the VGG network will result in an exception of out-of-memory for Titan-X GPU. In this paper, we propose a new approach that adopts the BSP (bulk synchronization parallel) model to compute CNNs for images of any size. Before fed to a specific CNN layer, the image is split into smaller pieces which go through the neural network separately. Then, a specific padding and normalization technique is adopted to merge sub-images back into one image. Our approach can be easily extended to support distributed multi-GPUs. In this paper, we use neural style network as our example to illustrate the effectiveness of our approach. We show that using one Titan-X GPU, we can transfer the style of an image with 10,000×10,000 pixels within 1 minute.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 2017 ACM on Conference on Information and Knowledge Management

自引率

0.00%

发文量