ToonNet: a cartoon image dataset and a DNN-based semantic classification system

Proceedings of the 16th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and its Applications in Industry. Pub Date: 2018-12-02. DOI: 10.1145/3284398.3284403

Yanqing Zhou, Yongxu Jin, Anqi Luo, Szeyu Chan, Xiangyun Xiao, Xubo Yang


Citations: 3

Abstract

Cartoon-style pictures can be seen almost everywhere in daily life. Numerous applications process cartoon pictures, so a dataset of cartoon pictures would be valuable for them. In this paper, we first present ToonNet, a cartoon-style image recognition dataset. We construct our benchmark set from 4000 images in 12 different classes, collected from the Internet with little manual filtration. We extend this base dataset to 10000 images using several methods: snapshots of rendered 3D models with a cartoon shader, a 2D-3D-2D conversion procedure using a cartoon-modeling method, and a hand-drawing stylization filter. Then, we describe how to build an effective neural network for image semantic classification based on ToonNet. We present three techniques for building the Deep Neural Network (DNN): IUS (Inputs Unified Stylization), which stylizes the inputs to reduce the complexity of hand-drawn cartoon images; FIN (Feature Inserted Network), which inserts intuitive and valuable global features into the network; and NPN (Network Plus Network), which combines multiple single networks into a new mixed network. Our experiments show the efficacy and generality of these network strategies. With these techniques, classification accuracy reaches 78% (top-1) and 93% (top-3), an improvement of about 5% (top-1) over classical DNNs.
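The top-1 and top-3 figures quoted in the abstract are top-k accuracies: a sample counts as correct when its true class is among the k highest-scoring classes. The sketch below shows how such a metric can be computed. It also includes one possible way to combine the outputs of several networks, by averaging their class scores. The abstract does not say how NPN fuses its networks, so `average_scores` is an assumption made purely for illustration, and all function names and toy numbers here are hypothetical.

```python
from typing import List, Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Rank class indices by descending score; ties go to the lower index.
        ranked = sorted(range(len(row)), key=lambda c: (-row[c], c))[:k]
        hits += label in ranked
    return hits / len(labels)


def average_scores(per_network: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Average class scores over several networks.

    Assumed fusion rule for illustration only; not taken from the paper.
    Input layout: per_network[network][sample][class].
    """
    n = len(per_network)
    return [
        [sum(class_scores) / n for class_scores in zip(*sample_rows)]
        for sample_rows in zip(*per_network)
    ]


if __name__ == "__main__":
    # Toy scores for 3 samples over 3 classes from two hypothetical networks.
    net_a = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
    net_b = [[0.2, 0.6, 0.2], [0.3, 0.6, 0.1], [0.6, 0.1, 0.3]]
    labels = [1, 1, 0]

    fused = average_scores([net_a, net_b])
    for k in (1, 2, 3):
        print(f"top-{k} accuracy (fused): {top_k_accuracy(fused, labels, k):.2f}")
```

With 12 classes, as in ToonNet, top-3 accuracy is by construction at least as high as top-1, which is consistent with the 93% versus 78% reported above.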

