{"title":"Joint Estimation of Age and Gender from Unconstrained Face Images Using Lightweight Multi-Task CNN for Mobile Applications","authors":"Jia-Hong Lee, Yi-Ming Chan, Ting-Yen Chen, Chu-Song Chen","doi":"10.1109/MIPR.2018.00036","DOIUrl":"https://doi.org/10.1109/MIPR.2018.00036","url":null,"abstract":"Automatic age and gender classification based on unconstrained images has become essential techniques on mobile devices. With limited computing power, how to develop a robust system becomes a challenging task. In this paper, we present an efficient convolutional neural network (CNN) called lightweight multi-task CNN for simultaneous age and gender classification. Lightweight multi-task CNN uses depthwise separable convolution to reduce the model size and save the inference time. On the public challenging Adience dataset, the accuracy of age and gender classification is better than baseline multi-task CNN methods.","PeriodicalId":320000,"journal":{"name":"2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134217090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Multimodal Approach to Predict Social Media Popularity","authors":"Mayank Meghawat, Satyendra Yadav, Debanjan Mahata, Yifang Yin, R. Shah, Roger Zimmermann","doi":"10.1109/MIPR.2018.00042","DOIUrl":"https://doi.org/10.1109/MIPR.2018.00042","url":null,"abstract":"Multiple modalities represent different aspects by which information is conveyed by a data source. Modern day social media platforms are one of the primary sources of multimodal data, where users use different modes of expression by posting textual as well as multimedia content such as images and videos for sharing information. Multimodal information embedded in such posts could be useful in predicting their popularity. To the best of our knowledge, no such multimodal dataset exists for the prediction of social media photos. In this work, we propose a multimodal dataset consisiting of content, context, and social information for popularity prediction. Speci?cally, we augment the SMPT1 dataset for social media prediction in ACM Multimedia grand challenge 2017 with image content, titles, descriptions, and tags. Next, in this paper, we propose a multimodal approach which exploits visual features (i.e., content information), textual features (i.e., contextual information), and social features (e.g., average views and group counts) to predict popularity of social media photos in terms of view counts. Experimental results con?rm that despite our multimodal approach uses the half of the training dataset from SMP-T1, it achieves comparable performance with that of state-of-the-art.","PeriodicalId":320000,"journal":{"name":"2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126775300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Approach of Multiple Objects Segmentation Based on Graph Cut","authors":"Jiyang Dong, Jian Xue, Shuqiang Jiang, K. Lu","doi":"10.1109/MIPR.2018.00074","DOIUrl":"https://doi.org/10.1109/MIPR.2018.00074","url":null,"abstract":"Segmentation is a very crucial step in many applications. Actually, there are often more than one object to be segmented in an image or a video. Taking the lung images as an example, pulmonary lesions area and lung parenchyma area are both important basis for a doctor to make diagnoses. Due to the fact that lung lesion areas and lung tissues have close gray values in the image, and the diversity, irregularity and location uncertainty of pulmonary lesions, traditional segmentation methods cannot segment objects of interest accurately, nor can extract them at the same time. In this paper, a novel approach is proposed for multiple objects segmentation based on Graph Cut. The algorithm introduces a multi-layers graph structure to represent different regions from inside to outside in an image. Besides, the foreground and background are modeled by Gaussian Mixture Models (GMMs) which can describe the gray distributions of them accurately. Then the weights of parts of links in the graph can be calculated by the probability distribution of the models. To solve the problem of boundaries leakage when two objects with similar gray value are in close proximity, a shape constraint is added to the energy function. The segmentation is achieved by max-flow/min-cut and all of the objects can be obtained. Experiment results demonstrate that the proposed method in this paper can deal with the CT images of lung with pathologies, and has accuracy and robustness.","PeriodicalId":320000,"journal":{"name":"2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124275094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Learning of Path-Based Tree Classifiers for Large-Scale Plant Species Identification","authors":"Haixi Zhang, G. He, Jinye Peng, Zhenzhong Kuang, Jianping Fan","doi":"10.1109/MIPR.2018.00013","DOIUrl":"https://doi.org/10.1109/MIPR.2018.00013","url":null,"abstract":"In this paper, a deep learning framework is devel- oped to enable path-based tree classifier training for supporting large-scale plant species recognition, where a deep neural network and a tree classifier are jointly trained in an end-to-end fashion. First, a two-layer plant taxonomy is constructed to organize large numbers of plant species and their genus hierarchically in a coarse- to-fine fashion. Second, a deep learning framework is developed to enable path-based tree classifier training, where a tree classifier over the plant taxonomy is used to replace the flat softmax layer in traditional deep CNNs. A path-based error function is defined to optimize the joint process for learning deep CNN and tree classifier, where back propagation is used to update both the classifier parameters and the network weights simultaneously. We have also constructed a large-scale plant database of Orchid family for algorithm evaluation. Our experimental results have demonstrated that our path-based deep learning algorithm can achieve very competitive results on both the accuracy rates and the computational efficiency for large-scale plant species recognition.","PeriodicalId":320000,"journal":{"name":"2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115015080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-Attentive Feature-Level Fusion for Multimodal Emotion Detection","authors":"Devamanyu Hazarika, Sruthi Gorantla, Soujanya Poria, Roger Zimmermann","doi":"10.1109/MIPR.2018.00043","DOIUrl":"https://doi.org/10.1109/MIPR.2018.00043","url":null,"abstract":"Multimodal emotion recognition is the task of detecting emotions present in user-generated multimedia content. Such resources contain complementary information in multiple modalities. A stiff challenge often faced is the complexity associated with feature-level fusion of these heterogeneous modes. In this paper, we propose a new feature-level fusion method based on self-attention mechanism. We also compare it with traditional fusion methods such as concatenation, outer-product, etc. Analyzed using textual and speech (audio) modalities, our results suggest that the proposed fusion method outperforms others in the context of utterance-level emotion recognition in videos.","PeriodicalId":320000,"journal":{"name":"2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129145675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MMH: Multi-Modal Hash for Instant Mobile Video Search","authors":"Wenhui Gao, Xinchen Liu, Huadong Ma, Yanan Li, Liang Liu","doi":"10.1109/MIPR.2018.00018","DOIUrl":"https://doi.org/10.1109/MIPR.2018.00018","url":null,"abstract":"Mobile devices have been an indispensable part of human life, which enable people to search and browse what they want on the move. Mobile video search, as one of the most important services for users, still faces great challenges under mobile internet scenario, such as the limitation of computation ability, memory, and bandwidth. Therefore, this paper proposes a multi-modal hash based framework for instant mobile video search. In particular, we adopt a efficient deep convolutional neural network, MobileNet, with the hash layer to learn discriminative and compact visual features from videos. Moreover, we also consider hand-crafted local visual descriptor and audio fingerprint to build a multi-modal hash representation of videos. With the multi-modal hash code, two types of hash indexes are built on the server to achieve efficient video search. At last, the multi-modal hash codes are extracted on the mobile devices and transferred in a three- step progressive procedure during the online search stage. The experiments on the real-world dataset show that the proposed framework not only achieves the state-of-the-art accuracy but also obtains excellent efficiency.","PeriodicalId":320000,"journal":{"name":"2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126492861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast Vision-Based Pedestrian Traffic Light Detection","authors":"Xue-Hua Wu, R. Hu, Yu‐Qing Bao","doi":"10.1109/MIPR.2018.00050","DOIUrl":"https://doi.org/10.1109/MIPR.2018.00050","url":null,"abstract":"Detection of pedestrian traffic light is very important for the visually impaired. However, fast but accurate vision-based detection is not an easy task due to the complexity of background and illumination. In this paper, a fast vision-based detection system is designed. In the designed system, the background filter is applied to identify the candidate regions of pedestrian traffic lights. And the cascade classifier obtained by the Adaboost algorithm based on the multi-layer features is used to detect the pedestrian traffic lights. Testing results verifies the effectiveness of the designed system.","PeriodicalId":320000,"journal":{"name":"2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130856743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Regularization Based ANN for the Design of Flexible Antenna for UWB Wireless Applications","authors":"A. Hammoodi, Fadwa Al-Azzo, M. Milanova, H. Khaleel","doi":"10.1109/MIPR.2018.00039","DOIUrl":"https://doi.org/10.1109/MIPR.2018.00039","url":null,"abstract":"This paper presents a flexible pentagonal shape Ultra-Wide Band (UWB) antenna design using Artificial Neural Network (ANN) for WLAN, 5G, and WiMAX applications. The pentagonal patch is placed on top of flexible polyimide substrate and simulated using the well-known 3-D electromagnetic (EM) simulator HFSS, v.18.1. Due to large computing cluster required by the EM simulator to solve the design under consideration in addition to the time consumed, ANN is used to synthesize the design and reduce the cost and time consumed to analyze the aforementioned structure. Neural Network with 1 hidden layer of 10 neurons based on Bayesian Regularization algorithm is presented. An error of less 5% is produced during the learning, validation, and testing processes. Neural network is a good candidate to represent the pentagonal shape antenna used for UWB applications.","PeriodicalId":320000,"journal":{"name":"2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"37 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131686634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Beyond Big Data of Human Behaviors: Modeling Human Behaviors and Deep Emotions","authors":"James J. Deng, C. Leung, Yuanxi Li","doi":"10.1109/MIPR.2018.00065","DOIUrl":"https://doi.org/10.1109/MIPR.2018.00065","url":null,"abstract":"Humans possess a variety of long term or short term behaviors such as gesture, posture, and movement and so on. These readable behaviors usually convey significant emotional information, which can facilitate human-machine interactions in intelligent cognitive systems. However, there is a lack of studies on modeling such complex relationship between human behavior and emotion in a time series context. This paper attempts to pioneer such an exploration. First, huge amounts of human behaviors are suggested to be captured by various sensors. Then behaviors and emotions are modeled by deep structure of bidirectional LSTM, which can represent interactions and correlations. To avoid training difficulties, bidirectional LSTM are only located in the bottom layer, and the other layers are uni-bidirectional, while the adjacent layers use residual connections. This deep bidirectional LSTM has the advantage that it can be scaled up to larger varieties of human behaviors captured by multiple sensors. The experimental results show that our proposed deep structure for modeling human behaviors and emotions is able to achieve a high degree of accuracy than shallow representation or models.","PeriodicalId":320000,"journal":{"name":"2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124049105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Subjective Evaluation of Vector Representation of Emotion Flow for Music Retrieval","authors":"Chia-Hao Chung, Ming-I Yang, Homer H. Chen","doi":"10.1109/MIPR.2018.00075","DOIUrl":"https://doi.org/10.1109/MIPR.2018.00075","url":null,"abstract":"Because it simply consists of an initial point and a terminal point in a two dimensional emotion plane, vector representation of music emotion provides an intuitive and instant visualization of the dynamics of music emotion. In this paper, we investigate the performance of this representation for music information retrieval by conducting a series of subjective tests. A music retrieval system is created, and the user experience data are evaluated by seven metrics: learnability, ease of use, affordance, usefulness, joyfulness, novelty, and overall satisfaction. Compared with the point representation, the vector representation performs relatively better in affordance, novelty, and joyfulness but slightly worse in learnability and ease of use. The overall satisfaction score is 5.19 for the point representation and 5.43 for the vector representation. The results suggest that each representation has its own strengths, and the choice between the two representations depends on which metrics carry more weight in an application at hand.","PeriodicalId":320000,"journal":{"name":"2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127151166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}