Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval: Latest Publications

Extracting and Using Medical Expert Knowledge to Advance in Video Processing for Gynecologic Endoscopy
Pub Date: 2018-06-05 · DOI: 10.1145/3206025.3206082
Andreas Leibetseder, Klaus Schöffmann
Abstract: Modern-day endoscopic technology enables medical staff to conveniently document surgeries by recording raw treatment footage, which can be used for planning further treatment, revisiting cases, or educational purposes. However, manually perusing the recorded media files adds a tedious workload to physicians' already packed timetables and ultimately represents a burden rather than a benefit. The aim of this PhD project is to improve this situation by collaborating closely with medical experts to devise datasets and systems that facilitate semi-automatic post-surgical media processing.
Citations: 0
A Context-Aware Late-Fusion Approach for Disaster Image Retrieval from Social Media
Pub Date: 2018-06-05 · DOI: 10.1145/3206025.3206047
Minh-Son Dao, Pham Quang Nhat Minh, Asem Kasem, M. Nazmudeen
Abstract: Natural disasters, especially those related to flooding, are global issues that attract attention in many parts of the world. A series of research ideas for combining heterogeneous data sources to monitor natural disasters has been proposed, including multi-modal image retrieval. Among these data sources, social media streams are considered highly important due to their fast and localized updates on disaster situations. Unfortunately, social media itself contains several factors that limit the accuracy of this process, such as noisy data, unsynchronized content between images and collateral text, and untrusted information, to name a few. In this work, we introduce a context-aware late-fusion approach for disaster image retrieval from social media. Several known techniques based on context-aware criteria are integrated, namely late fusion, tuning, ensemble learning, object detection, and scene classification using deep learning. We developed a method for image-text content synchronization and spatial-temporal-context event confirmation, and evaluated the role of different types of features extracted from internal and external data sources. We evaluated our approach using the dataset and evaluation tool offered by the MediaEval 2017 Emergency Response for Flooding Events task, and compared it with other methods introduced by MediaEval 2017 participants. The experimental results show that our approach performs best when image-text content synchronization and spatial-temporal-context event confirmation are taken into account.
Citations: 17
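The late-fusion step named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the model names, scores, and weights are invented, and the fusion shown is a plain weighted average of per-model relevance scores.

```python
import numpy as np

def late_fusion(scores, weights):
    """Fuse per-model score vectors (models x images) with given weights."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    return weights @ scores            # weighted average score per image

# Hypothetical relevance scores from three models (e.g. an object detector,
# a scene classifier, and a text classifier) over four social-media images.
scores = [
    [0.9, 0.2, 0.7, 0.1],  # object-detection-based relevance
    [0.8, 0.3, 0.6, 0.2],  # scene-classification-based relevance
    [0.7, 0.1, 0.9, 0.3],  # text-based relevance
]
fused = late_fusion(scores, weights=[0.4, 0.3, 0.3])
ranking = np.argsort(-fused)  # retrieve the most relevant images first
print(ranking)                # -> [0 2 1 3]
```

In practice the per-model weights would be tuned on validation data, which is presumably where the "tuning" and "ensemble learning" mentioned in the abstract come in.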
Object Trajectory Proposal via Hierarchical Volume Grouping
Pub Date: 2018-06-05 · DOI: 10.1145/3206025.3206059
Xu Sun, Yuantian Wang, Tongwei Ren, Zhi Liu, Zhengjun Zha, Gangshan Wu
Abstract: Object trajectory proposal aims to locate category-independent object candidates in videos with a limited number of trajectories, i.e., bounding box sequences. Most existing methods, which derive from combining object proposal with tracking, cannot handle object trajectory proposal effectively because they lack a comprehensive objectness measurement obtained by analyzing spatio-temporal characteristics over a whole video. In this paper, we propose a novel object trajectory proposal method using hierarchical volume grouping. Specifically, we first represent a given video with hierarchical volumes by mapping hierarchical regions with optical flow. Then, we filter out short volumes and background volumes, and combinatorially group the retained volumes into object candidates. Finally, we rank the object candidates using a multi-modal fusion scoring mechanism that incorporates both appearance objectness and motion objectness, and generate the bounding boxes of the highest-scoring candidates as the trajectory proposals. We validated the proposed method on a dataset consisting of 200 videos from ILSVRC2016-VID. The experimental results show that our method is superior to state-of-the-art object trajectory proposal methods.
Citations: 4
Interpretable Partitioned Embedding for Customized Multi-item Fashion Outfit Composition
Pub Date: 2018-06-05 · DOI: 10.1145/3206025.3206048
Zunlei Feng, Zhenyun Yu, Yezhou Yang, Yongcheng Jing, Junxiao Jiang, Mingli Song
Abstract: Intelligent fashion outfit composition has become increasingly popular in recent years, and some deep learning based approaches have recently produced competitive compositions. However, their lack of interpretability prevents such approaches from satisfying designers', businesses', and consumers' urge to comprehend the importance of different attributes in an outfit composition. To realize interpretable and customized multi-item fashion outfit compositions, we propose a partitioned embedding network that learns interpretable embeddings from clothing items. The network consists of two vital components: an attribute partition module and a partition adversarial module. In the attribute partition module, multiple attribute labels are adopted to ensure that different parts of the overall embedding correspond to different attributes. In the partition adversarial module, adversarial operations are adopted to achieve independence of the different parts. With the interpretable and partitioned embedding, we then construct an outfit composition graph and an attribute matching map. Extensive experiments demonstrate that 1) the partitioned embedding has unmingled parts corresponding to different attributes, and 2) outfits recommended by our model are more desirable than those of existing methods.
Citations: 31
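The "partitioned embedding" idea above can be made concrete with a small sketch. The dimensions, attribute names, and slicing scheme below are invented for illustration; the point is only that when the overall embedding is a concatenation of fixed per-attribute slices, each attribute's part can be read off and compared independently.

```python
import numpy as np

# Hypothetical partition: a 12-dimensional embedding split into three
# 4-dimensional attribute parts (the real model learns its own partition).
PARTITION = {"color": slice(0, 4), "texture": slice(4, 8), "shape": slice(8, 12)}

def attribute_part(embedding, attribute):
    """Return the slice of the overall embedding tied to one attribute."""
    return embedding[PARTITION[attribute]]

def attribute_distance(e1, e2, attribute):
    """Compare two clothing items along a single interpretable attribute."""
    return float(np.linalg.norm(attribute_part(e1, attribute)
                                - attribute_part(e2, attribute)))

# Toy item embeddings standing in for the network's learned outputs.
rng = np.random.default_rng(0)
shirt, skirt = rng.normal(size=12), rng.normal(size=12)
print(attribute_distance(shirt, skirt, "color"))
```

The paper's adversarial module is what makes such slices meaningful in the first place, by pushing the parts toward mutual independence; this sketch only shows how a partitioned vector would be consumed afterwards.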
Perceptually-guided Understanding of Egocentric Video Content: Recognition of Objects to Grasp
Pub Date: 2018-06-05 · DOI: 10.1145/3206025.3206073
I. González-Díaz, J. Benois-Pineau, J. Domenger, A. Rugy
Abstract: Incorporating user perception into visual content search and understanding tasks has become one of the major trends in multimedia retrieval. We tackle the problem of object recognition guided by user perception, as indicated by gaze during visual exploration, in the application domain of assistance to upper-limb amputees. Although selecting the object to be grasped represents a task-driven visual search, human gaze recordings are noisy due to several physiological factors. Hence, since gaze does not always point to the object of interest, we use video-level weak annotations indicating the object to be grasped, and propose a video-level weak loss for classification with deep CNNs. Our results show that the method achieves notably better performance than other approaches on a complex, specifically recorded real-life dataset, with optimal performance for fixation times of around 400-800 ms, producing a minimal impact on subjects' behavior.
Citations: 5
NEC's Object Recognition Technologies and their Industrial Applications
Pub Date: 2018-06-05 · DOI: 10.1145/3206025.3210493
K. Iwamoto
Abstract: Recent advancements in image recognition technologies have enabled image recognition-based systems to be widely used in real-world applications. In this talk, I will introduce NEC's image-based object recognition technologies, targeted at recognizing various manufactured goods and retail products from a camera, and discuss the industrial applications we have developed and commercialized. These technologies enable highly efficient and cost-effective management of goods and products throughout their life cycle (manufacturing, distribution, retail, and consumption), which cannot otherwise be achieved by human labor or by the use of ID tags. First, I will talk about a technology that recognizes multiple objects in a single image using feature matching of compact local descriptors, combined with more recent deep learning based recognition. It enables a large number of objects to be recognized at once, which greatly reduces the human labor and time required for various product inspection and checking tasks. Using this technology, we have developed and commercialized a product inspection system for warehouses, a planogram recognition system for retail shop shelves, and a self-service POS system for easy-to-use, fast checkout in retail stores. Second, I will talk about the "Fingerprint of Things" technology. It enables individual identification of tiny manufactured parts (e.g., bolts and nuts) by identifying images of their unique surface patterns, just like human fingerprints. We have built a prototype traceability system for mass-produced parts, which lets users easily track down individual parts using a mobile device. In the talk, I will explain the key issues in realizing these industrial applications of image-based object recognition technologies.
Citations: 0
CBVMR: Content-Based Video-Music Retrieval Using Soft Intra-Modal Structure Constraint
Pub Date: 2018-06-05 · DOI: 10.1145/3206025.3206046
Sungeun Hong, Woobin Im, H. Yang
Abstract: To date, only limited research has been conducted on cross-modal retrieval of suitable music for a specified video, or vice versa. Moreover, much of the existing research relies on metadata such as keywords, tags, or descriptions that must be individually produced and attached afterwards. This paper introduces a new content-based, cross-modal retrieval method for video and music implemented through deep neural networks. We train the network via an inter-modal ranking loss such that videos and music with similar semantics end up close together in the embedding space. However, if only the inter-modal ranking constraint is used for embedding, modality-specific characteristics can be lost. To address this problem, we propose a novel soft intra-modal structure loss that leverages the relative distance relationships between intra-modal samples before embedding. We also introduce reasonable quantitative and qualitative experimental protocols to address the lack of standard protocols for these less mature video-music tasks. All datasets and source code can be found in our online repository (https://github.com/csehong/VM-NET).
Citations: 45
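The two loss terms described in the abstract above can be sketched with toy embeddings. This is an illustrative reading, not the paper's exact formulation: the margin, the distance metric, and the triple-wise ordering check in the structure loss are my own simplifications.

```python
import numpy as np

def inter_modal_ranking_loss(v, m, margin=0.2):
    """Triplet-style ranking loss: each video embedding v[i] should be closer
    to its matching music embedding m[i] than to any other clip m[j],
    by at least `margin`."""
    n = len(v)
    loss = 0.0
    for i in range(n):
        pos = np.linalg.norm(v[i] - m[i])
        for j in range(n):
            if j != i:
                neg = np.linalg.norm(v[i] - m[j])
                loss += max(0.0, margin + pos - neg)
    return loss / (n * (n - 1))

def soft_structure_loss(before, after):
    """Soft intra-modal structure term: penalize sample triples whose relative
    distance ordering flips between the pre-embedding features (`before`)
    and the learned embeddings (`after`)."""
    n = len(before)
    loss, triples = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                d1 = (np.linalg.norm(before[i] - before[j])
                      - np.linalg.norm(before[i] - before[k]))
                d2 = (np.linalg.norm(after[i] - after[j])
                      - np.linalg.norm(after[i] - after[k]))
                if d1 * d2 < 0:      # ordering flipped after embedding
                    loss += abs(d2)
                triples += 1
    return loss / triples

# Toy matched video/music embeddings: matched pairs are near each other,
# so the ranking loss is zero.
v = np.array([[1.0, 0.0], [0.0, 1.0]])
m = np.array([[1.0, 0.1], [0.1, 1.0]])
print(inter_modal_ranking_loss(v, m))  # -> 0.0
```

In the actual paper both terms are optimized jointly over a deep two-branch network; the sketch only shows what each term measures.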
Multimodal Filtering of Social Media for Temporal Monitoring and Event Analysis
Pub Date: 2018-06-05 · DOI: 10.1145/3206025.3206079
Po-Yao (Bernie) Huang, Junwei Liang, Jean-Baptiste Lamare, Alexander Hauptmann
Abstract: Developing an efficient and effective social media monitoring system has become one of the important steps towards improved public safety. With the explosive availability of user-generated content documenting most conflicts and human rights abuses around the world, analysts and first responders increasingly find themselves overwhelmed by massive amounts of noisy social media data. In this paper, we construct a large-scale public safety event dataset with retrospective automatic labeling of 4.2 million multimodal tweets from seven public safety events that occurred in 2013-2017. We propose a new multimodal social media filtering system composed of encoding, classification, and correlation networks that jointly learn shared and complementary visual and textual information to filter out the most relevant and useful items from the noisy social media influx. The proposed model is verified and achieves significant improvement over competitive baselines under both retrospective and real-time experimental protocols.
Citations: 18
Tourism Category Classification on Image Sharing Services Through Estimation of Existence of Reliable Results
Pub Date: 2018-06-05 · DOI: 10.1145/3206025.3206085
Naoki Saito, Takahiro Ogawa, Satoshi Asamizu, M. Haseyama
Abstract: This paper presents a new tourism category classification method based on estimating whether a reliable classification result exists. The proposed method obtains two kinds of classification results by applying a convolutional neural network to tourism images and a fuzzy k-nearest neighbor algorithm to the geotags attached to those images. It then estimates whether a reliable result exists among the two. If a reliable result is included, it is selected as the final classification result; otherwise, the final result is obtained by another approach based on a multiple-annotator logistic regression model. Consequently, the proposed method enables accurate classification based on this new estimation scheme.
Citations: 4
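The selection logic in the abstract above reduces to a simple decision rule, sketched below. The threshold, the confidence-based notion of "reliable", and the fallback function are my own stand-ins; in the paper, reliability estimation and the multiple-annotator logistic regression fallback are both learned models.

```python
def classify(cnn_result, geo_result, reliable_threshold=0.8, fallback=None):
    """Each result is a (category, confidence) pair; a confidence at or above
    the threshold marks the result as reliable (illustrative criterion)."""
    for category, confidence in (cnn_result, geo_result):
        if confidence >= reliable_threshold:
            return category  # a reliable result is selected as final
    # Neither result is reliable: defer to a fallback model, standing in
    # for the paper's multiple-annotator logistic regression approach.
    return fallback(cnn_result, geo_result)

# Placeholder fallback: pick whichever result has the higher confidence.
pick_best = lambda a, b: max(a, b, key=lambda r: r[1])[0]

print(classify(("temple", 0.92), ("park", 0.40), fallback=pick_best))  # temple
print(classify(("beach", 0.55), ("park", 0.60), fallback=pick_best))   # park
```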
Session details: Best Paper Session
B. Huet
Pub Date: 2018-06-05 · DOI: 10.1145/3252925
Citations: 0