2013 14th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS): Latest Publications

Social stances by virtual smiles
M. Ochs, C. Pelachaud, K. Prepin
DOI: 10.1109/WIAMIS.2013.6616144 (https://doi.org/10.1109/WIAMIS.2013.6616144)
Published: 2013-07-03
Abstract: When two people take part in a discussion, they not only exchange the concepts and ideas they are discussing, but also express stances with regard to the content of their speech (epistemic stances) and stances that convey their interpersonal relationship (interpersonal stances). These stances can be expressed through non-verbal behaviors, for instance smiles. Stances are also co-constructed by the interactants through simultaneous or sequential behaviors, such as the alignment of the speaker's and listener's smiles. In this paper, we present several studies exploring the stances (epistemic, interpersonal, and co-constructed) that the social signal of the smile may convey. We propose to analyze different contextual levels to highlight how users' engagement and discourse context influence their perception of virtual characters' stances.
Citations: 1
A nested infinite Gaussian mixture model for identifying known and unknown audio events
Y. Sasaki, Kazuyoshi Yoshii, S. Kagami
DOI: 10.1109/WIAMIS.2013.6616152 (https://doi.org/10.1109/WIAMIS.2013.6616152)
Published: 2013-07-03
Abstract: This paper presents a novel statistical method that can classify given audio events into known classes or recognize them as an unknown class. We propose a nested infinite Gaussian mixture model (iGMM) to represent varied audio events in real environments. One of the main problems of conventional classification methods is that a fixed number of classes must be specified in advance, so all audio events are forced into known classes. To solve this problem, the proposed method formulates an infinite Gaussian mixture model (iGMM) in which the number of classes is allowed to grow without bound. Another problem is that the complexity of each audio event differs. The nested iGMM, built on a nonparametric Bayesian approach, is therefore applied to adjust the required dimensionality of each audio model. Experimental results show that the proposed model effectively addresses these two problems when representing the given audio events.
Citations: 3
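As a rough illustration of the known/unknown idea above, the sketch below uses scikit-learn's BayesianGaussianMixture (a truncated Dirichlet-process approximation of an iGMM) together with a log-likelihood threshold to flag unknown events. It is not the authors' nested iGMM; the threshold and toy features are assumptions.

```python
# Minimal sketch: a DP-style mixture infers the effective number of classes,
# and a likelihood threshold routes far-away samples to an "unknown" class.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Toy "audio features" for two known event classes (e.g., MFCC-like vectors).
known = np.vstack([rng.normal(0.0, 1.0, (200, 4)),
                   rng.normal(5.0, 1.0, (200, 4))])

# weight_concentration_prior acts like the DP concentration: unused components
# are pruned, so the class count is inferred rather than fixed in advance.
igmm = BayesianGaussianMixture(
    n_components=10,  # truncation level, not the class count
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=0.1,
    random_state=0,
).fit(known)

def classify(x, threshold=-12.0):
    """Assign x to a known component, or report 'unknown' if its
    likelihood under every learned component is too low."""
    x = np.atleast_2d(x)
    if igmm.score(x) < threshold:  # mean per-sample log-likelihood
        return "unknown"
    return int(igmm.predict(x)[0])

print(classify(rng.normal(0.0, 1.0, 4)))   # likely a known component
print(classify(rng.normal(50.0, 1.0, 4)))  # far from training data -> "unknown"
```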
On coding and resampling of video in 4:2:2 chroma format for cascaded coding applications
Andrea Gabriellini, M. Mrak
DOI: 10.1109/WIAMIS.2013.6616153 (https://doi.org/10.1109/WIAMIS.2013.6616153)
Published: 2013-07-03
Abstract: The 4:2:2 chroma format is widely used throughout the broadcasting chain, even though some parts of the chain require other formats (4:2:0 or 4:4:4). This paper presents an approach to coding video content in 4:2:2 chroma format using resampling of the chroma samples. All subsequent video coding operations are then carried out in the new chroma format. The choice of filter for resampling the reconstructed video signal is sent to the decoder in the compressed bit-stream. This paper investigates choices of resampling filters and coding parameters associated with the proposed approach, with the goal of minimising conversion losses. Coding performance of possible solutions is reported for two reversible resampling filter pairs when applied in the emerging HEVC video coding standard.
Citations: 1
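To make the resampling step concrete, here is a minimal sketch of vertical chroma resampling between 4:2:2 and 4:2:0. The [1 2 1]/4 smoothing and linear-interpolation taps are illustrative choices, not the reversible filter pairs evaluated in the paper.

```python
# Minimal sketch of the chroma resampling step studied in the paper.
import numpy as np

def downsample_422_to_420(chroma):
    """Halve the chroma height with a [1 2 1]/4 smoothing filter (illustrative)."""
    padded = np.pad(chroma, ((1, 1), (0, 0)), mode="edge").astype(np.float64)
    smoothed = (padded[:-2] + 2 * padded[1:-1] + padded[2:]) / 4.0
    return smoothed[::2]  # keep every other chroma line

def upsample_420_to_422(chroma):
    """Double the chroma height by linear interpolation (illustrative)."""
    h, w = chroma.shape
    out = np.empty((2 * h, w), dtype=np.float64)
    out[0::2] = chroma
    shifted = np.vstack([chroma[1:], chroma[-1:]])  # repeat last line at border
    out[1::2] = (chroma + shifted) / 2.0
    return out

cb = np.random.default_rng(1).integers(0, 256, (8, 4)).astype(np.float64)
roundtrip = upsample_420_to_422(downsample_422_to_420(cb))
print(np.abs(cb - roundtrip).mean())  # the conversion loss the paper aims to minimise
```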
Event-driven retrieval in collaborative photo collections
M. Brenner, E. Izquierdo
DOI: 10.1109/WIAMIS.2013.6616121 (https://doi.org/10.1109/WIAMIS.2013.6616121)
Published: 2013-07-03
Abstract: We present an approach to retrieving photos related to social events in collaborative photo collections. Compared to traditional approaches that typically consider only the visual features of photos as a source of information, we incorporate multiple additional contextual cues, such as date and time, location, and usernames, to improve retrieval performance. Experiments based on the MediaEval Social Event Detection Dataset demonstrate the effectiveness of our approach.
Citations: 5
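A minimal sketch of how such contextual cues might be fused with visual similarity follows. The weights, decay constants, and field names are assumptions made for illustration, not the paper's model.

```python
# Minimal sketch: combine visual, temporal, spatial, and user cues into one score.
import math

def event_score(query, photo, w_visual=0.4, w_time=0.3, w_geo=0.2, w_user=0.1):
    """Higher score = more likely the photo belongs to the queried event."""
    s_visual = photo["visual_sim"]  # e.g., from any visual matcher
    # Temporal proximity: decays with hours between capture times.
    dt_hours = abs(query["timestamp"] - photo["timestamp"]) / 3600.0
    s_time = math.exp(-dt_hours / 12.0)
    # Spatial proximity: decays with distance in km (rough degrees->km factor).
    dx = query["lat"] - photo["lat"]
    dy = query["lon"] - photo["lon"]
    s_geo = math.exp(-math.hypot(dx, dy) * 111.0 / 5.0)
    # Social cue: the uploader already contributed photos of this event.
    s_user = 1.0 if photo["user"] in query["event_users"] else 0.0
    return w_visual * s_visual + w_time * s_time + w_geo * s_geo + w_user * s_user

query = {"timestamp": 1_370_000_000, "lat": 41.39, "lon": 2.17,
         "event_users": {"alice", "bob"}}
photo = {"visual_sim": 0.7, "timestamp": 1_370_003_600,
         "lat": 41.40, "lon": 2.16, "user": "alice"}
print(round(event_score(query, photo), 3))
```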
Introducing motion information in dense feature classifiers
Claudiu Tanase, B. Mérialdo
DOI: 10.1109/WIAMIS.2013.6616132 (https://doi.org/10.1109/WIAMIS.2013.6616132)
Published: 2013-07-03
Abstract: Semantic concept detection in large-scale video collections is mostly achieved through static analysis of selected keyframes. A popular choice for representing the visual content of an image is based on the pooling of local descriptors such as Dense SIFT. However, simple motion features such as optic flow can be extracted relatively easily from such keyframes. In this paper we propose an efficient addition to the DSIFT approach by including information derived from optic flow. Based on the optic-flow magnitude, we can estimate for each DSIFT patch whether it is static or moving. We modify the bag-of-words model traditionally used with DSIFT by creating two separate occurrence histograms instead of one: one for static patches and one for dynamic patches. We further refine this method by studying different separation thresholds and soft assignment, as well as different normalization techniques. Classifier score fusion is used to maximize the average precision of all these variants. Experimental results on the TRECVID Semantic Indexing collection show that, by means of classifier fusion, our method increases the overall mean average precision of the DSIFT classifier from 0.061 to 0.106.
Citations: 2
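The core construction lends itself to a short sketch: each dense patch contributes to one of two bag-of-words histograms depending on its optic-flow magnitude. The hard threshold and L1 normalisation below are just one of the variants the abstract mentions, chosen here for illustration.

```python
# Minimal sketch of the static/dynamic dual-histogram bag-of-words model.
import numpy as np

def dual_bow_histogram(words, flow_mags, vocab_size, threshold=1.0):
    """words: visual-word index per dense patch; flow_mags: mean optic-flow
    magnitude per patch. Returns the concatenated static+dynamic histogram."""
    h_static = np.zeros(vocab_size)
    h_dynamic = np.zeros(vocab_size)
    for w, m in zip(words, flow_mags):
        if m < threshold:
            h_static[w] += 1   # patch barely moves between frames
        else:
            h_dynamic[w] += 1  # patch carries motion information
    hist = np.concatenate([h_static, h_dynamic])
    norm = np.linalg.norm(hist, 1)  # L1 normalisation (one option among several)
    return hist / norm if norm > 0 else hist

rng = np.random.default_rng(2)
words = rng.integers(0, 100, 500)      # e.g., quantised DSIFT descriptors
flow_mags = rng.exponential(0.8, 500)  # per-patch optic-flow magnitude
print(dual_bow_histogram(words, flow_mags, vocab_size=100).shape)  # (200,)
```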
Affine invariant salient patch descriptors for image retrieval
F. Isikdogan, A. A. Salah
DOI: 10.1109/WIAMIS.2013.6616136 (https://doi.org/10.1109/WIAMIS.2013.6616136)
Published: 2013-07-03
Abstract: Image description constitutes a major part of matching-based tasks in computer vision. The size of descriptors becomes more important for retrieval tasks on large datasets. In this paper, we propose a compact and robust image description algorithm for image retrieval, which consists of three main stages: salient patch extraction, affine-invariant feature computation over concentric elliptical tracks on the patch, and global feature incorporation. We evaluate the performance of our algorithm for region-based image retrieval and image reuse detection, a special case of image retrieval. We present a novel synthetic image reuse dataset, generated by superimposing objects on different background images with systematic transformations. Our results show that the proposed descriptor is effective for this problem.
Citations: 1
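A minimal sketch of pooling a statistic over concentric elliptical tracks on a patch follows. The ring geometry and the pooled statistic (mean intensity) are assumptions standing in for the paper's actual affine-invariant features.

```python
# Minimal sketch: pool a statistic over concentric elliptical rings of a patch.
import numpy as np

def elliptical_track_features(patch, a, b, n_tracks=4):
    """Mean intensity over n_tracks concentric elliptical rings.
    a, b: semi-axes of the outermost ellipse (adapted to the patch)."""
    h, w = patch.shape
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    # Normalised elliptical "radius" in [0, 1] inside the outer ellipse.
    r = np.sqrt(((xx - cx) / a) ** 2 + ((yy - cy) / b) ** 2)
    feats = []
    edges = np.linspace(0.0, 1.0, n_tracks + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        ring = patch[(r >= lo) & (r < hi)]
        feats.append(ring.mean() if ring.size else 0.0)
    # Pooling over whole rings discards orientation, which is what makes such a
    # description robust to in-plane rotation of the normalised patch.
    return np.array(feats)

patch = np.random.default_rng(3).random((32, 32))
print(elliptical_track_features(patch, a=16.0, b=10.0))
```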
Tapped delay multiclass support vector machines for industrial workflow recognition
Eftychios E. Protopapadakis, A. Doulamis, N. Doulamis
DOI: 10.1109/WIAMIS.2013.6616141 (https://doi.org/10.1109/WIAMIS.2013.6616141)
Published: 2013-07-03
Abstract: In this paper, a tapped-delay multiclass support vector machine scheme is used for supervised job classification, based on video data taken from a Nissan factory. The procedure is based on multiclass SVMs enhanced with the time dimension by incorporating additional information from the n previous frames, and it allows for user feedback when necessary. Such a methodology supports the visual supervision of industrial environments by providing essential information to supervisors and assisting their work.
Citations: 7
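A minimal sketch of the tapped-delay construction on synthetic data, using scikit-learn's SVC: frame t's feature vector is concatenated with those of the n previous frames before multiclass classification. All data and parameter choices are illustrative.

```python
# Minimal sketch: tapped-delay feature stacking followed by a multiclass SVM.
import numpy as np
from sklearn.svm import SVC

def tapped_delay(features, n_delay):
    """Stack each frame with its n_delay predecessors: (T, d) -> (T-n, (n+1)*d)."""
    T = features.shape[0]
    return np.hstack([features[i:T - n_delay + i] for i in range(n_delay + 1)])

rng = np.random.default_rng(4)
T, d, n = 300, 8, 2
frames = rng.normal(size=(T, d))  # per-frame visual features (synthetic)
labels = rng.integers(0, 3, T)    # workflow/job class per frame (synthetic)

X = tapped_delay(frames, n)
y = labels[n:]                    # label of the most recent frame in each stack

clf = SVC(kernel="rbf").fit(X, y)  # SVC handles multiclass via one-vs-one
print(clf.predict(X[:5]))
```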
An application framework for implicit sentiment human-centered tagging using attributed affect
K. C. Apostolakis, P. Daras
DOI: 10.1109/WIAMIS.2013.6616145 (https://doi.org/10.1109/WIAMIS.2013.6616145)
Published: 2013-07-03
Abstract: In this paper, a novel framework for implicit sentiment image tagging and retrieval is presented, based on the concept of attributed affect. The user's affective response is recorded and analyzed to provide an appropriate affective label, while eye gaze is monitored to identify the specific object depicted in the scene that is attributed as the cause of the user's current state of core affect. Through this procedure, automatic tagging of content, as well as retrieval based on personal preferences, is possible. Our experiments show that our framework successfully channels behavioral tags (in the form of affective labels) into the data tagging and retrieval loop, even when applied in the context of a cost-efficient, widely available hardware setup using a single low-resolution webcam mounted on a standard modern computer system.
Citations: 0
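A minimal sketch of the attributed-affect step: the gazed-at object region, rather than the whole image, receives the inferred affective label. The region representation, gaze input, and label below are hypothetical placeholders, not the framework's actual interfaces.

```python
# Minimal sketch: attribute the current affect label to the object under gaze.
from dataclasses import dataclass, field

@dataclass
class Region:
    name: str
    x0: float
    y0: float
    x1: float
    y1: float
    tags: list = field(default_factory=list)

def attribute_affect(regions, gaze_x, gaze_y, affect_label):
    """Tag the region under the gaze point with the current affect label."""
    for r in regions:
        if r.x0 <= gaze_x <= r.x1 and r.y0 <= gaze_y <= r.y1:
            r.tags.append(affect_label)
            return r
    return None  # gaze fell on the background; nothing is attributed

regions = [Region("dog", 10, 10, 120, 90), Region("car", 150, 40, 300, 160)]
hit = attribute_affect(regions, gaze_x=60, gaze_y=50, affect_label="amused")
print(hit.name, hit.tags)  # dog ['amused']
```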
Footstep detection and classification using distributed microphones
K. Nakadai, Yuta Fujii, S. Sugano
DOI: 10.1109/WIAMIS.2013.6616127 (https://doi.org/10.1109/WIAMIS.2013.6616127)
Published: 2013-07-03
Abstract: This paper addresses footstep detection and classification with multiple microphones distributed on the floor. We propose to introduce geometrical features for classification, such as the position and velocity of a sound source estimated by amplitude-based localization. Unlike conventional microphone-array techniques, this does not require precise inter-microphone time synchronization. To classify various types of sound events, we introduce four types of features: time-domain, spectral, and cepstral features in addition to the geometrical features. We constructed a prototype system for footstep detection and classification based on the proposed ideas, with eight microphones arranged in a 2-by-4 grid. Preliminary classification experiments showed that classification accuracy for four types of sound sources (a walking footstep, a running footstep, a handclap, and an utterance) remains above 70% even when the signal-to-noise ratio is as low as 0 dB. We also confirmed two advantages of the proposed footstep detection and classification: the proposed features can be applied to the classification of sound sources other than footsteps, and the multichannel approach further improves noise robustness by selecting the best microphone and providing geometrical information about the sound source.
Citations: 11
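A minimal sketch of amplitude-based localization, assuming the source position can be approximated by an energy-weighted centroid of the microphone positions (one simple reading of the abstract; the paper's estimator may differ). Note that this uses only per-channel energies, so no precise inter-microphone time synchronization is needed.

```python
# Minimal sketch: position and velocity features from per-microphone energies.
import numpy as np

# Eight microphones in a 2-by-4 grid on the floor (coordinates in metres).
mic_pos = np.array([[x, y] for y in (0.0, 1.0) for x in (0.0, 1.0, 2.0, 3.0)])

def localize(frame_energy):
    """frame_energy: per-microphone short-time energy for one frame.
    Returns the energy-weighted centroid of the microphone positions."""
    w = frame_energy / frame_energy.sum()
    return w @ mic_pos

# A footstep near (2.0, 0.3): closer microphones pick up more energy.
source = np.array([2.0, 0.3])
energy = 1.0 / (np.linalg.norm(mic_pos - source, axis=1) ** 2 + 0.1)
p1 = localize(energy)

# The next footstep, 0.5 m further along, 0.5 s later.
source2 = source + np.array([0.5, 0.0])
energy2 = 1.0 / (np.linalg.norm(mic_pos - source2, axis=1) ** 2 + 0.1)
p2 = localize(energy2)

print(p1, (p2 - p1) / 0.5)  # geometrical features: position and velocity
```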
A heuristic for distance fusion in cover song identification
Alessio Degani, M. Dalai, R. Leonardi, P. Migliorati
DOI: 10.1109/WIAMIS.2013.6616128 (https://doi.org/10.1109/WIAMIS.2013.6616128)
Published: 2013-07-03
Abstract: In this paper, we propose a method to integrate the results of different cover song identification algorithms into a single measure which, on average, gives better results than the initial algorithms. The fusion of the different distance measures is performed by projecting all the measures into a multi-dimensional space whose dimensionality equals the number of considered distances. In our experiments, we test two distance measures, the Dynamic Time Warping and the Qmax measure, applied in different combinations to two features: a Salience feature and a Harmonic Pitch Class Profile (HPCP). While the HPCP is meant to extract a purely harmonic description, the Salience feature allows melodic differences to be discerned more readily. It is shown that the combination of two or more distance measures improves the overall performance.
Citations: 14
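A minimal sketch of the fusion heuristic: each candidate song becomes a point in a space whose axes are the individual distance measures, and the fused distance is that point's norm. The z-score normalisation below is an assumption; the paper's exact projection may differ.

```python
# Minimal sketch: fuse several distance measures via a norm in distance space.
import numpy as np

def fuse_distances(D):
    """D: (n_candidates, n_measures) matrix of raw distances to the query.
    Returns one fused distance per candidate."""
    # Z-normalise each measure so no single scale dominates the fusion.
    Z = (D - D.mean(axis=0)) / (D.std(axis=0) + 1e-9)
    return np.linalg.norm(Z, axis=1)  # Euclidean norm in the distance space

rng = np.random.default_rng(5)
# Six candidates, four measures with very different scales
# (e.g., DTW and Qmax applied to HPCP and Salience features).
raw = rng.random((6, 4)) * [1.0, 100.0, 5.0, 0.1]
raw[2] *= 0.2  # candidate 2 is close under every measure: a likely cover
fused = fuse_distances(raw)
print(fused.argmin())  # best match under the fused distance
```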