{"title":"Emotional Speech Synthesis using Subspace Constraints in Prosody","authors":"Shinya Mori, T. Moriyama, S. Ozawa","doi":"10.1109/ICME.2006.262725","DOIUrl":"https://doi.org/10.1109/ICME.2006.262725","url":null,"abstract":"An efficient speech synthesis method that uses subspace constraint in prosody is proposed. Conventional unit selection methods concatenate speech segments stored in database, that require enormous number of waveforms in synthesizing various emotional expressions with arbitrary texts. The proposed method employs principal component analysis to reduce the dimensionality of prosodic components, that also allows us to generate new speech that are similar to training samples. The subspace constraint assures that the prosody of the synthesized speech including F0, power, and speech length hold their correlative relation that training samples of emotional speech have. We assume that the combination of the number of syllables and the accent type determines the correlative dynamics of prosody, for each of which we individually construct the subspace. The subspace is then linearly related to emotions by multiple regression analysis that are obtained by subjective evaluation for the training samples. Experimental results demonstrated that only 4 dimensions were sufficient for representing the prosodic changes due to emotion at over 90% of the total variance. Synthesized emotion were successfully recognized by the listeners of the synthesized speech, especially for \"anger\", \"surprise\", \"disgust\", 'sorrow\", \"boredom\", \"depression\", and \"joy\"","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114564792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementation and Evolution of Packet Striping for Media Streaming Over Multiple Burst-Loss Channels","authors":"Gene Cheung, P. Sharma, Sung-Ju Lee","doi":"10.1109/ICME.2006.262734","DOIUrl":"https://doi.org/10.1109/ICME.2006.262734","url":null,"abstract":"Modern mobile devices are multi-homed with WLAN and WWAN communication interfaces. In a community of nodes with such multi-homed devices-locally inter-connected via high-speed WLAN but each globally connected to larger networks via low-speed WWAN, striping high-volume traffic from remote large networks over a bundle of low speed WWAN links can overcome the bandwidth mismatch problem between WLAN and WWAN. In our previous work, we showed that a packet striping system for such multi-homed devices-a mapping of delay-sensitive packets by an intermediate gateway to multiple channels using combination of retransmissions (ARQ) and forward error corrections (FEC)-can dramatically enhance the overall performance. In this paper, we improve upon a previous algorithm in two respects. First, by introducing two-tier dynamic programming tables to memoize computed solutions, packet striping decisions translate to simple table lookup operations given stationary network statistics. Doing so drastically reduces striping operation complexity. Second, new weighting functions are introduced into the hybrid ARQ/FEC algorithm to drive the long-term striping system evolution away from pathological local minima that are far from the global optimum. Results show the new algorithm performs efficiently and gives improved performance by avoiding local minima compared to the previous algorithm","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121866665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Training-Oriented Video Shooting Navigation System Based on Real-Time Camerawork Evaluation","authors":"Masahito Kumano, K. Uehara, Y. Ariki","doi":"10.1109/ICME.2006.262772","DOIUrl":"https://doi.org/10.1109/ICME.2006.262772","url":null,"abstract":"In this paper, we propose an online training-oriented video shooting navigation system focused on camerawork based on video grammar by real-time camerawork evaluation to train users shooting nice shots for the later editing work. In this system, the processing speed must be very high so that we use a luminance projection correlation and a structure tensor method to extract the camerawork parameters in real-time. From the results of camerawork analysis, the results of each frame are classified into 7 camerawork types and the system issues 6 types of alarms and navigates users along the specified shot depending on camerawork based on video grammar in real-time while shooting the shot. Thereby, users can naturally acquire shooting style by trying to decrease alarms of improper camerawork without a consideration of the video grammar","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"199 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122009310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A High Throughput VLSI Architecture Design for H.264 Context-Based Adaptive Binary Arithmetic Decoding with Look Ahead Parsing","authors":"Yao-Chang Yang, Chien-Chang Lin, Hsui-Cheng Chang, Ching-Lung Su, Jiun-In Guo","doi":"10.1109/ICME.2006.262510","DOIUrl":"https://doi.org/10.1109/ICME.2006.262510","url":null,"abstract":"In this paper we present a high throughput VLSI architecture design for context-based adaptive binary arithmetic decoding (CABAD) in MPEG-4 AVC/H.264. To speed-up the inherent sequential operations in CABAD, we break down the processing bottleneck by proposing a look-ahead codeword parsing technique on the segmenting context tables with cache registers, which averagely reduces up to 53% of cycle count. Based on a 0.18 mum CMOS technology, the proposed design outperforms the existing design by both reducing 40% of hardware cost and achieving about 1.6 times data throughput at the same time","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116835148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Matching Faces with Textual Cues in Soccer Videos","authors":"M. Bertini, A. Bimbo, W. Nunziati","doi":"10.1109/ICME.2006.262444","DOIUrl":"https://doi.org/10.1109/ICME.2006.262444","url":null,"abstract":"In soccer videos, most significant actions are usually followed by close-up shots of players that take part in the action itself. Automatically annotating the identity of the players present in these shots would be considerably valuable for indexing and retrieval applications. Due to high variations in pose and illumination across shots however, current face recognition methods are not suitable for this task. We show how the inherent multiple media structure of soccer videos can be exploited to understand the players' identity without relying on direct face recognition. The proposed method is based on a combination of interest point detector to \"read\" textual cues that allow to label a player with its name, such as the number depicted on its jersey, or the superimposed text caption showing its name. Players not identified by this process are then assigned to one of the labeled faces by means of a face similarity measure, again based on the appearance of local salient patches. We present results obtained from soccer videos taken from various recent games between national teams","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128534441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalability in Human Shape Analysis","authors":"Thomas Fourès, P. Joly","doi":"10.1109/ICME.2006.262651","DOIUrl":"https://doi.org/10.1109/ICME.2006.262651","url":null,"abstract":"This paper proposes a new approach for the human motion analysis. The main contribution comes from the proposed representation of the human body. Most of already existing systems are based on a model. When this one is a priori known, it may not evolve automatically according to user needs, or to the detail level that is actually possible to extract, or to restrictions due to the processing time. In order to propose a more flexible system, a hierarchical representation of the human body is implemented. It aims at providing a multi-resolution description and results at different levels of accuracy. An explanation about the model construction and the method used to map it onto features extracted from an image sequence are presented. Relations between the different body limbs and some physical constraints are then integrated. The transition from a model level to the next one is also explained and results on frames coming from a video sequence give an illustration of the proposed strategy","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"1992 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128609682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Cross-Layered Peer-to-Peer Architecture for Wireless Mobile Networks","authors":"Mohammad Mursalin Akon, S. Naik, Ajit Singh, Xuemin Shen","doi":"10.1109/ICME.2006.262625","DOIUrl":"https://doi.org/10.1109/ICME.2006.262625","url":null,"abstract":"In this paper, we propose a novel peer-to-peer architecture for wireless mobile networks where a cross-layered gossip-like protocol is the heart of the architecture. The goal of this architecture is to reduce the bandwidth consumption and at the same time, to provide more user participation flexibility. Simulation results are given to demonstrate the performance of the proposed peer-to-peer architecture","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129277390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GPU Accelerated Inverse Photon Mapping for Real-Time Surface Reflectance Modeling","authors":"Takashi Machida, N. Yokoya, H. Takemura","doi":"10.1109/ICME.2006.262528","DOIUrl":"https://doi.org/10.1109/ICME.2006.262528","url":null,"abstract":"This paper investigates the problem of object surface reflectance modeling, which is sometimes referred to as inverse reflectometry, for photorealistic rendering and effective multimedia applications. A number of methods have been developed for estimating object surface reflectance properties in order to render real objects under arbitrary illumination conditions. However, it is still difficult to densely estimate surface reflectance properties in real-time. This paper describes a new method for real-time estimation of the non-uniform surface reflectance properties in the inverse rendering framework. Experiments are conducted in order to demonstrate the usefulness and the advantage of the proposed methods through comparative study","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123491495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video News Shot Labeling Refinement via Shot Rhythm Models","authors":"J. Kender, M. Naphade","doi":"10.1109/ICME.2006.262544","DOIUrl":"https://doi.org/10.1109/ICME.2006.262544","url":null,"abstract":"We present a three-step post-processing method for increasing the precision of video shot labels in the domain of television news. First, we demonstrate that news shot sequences can be characterized by rhythms of alternation (due to dialogue), repetition (due to persistent background settings), or both. Thus a temporal model is necessarily third-order Markov. Second, we demonstrate that the output of feature detectors derived from machine learning methods (in particular, from SVMs) can be converted into probabilities in a more effective way than two suggested existing methods. This is particularly true when detectors are errorful due to sparse training sets, as is common in this domain. Third, we demonstrate that a straightforward application of the Viterbi algorithm on a third-order FSM, constructed from observed transition probabilities and converted feature detector outputs, can refine feature label precision at little cost. We show that on a test corpus of TRECVID 2005 news videos annotated with 39 LSCOM-lite features, the mean increase in the measure of average precision (AP) was 4%, with some of the rarer and more difficult features having relative increases in AP of as much as 67%","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114623917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Source-Channel Decoding of Multiple Description Quantized and Variable Length Coded Markov Sequences","authors":"X. Wang, Xiaolin Wu","doi":"10.1109/ICME.2006.262808","DOIUrl":"https://doi.org/10.1109/ICME.2006.262808","url":null,"abstract":"This paper proposes a framework for joint source-channel decoding of Markov sequences that are encoded by an entropy coded multiple description quantizer (MDQ), and transmitted via a lossy network. This framework is particularly suited for lossy networks of inexpensive energy-deprived mobile source encoders. Our approach is one of maximum aposteriori probability (MAP) sequence estimation that exploits both the source memory and the correlation between different MDQ descriptions. The MAP problem is modeled and solved as one of the longest path in a weighted directed acyclic graph. For MDQ-compressed Markov sequences impaired by both bit errors and erasure errors, the proposed joint source-channel MAP decoder can achieve 5 dB higher SNR than the conventional hard-decision decoder. Furthermore, the new MDQ decoding technique unifies the treatments of different subsets of the K descriptions available at the decoder, circumventing the thorny issue of requiring up to 2K-1 MDQ side decoders","PeriodicalId":339258,"journal":{"name":"2006 IEEE International Conference on Multimedia and Expo","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116284559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}