Multi-modal service operation estimation using DNN-based acoustic bag-of-features

2015 23rd European Signal Processing Conference (EUSIPCO) | Pub Date: 2015-12-28 | DOI: 10.1109/EUSIPCO.2015.7362793

S. Tamura, Takuya Uno, Masanori Takehara, S. Hayamizu, T. Kurata


Citations: 0

Abstract



In service engineering, it is important to estimate what a worker did and when, because this information provides crucial evidence for improving service quality and working environments. For Service Operation Estimation (SOE), acoustic information is one of the key modalities; environmental and background sounds in particular contain effective cues. This paper focuses on two aspects: (1) extracting powerful and robust acoustic features using stacked denoising autoencoder and bag-of-features techniques, and (2) investigating a multi-modal SOE scheme that combines the audio features with other sensor data as well as non-sensor information. We conducted evaluation experiments using multi-modal data recorded in a restaurant. The proposed features improved SOE performance compared with conventional acoustic features, and the experiments also demonstrated the effectiveness of the multi-modal SOE scheme.
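The abstract gives no implementation details, so the following is only a rough sketch of what a bag-of-features representation with simple feature-level fusion could look like. It assumes frame-level feature vectors are already available (for example, from an autoencoder) and that a codebook has been learned separately (for example, by clustering). All function names, the toy data, and the choice of concatenation for fusion are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def nearest_codeword(frame: Sequence[float],
                     codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codeword closest to `frame`
    (squared Euclidean distance; ties go to the lowest index)."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(frame, codebook[k]))


def bag_of_features(frames: Sequence[Sequence[float]],
                    codebook: Sequence[Sequence[float]]) -> List[float]:
    """Quantize every frame to its nearest codeword and return the
    normalized histogram of codeword counts for the segment."""
    counts = [0] * len(codebook)
    for frame in frames:
        counts[nearest_codeword(frame, codebook)] += 1
    total = len(frames)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]


def fuse(acoustic_hist: Sequence[float],
         other_features: Sequence[float]) -> List[float]:
    """Assumed fusion scheme: plain concatenation of the acoustic
    histogram with features from other sources."""
    return list(acoustic_hist) + list(other_features)


# Toy data: hypothetical 2-D frame features standing in for learned features.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
frames = [[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [3.9, 4.2]]

hist = bag_of_features(frames, codebook)
# Hypothetical non-acoustic features, e.g. a one-hot location indicator.
segment_vector = fuse(hist, [1.0, 0.0])
```

The resulting `segment_vector` would then be passed to whatever classifier estimates the service operation; the histogram makes segments of different lengths comparable because it has a fixed dimension equal to the codebook size.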

