A Dataset for Foreground Speech Analysis With Smartwatches In Everyday Home Environments

Dawei Liang, Zifan Xu, Yinuo Chen, Rebecca Adaimi, David F. Harwath, Edison Thomaz

2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)
Published: 2023-06-04
DOI: 10.1109/ICASSPW59220.2023.10192949
Citations: 0
Abstract
Acoustic sensing has proved effective as a foundation for applications in health and human behavior analysis. In this work, we focus on detecting in-person social interactions in naturalistic settings from audio captured by a smartwatch. As a first step, it is critical to distinguish the speech of the individual wearing the watch (foreground speech) from all other sounds nearby, such as speech from other individuals and ambient sounds. Given the considerable burden of collecting and annotating real-world training data and the lack of existing online data resources, this paper introduces a dataset for foreground speech detection of users wearing a smartwatch. The data is collected from 39 participants interacting with family members in real homes. We then present a benchmark study for the dataset with different test setups. Furthermore, we explore a model-free heuristic method to identify foreground instances based on transfer learning embeddings.
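The abstract does not spell out how the model-free heuristic uses the transfer-learning embeddings. Purely as an illustration of how such a heuristic could work, the sketch below thresholds the cosine similarity between per-frame audio embeddings and an enrollment embedding of the watch wearer. Every name, dimension, and threshold here is an assumption for demonstration, not the paper's actual method, and the toy 3-dimensional vectors stand in for real audio embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def detect_foreground(frame_embeddings, wearer_embedding, threshold=0.7):
    """Label each audio frame as foreground (wearer) speech when its
    embedding lies close to the wearer's enrollment embedding.
    No model is trained; the decision is a fixed similarity threshold."""
    return [cosine(e, wearer_embedding) >= threshold
            for e in frame_embeddings]

# Toy example: 3-dim embeddings standing in for real ones.
wearer = [1.0, 0.0, 0.0]
frames = [
    [0.9, 0.1, 0.0],  # near the wearer embedding -> foreground
    [0.0, 1.0, 0.0],  # a different speaker      -> background
    [0.8, 0.0, 0.2],  # near the wearer again    -> foreground
]
print(detect_foreground(frames, wearer))  # [True, False, True]
```

In practice the enrollment embedding might be averaged over a few seconds of known wearer speech, and the threshold tuned on held-out data; both choices are hypothetical here.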