2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW): Latest Papers

Two-Point Neurons for Efficient Multimodal Speech Enhancement
M. Raza, Khubaib Ahmed, Junaid Muzaffar, Ahsan Adeel
{"title":"Two-Point Neurons for Efficient Multimodal Speech Enhancement","authors":"M. Raza, Khubaib Ahmed, Junaid Muzaffar, Ahsan Adeel","doi":"10.1109/ICASSPW59220.2023.10193457","DOIUrl":"https://doi.org/10.1109/ICASSPW59220.2023.10193457","url":null,"abstract":"Here we present a two-point neuron-inspired deep convolutional net (DCN) with 18 convolutional layers for multimodal speech enhancement (MM-SE) and compare it against conventional point neuron-inspired DCN in terms of Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI). We show that the two-point neuron-driven DCN performs comparably to point-neurons driven DCN by using only ≈0.2% neurons at any time during training.","PeriodicalId":158726,"journal":{"name":"2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132829459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Ensemble Methods For Enhanced Covid-19 CT Scan Severity Analysis
A. Thyagachandran, H. Murthy
{"title":"Ensemble Methods For Enhanced Covid-19 CT Scan Severity Analysis","authors":"A. Thyagachandran, H. Murthy","doi":"10.1109/ICASSPW59220.2023.10193538","DOIUrl":"https://doi.org/10.1109/ICASSPW59220.2023.10193538","url":null,"abstract":"Computed Tomography (CT) scans provide a high-resolution image of the lungs, allowing clinicians to identify the severity of infections in COVID-19 patients. This paper presents a domain knowledge-based pipeline for extracting infection regions from COVID-19 patients using a combination of image-processing algorithms and a pre-trained UNET model. Then, an infection rate-based feature vector is generated for each CT scan. The infection severity is then classified into four categories using an ensemble of three machine-learning models: Random Forest, Support Vector Machines, and Extremely Randomized Trees. The proposed system is evaluated on the validation and test datasets with a macro F1 score of 58% and 46.31%, respectively. Our proposed model has achieved 3rd place in the severity detection challenge as part of the IEEE ICASSP 2023: AI-enabled Medical Image Analysis Workshop and COVID-19 Diagnosis Competition (AI-MIA-COV19D). The implementation of the proposed system is available at https://github.com/aanandt/Enhancing-COVID19-Severity-Analysis-through-Ensemble-Methods.git","PeriodicalId":158726,"journal":{"name":"2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)","volume":"121 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131932613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Robust Reflective Beamforming For Non-Terrestrial Networks Under Thermal Deformations
D. Rakhimov, Bile Peng, Eduard Axel Jorswieck, M. Haardt
{"title":"Robust Reflective Beamforming For Non-Terrestrial Networks Under Thermal Deformations","authors":"D. Rakhimov, Bile Peng, Eduard Axel Jorswieck, M. Haardt","doi":"10.1109/ICASSPW59220.2023.10193299","DOIUrl":"https://doi.org/10.1109/ICASSPW59220.2023.10193299","url":null,"abstract":"In this paper, we present a beamforming method that is robust against thermal deformations for non-terrestrial reconfigurable intelligent surfaces (RIS). We analytically derive the expressions for the worst-case bound on perturbations of the covariance matrix and the corresponding steering vectors as functions of possible displacements of RIS elements. We apply these bounds during the optimization procedure to find the beamforming coefficients that are robust to thermal deformations. Moreover, we present a simple heuristic to obtain the constant modulus beamforming coefficients from the optimal beamforming via an array thinning operation. The simulation results confirm the robustness of the proposed solution against random but bounded perturbations caused by thermal deformations of the reflective surface.","PeriodicalId":158726,"journal":{"name":"2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115765038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Dataset for Foreground Speech Analysis With Smartwatches In Everyday Home Environments
Dawei Liang, Zifan Xu, Yinuo Chen, Rebecca Adaimi, David F. Harwath, Edison Thomaz
{"title":"A Dataset for Foreground Speech Analysis With Smartwatches In Everyday Home Environments","authors":"Dawei Liang, Zifan Xu, Yinuo Chen, Rebecca Adaimi, David F. Harwath, Edison Thomaz","doi":"10.1109/ICASSPW59220.2023.10192949","DOIUrl":"https://doi.org/10.1109/ICASSPW59220.2023.10192949","url":null,"abstract":"Acoustic sensing has proved effective as a foundation for applications in health and human behavior analysis. In this work, we focus on detecting in-person social interactions in naturalistic settings from audio captured by a smartwatch. As a first step, it is critical to distinguish the speech of the individual wearing the watch (foreground speech) from all other sounds nearby, such as speech from other individuals and ambient sounds. Given the considerable burden of collecting and annotating real-world training data and the lack of existing online data resources, this paper introduces a dataset for foreground speech detection of users wearing a smartwatch. The data is collected from 39 participants interacting with family members in real homes. We then present a benchmark study for the dataset with different test setups. Furthermore, we explore a model-free heuristic method to identify foreground instances based on transfer learning embeddings.","PeriodicalId":158726,"journal":{"name":"2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115129508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Frequency Asynchronous Noma In LEO Satellite Communication Systems
Joohyun Son, Jehyun Heo, Hyunwook Lee, Seungwoo Sung, Minchul Hong, Hanwoong Kim, Gayeon Ahn, D. Hong
{"title":"Frequency Asynchronous Noma In LEO Satellite Communication Systems","authors":"Joohyun Son, Jehyun Heo, Hyunwook Lee, Seungwoo Sung, Minchul Hong, Hanwoong Kim, Gayeon Ahn, D. Hong","doi":"10.1109/ICASSPW59220.2023.10193732","DOIUrl":"https://doi.org/10.1109/ICASSPW59220.2023.10193732","url":null,"abstract":"In this paper, frequency asynchronous non-orthogonal multiple access (FA-NOMA) is applied to the low earth orbit (LEO) satellite communication (SatCom) system. There have been attempts to employ power domain NOMA (P-NOMA) to support large numbers of users in SatCom systems. However, P-NOMA is not beneficial in LEO SatCom environments where the power difference between users is small. In comparison, FA-NOMA utilizes intentional frequency offsets rather than environmental characteristics, making it a suitable technique for supporting large numbers of users in the LEO satellite environments.","PeriodicalId":158726,"journal":{"name":"2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)","volume":"313 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115445759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Machine Translation to Sign Language Using Post-Translation Replacement Without Placeholders
Taro Miyazaki, Naoki Nakatani, Tsubasa Uchida, H. Kaneko, Masanori Sano
{"title":"Machine Translation to Sign Language Using Post-Translation Replacement Without Placeholders","authors":"Taro Miyazaki, Naoki Nakatani, Tsubasa Uchida, H. Kaneko, Masanori Sano","doi":"10.1109/ICASSPW59220.2023.10193419","DOIUrl":"https://doi.org/10.1109/ICASSPW59220.2023.10193419","url":null,"abstract":"Sign language is typically the first language for those who are born deaf or who lose their hearing in early childhood. To provide important information for these individuals, it is better to use sign language than to transcribe spoken languages. We have been developing a system that translates Japanese into Japanese Sign Language (JSL) and then generates computer graphics (CG) animation of JSL. In this paper, we propose a machine translation method for translating Japanese into JSL. The proposed method is based on an encoder-decoder model that utilizes a pre-trained model as the encoder, and the proper names in the translation result are revised using a dictionary by means of a post-translation replacement method without placeholders. Our experimental results demonstrate that using the pre-trained model as the encoder and performing the post-translation replacement of proper names both contributed to improving the translation quality.","PeriodicalId":158726,"journal":{"name":"2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122990404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Survey of Datasets, Applications, and Models for IMU Sensor Signals
Aparajita Saraf, Seungwhan Moon, Andrea Madotto
{"title":"A Survey of Datasets, Applications, and Models for IMU Sensor Signals","authors":"Aparajita Saraf, Seungwhan Moon, Andrea Madotto","doi":"10.1109/ICASSPW59220.2023.10193365","DOIUrl":"https://doi.org/10.1109/ICASSPW59220.2023.10193365","url":null,"abstract":"Inertial Measurement Units (IMUs) are small, low-cost sensors that can measure accelerations and angular velocities, making them valuable tools for a variety of applications, including robotics, virtual reality, and healthcare. With the advent of deep learning, there has been a surge of interest in using IMU data to train DNN models for various applications. In this paper, we survey the state-of-the-art ML models including deep neural network models and applications for IMU sensors. We first provide an overview of IMU sensors and the types of data they generate. We then review the most popular models for IMU data, including convolutional neural networks, recurrent neural networks, and attention-based models. We also discuss the challenges associated with training deep neural networks on IMU data, such as data scarcity, noise, and sensor drift. Finally, we present a comprehensive review of the most prominent applications of deep neural networks for IMU data, including human activity recognition, gesture recognition, gait analysis, and fall detection. Overall, this survey provides a comprehensive overview of the state-of-the-art deep neural network models and applications for IMU sensors and highlights the challenges and opportunities in this rapidly evolving field.","PeriodicalId":158726,"journal":{"name":"2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125153490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Motion Editing Tool for Reproducing Grammatical Elements of Japanese Sign Language Avatar Animation
Tsubasa Uchida, Naoki Nakatani, Taro Miyazaki, H. Kaneko, Masanori Sano
{"title":"Motion Editing Tool for Reproducing Grammatical Elements of Japanese Sign Language Avatar Animation","authors":"Tsubasa Uchida, Naoki Nakatani, Taro Miyazaki, H. Kaneko, Masanori Sano","doi":"10.1109/ICASSPW59220.2023.10193198","DOIUrl":"https://doi.org/10.1109/ICASSPW59220.2023.10193198","url":null,"abstract":"For deaf and hard of hearing people whose native language is sign language, it is necessary to provide information not only with subtitles but also with sign language. One means of providing information in sign language is to use animated avatars. Therefore, we have developed a system that generates a Japanese Sign Language (JSL) avatar animation from Japanese sentences utilizing Japanese-to-JSL translation and a motion-data-driven animation generation method. In this paper, we propose a motion editing tool that can adjust grammatical elements of JSL by editing each motion data forming a JSL sentence. An evaluation experiment shows that editing the motion speed and blend span of multiple words corresponding to the delimitation of phrases and clauses can reduce the error rate of understanding JSL avatar animations.","PeriodicalId":158726,"journal":{"name":"2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125784347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Smart Selection of Useful Insights from Wearables
Allmin Pradhap Singh Susaiyah, Aki Härmä, Simone Balloccu, E. Reiter, M. Petkovic
{"title":"Smart Selection of Useful Insights from Wearables","authors":"Allmin Pradhap Singh Susaiyah, Aki Härmä, Simone Balloccu, E. Reiter, M. Petkovic","doi":"10.1109/ICASSPW59220.2023.10193140","DOIUrl":"https://doi.org/10.1109/ICASSPW59220.2023.10193140","url":null,"abstract":"The popularity of wearable-devices equipped with inertial measurement units (IMUs) and optical sensors has increased in recent years. These sensors provide valuable activity and heart-rate data that, when analysed across multiple users and over time, can offer profound insights into individual lifestyle habits. However, the high dimensionality of such data and user preference dynamics present significant challenges for mining useful insights. This paper proposes a novel approach that employs natural language processing to mine insights from wearable-data, utilising a neural network model that leverages end-to-end feedback from users. Results demonstrate that this approach effectively increased daily step counts among users, showcasing the potential of this method for optimising health and wellness outcomes.","PeriodicalId":158726,"journal":{"name":"2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130108538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Lung Segmentation Enhances COVID-19 Detection
R. Turnbull
{"title":"Lung Segmentation Enhances COVID-19 Detection","authors":"R. Turnbull","doi":"10.1109/ICASSPW59220.2023.10193492","DOIUrl":"https://doi.org/10.1109/ICASSPW59220.2023.10193492","url":null,"abstract":"Improving automated analysis of medical imaging will provide clinicians more options in providing care for patients. The 2023 AI-enabled Medical Image Analysis Workshop and Covid-19 Diagnosis Competition (AI-MIA-COV19D) provides an opportunity to test and refine machine learning methods for detecting the presence and severity of COVID-19 in patients from CT scans. This paper presents version 2 of Cov3d, a deep learning model submitted in the 2022 competition. The model has been improved through a preprocessing step which segments the lungs in the CT scan and crops the input to this region. It results in a macro F1 score of 84.92% for predicting the presence of COVID-19 in the CT scans on the test dataset which came second place in the competition. The model achieved a macro F1 score of 59.06% on the test dataset for predicting the severity of COVID-19 which was the best performing model for that task of the competition.","PeriodicalId":158726,"journal":{"name":"2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130130743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0