ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Papers

Tail Classes Matter: Long-Tailed Object Detection Revisited
Yinglu Zhang, Chenbo Zhang, Lu Zhang, Tianying Liu, J. Guan, Xinkai Liang, Jiajia Zhao, Shuigeng Zhou
DOI: https://doi.org/10.1109/icassp48485.2024.10446683
Published: 2024-04-14
Citations: 0
Taming Prompt-Based Data Augmentation for Long-Tailed Extreme Multi-Label Text Classification
Pengyu Xu, Mingyang Song, Ziyi Li, Sijin Lu, Liping Jing, Jian Yu
DOI: https://doi.org/10.1109/icassp48485.2024.10446315
Published: 2024-04-14
Citations: 0
Invariant Motion Representation Learning for 3D Talking Face Synthesis
Jiyuan Liu, Wenping Wei, Zhendong Li, Guanfeng Li, Hao Liu
DOI: https://doi.org/10.1109/icassp48485.2024.10446379
Published: 2024-04-14
Citations: 0
Stereophonic Music Source Separation with Spatially-Informed Bridging Band-Split Network
Yichen Yang, Haowen Li, Xianrui Wang, Wen Zhang, Shoji Makino, Jingdong Chen
DOI: https://doi.org/10.1109/icassp48485.2024.10446287
Published: 2024-04-14
Citations: 0
Multi-Layer Relation Knowledge Distillation For Fingerprint Restoration
Yu-Min Chiu, Ching-Te Chiu, Dao-Heng Luo
DOI: https://doi.org/10.1109/icassp48485.2024.10446081
Published: 2024-04-14
Citations: 0
Comparing data-Driven and Handcrafted Features for Dimensional Emotion Recognition
Bogdan Vlasenko, Sargam Vyas, Mathew Magimai.-Doss
Abstract: Speech Emotion Recognition (SER) has garnered significant attention over the past two decades. In the early stages of SER technology, 'brute force'-based techniques led to a significant expansion in knowledge-based acoustic feature representation (FR) for modeling sparse emotional data. However, as deep learning techniques have become more powerful, their direct application has been limited by the scarcity of well-annotated emotional data. As a result, pre-trained neural embeddings on large speech corpora have gained popularity for SER tasks. These embeddings leverage existing transfer learning methods suitable for general-purpose self-supervised learning (SSL) representations. Recent studies on downstream SSL techniques for dimensional SER have shown promising results. In this research, we aim to evaluate the emotion-discriminative characteristics of neural embeddings in general cases (out-of-domain) and when fine-tuned for SER (in-domain). Given that most SSL techniques are pre-trained primarily on English speech, we plan to use speech emotion corpora in both language-matched and mismatched conditions. We will assess the discriminative characteristics of both handcrafted and standalone neural embeddings as FRs.
DOI: https://doi.org/10.1109/icassp48485.2024.10446134
Published: 2024-04-14
Citations: 0
Quantifying The Effect Of Simulator-Based Data Augmentation For Speech Recognition On Augmented Reality Glasses
Riku Arakawa, Mathieu Parvaix, Chiong Lai, Hakan Erdogan, Alex Olwal
DOI: https://doi.org/10.1109/icassp48485.2024.10446544
Published: 2024-04-14
Citations: 0
Deep Residual W-Unit Learning with Semantic Embedding for Automatic Pulmonary CT Artery-Vein Separation
Hao Qi, Ming Wu, Sunkui Ke, Xiangxing Chen, Hui-Qing Zeng, Yinran Chen, Xióngbiao Luó
DOI: https://doi.org/10.1109/icassp48485.2024.10448498
Published: 2024-04-14
Citations: 0
NERF-GAZE: A Head-Eye Redirection Parametric Model for Gaze Estimation
Pengwei Yin, Jingjing Wang, Jiawu Dai, Xiaojun Wu
DOI: https://doi.org/10.1109/icassp48485.2024.10446677
Published: 2024-04-14
Citations: 1
DONE: Dynamic Neural Representation Via Hyperplane Neural ODE
Jiaxu Wang, Bo Xu, Hao Cheng, Renjing Xu
DOI: https://doi.org/10.1109/icassp48485.2024.10446247
Published: 2024-04-14
Citations: 0