基于先验知识的说话人划分与连接识别共同说话人

Tsun-Yat Leung, Lahiru Samarakoon, Albert Y. S. Lam
{"title":"基于先验知识的说话人划分与连接识别共同说话人","authors":"Tsun-Yat Leung, Lahiru Samarakoon, Albert Y. S. Lam","doi":"10.1109/ASRU46091.2019.9003731","DOIUrl":null,"url":null,"abstract":"Speaker Diarization and Linking discovers “who spoke when” across recordings without any speaker enrollment. Diarization is performed on each recording separately, and the linking combines clusters of the same speaker across recordings. It is a two-step approach, however it suffers from propagating the error from diarization step to the linking step. In a situation where a unique speaker appears in a given set of recordings, this paper aims at locating the common speaker using the prior knowledge of his or her existence. That means there is no enrollment data for this common speaker. We propose Pairwise Common Speaker Identification (PCSI) method that takes the existence of a common speaker into account in contrast to the two-step approach. We further show that PCSI can be used to reduce the errors that are introduced in the diarization step of the two-step approach. Our experiments are performed on a corpus synthesised from the AMI corpus and also on a in-house conversational telephony Sichuanese corpus that is mixed with Mandarin. We show up to 7.68% relative improvements of time-weighted equal error rate over a state-of-art x-vector diarization and linking system.","PeriodicalId":150913,"journal":{"name":"2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)","volume":"108-109 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Incorporating Prior Knowledge into Speaker Diarization and Linking for Identifying Common Speaker\",\"authors\":\"Tsun-Yat Leung, Lahiru Samarakoon, Albert Y. S. Lam\",\"doi\":\"10.1109/ASRU46091.2019.9003731\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Speaker Diarization and Linking discovers “who spoke when” across recordings without any speaker enrollment. Diarization is performed on each recording separately, and the linking combines clusters of the same speaker across recordings. It is a two-step approach, however it suffers from propagating the error from diarization step to the linking step. In a situation where a unique speaker appears in a given set of recordings, this paper aims at locating the common speaker using the prior knowledge of his or her existence. That means there is no enrollment data for this common speaker. We propose Pairwise Common Speaker Identification (PCSI) method that takes the existence of a common speaker into account in contrast to the two-step approach. We further show that PCSI can be used to reduce the errors that are introduced in the diarization step of the two-step approach. Our experiments are performed on a corpus synthesised from the AMI corpus and also on a in-house conversational telephony Sichuanese corpus that is mixed with Mandarin. We show up to 7.68% relative improvements of time-weighted equal error rate over a state-of-art x-vector diarization and linking system.\",\"PeriodicalId\":150913,\"journal\":{\"name\":\"2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)\",\"volume\":\"108-109 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASRU46091.2019.9003731\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU46091.2019.9003731","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

扬声器拨号和链接发现“谁说什么时候”在录音中没有任何扬声器登记。在每个录音上分别执行拨号,并且连接将同一扬声器的录音组合在一起。这是一种两步方法,但它的缺点是将误差从初始化步骤传播到链接步骤。在一组给定的录音中出现一个独特的说话人的情况下,本文旨在利用他或她存在的先验知识来定位共同的说话人。这意味着没有这个普通讲话者的注册数据。我们提出了双共同说话人识别(PCSI)方法,该方法考虑了共同说话人的存在,而不是两步方法。我们进一步表明,PCSI可以用来减少在两步方法的双化步骤中引入的误差。我们的实验是在一个由AMI语料库合成的语料库上进行的,也在一个内部电话会话四川语料库上进行的,该语料库与普通话混合。结果表明,与最先进的x向量化和链接系统相比,时间加权等错误率的相对提高高达7.68%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Incorporating Prior Knowledge into Speaker Diarization and Linking for Identifying Common Speaker
Speaker Diarization and Linking discovers “who spoke when” across recordings without any speaker enrollment. Diarization is performed on each recording separately, and the linking combines clusters of the same speaker across recordings. It is a two-step approach, however it suffers from propagating the error from diarization step to the linking step. In a situation where a unique speaker appears in a given set of recordings, this paper aims at locating the common speaker using the prior knowledge of his or her existence. That means there is no enrollment data for this common speaker. We propose Pairwise Common Speaker Identification (PCSI) method that takes the existence of a common speaker into account in contrast to the two-step approach. We further show that PCSI can be used to reduce the errors that are introduced in the diarization step of the two-step approach. Our experiments are performed on a corpus synthesised from the AMI corpus and also on a in-house conversational telephony Sichuanese corpus that is mixed with Mandarin. We show up to 7.68% relative improvements of time-weighted equal error rate over a state-of-art x-vector diarization and linking system.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信