Logistic Similarity Metric Learning via Affinity Matrix for Text-Independent Speaker Verification

Junyi Peng, Rongzhi Gu, Yuexian Zou
2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), December 2019. DOI: 10.1109/ASRU46091.2019.9003995

Abstract

This paper proposes a novel objective function, called Logistic Affinity Loss (Logistic-AL), to optimize the end-to-end speaker verification model. Specifically, firstly, the cosine similarities of all pairs in a mini-batch of speaker embeddings are passed through a learnable logistic regression layer and the probability estimation of all pairs is obtained. Then, the supervision information for each pair is formed by their corresponding one-hot speaker labels, which indicates whether the pair belongs to the same speaker. Finally, the model is optimized by the binary cross entropy between predicted probability and target. In contrast to the other distance metric learning methods that push the distance of similar/dissimilar pairs to a pre-defined target, Logistic-AL builds a learnable decision boundary to distinguish the similar pairs and dissimilar pairs. Experimental results on the VoxCeleb1 dataset show that the x-vector feature extractor optimized by Logistic-AL achieves state-of-the-art performance.
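The abstract's pipeline — pairwise cosine similarities from a mini-batch, a learnable logistic regression layer over those similarities, and binary cross entropy against same-speaker targets — can be sketched as follows. This is a minimal NumPy illustration of the loss computation only, not the paper's implementation: the scalar scale `w` and bias `b` stand in for the learnable logistic layer's parameters, and the values used here are purely illustrative (in training they would be optimized jointly with the embedding network).

```python
import numpy as np

def logistic_affinity_loss(embeddings, labels, w=10.0, b=-5.0):
    """Sketch of Logistic-AL: BCE between sigmoid(w * cos_sim + b) and
    pairwise same-speaker targets. w and b model the learnable logistic
    regression layer; here they are fixed for illustration."""
    # L2-normalize rows so the Gram matrix holds cosine similarities
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = e @ e.T                                   # affinity matrix of all pairs
    # Supervision: 1 if a pair shares a speaker label, else 0
    target = (labels[:, None] == labels[None, :]).astype(float)
    # Logistic regression layer -> probability that a pair is same-speaker
    prob = 1.0 / (1.0 + np.exp(-(w * sim + b)))
    eps = 1e-12                                     # numerical safety for log
    bce = -(target * np.log(prob + eps) + (1.0 - target) * np.log(1.0 - prob + eps))
    return bce.mean()
```

With well-separated embeddings and consistent labels the loss is near zero, while mislabeled pairs drive it up; the decision boundary between similar and dissimilar pairs is set by where `w * sim + b` crosses zero, which is what the learnable layer adjusts instead of pushing similarities toward a fixed target.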