Towards Controlling False Alarm - Miss Trade-Off in Perceptual Speaker Comparison via Non-Neutral Listening Task Framing

Rosa González Hautamäki, T. Kinnunen
{"title":"Towards Controlling False Alarm - Miss Trade-Off in Perceptual Speaker Comparison via Non-Neutral Listening Task Framing","authors":"Rosa González Hautamäki, T. Kinnunen","doi":"10.1109/ASRU46091.2019.9003978","DOIUrl":null,"url":null,"abstract":"Speaker comparison by listening is a valuable resource, for instance, in human voice discrimination studies, and voice conversion (VC) systems evaluations. Usually, listeners are provided with application-neutral guidelines that encourage retaining overall high speaker discrimination accuracy. Nonetheless, listeners are subject to misses (declaring same-speaker trial as different-speaker) and false alarms (vice versa) with possibly non-symmetric outcomes. In automatic speaker verification (ASV) applications, the consequences of a miss and a false alarm are rarely equal, and decision making policy is adjusted towards a given application with a desired miss/false alarm trade-off. We study whether listener decisions could similarly be controlled to provoke more accept (or reject) decisions, by framing the voice comparison task in different ways. Our neutral, forensic, user-convenient bank and secure bank scenarios are played by disjoint panels (through Amazon's Mechanical Turk), all judging the same speaker trials originated from RedDots and 2018 Voice Conversion Challenge (VCC 2018) data. Our results indicate that listener decisions can be influenced by modifying the task framing. As a subjective task, the challenge is how to drive the panel decisions to the desired direction (to reduce miss or false alarm rate). Our preliminary results suggest potential for novel, application-directed speaker discrimination designs.","PeriodicalId":150913,"journal":{"name":"2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU46091.2019.9003978","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Speaker comparison by listening is a valuable resource, for instance, in human voice discrimination studies, and voice conversion (VC) systems evaluations. Usually, listeners are provided with application-neutral guidelines that encourage retaining overall high speaker discrimination accuracy. Nonetheless, listeners are subject to misses (declaring same-speaker trial as different-speaker) and false alarms (vice versa) with possibly non-symmetric outcomes. In automatic speaker verification (ASV) applications, the consequences of a miss and a false alarm are rarely equal, and decision making policy is adjusted towards a given application with a desired miss/false alarm trade-off. We study whether listener decisions could similarly be controlled to provoke more accept (or reject) decisions, by framing the voice comparison task in different ways. Our neutral, forensic, user-convenient bank and secure bank scenarios are played by disjoint panels (through Amazon's Mechanical Turk), all judging the same speaker trials originated from RedDots and 2018 Voice Conversion Challenge (VCC 2018) data. Our results indicate that listener decisions can be influenced by modifying the task framing. As a subjective task, the challenge is how to drive the panel decisions to the desired direction (to reduce miss or false alarm rate). Our preliminary results suggest potential for novel, application-directed speaker discrimination designs.
利用非中立聆听任务框架控制感知说话人比较中的虚警缺失权衡
通过听来比较说话者是一种宝贵的资源,例如,在人类声音识别研究和语音转换(VC)系统评估中。通常,为听众提供了与应用无关的指导方针,以鼓励保持总体上较高的说话人识别准确性。尽管如此,听众还是有可能出现漏听(将同一讲话者试验声明为不同讲话者试验)和假警报(反之亦然),结果可能不对称。在自动说话人验证(ASV)应用中,漏报和误报的后果很少是相等的,决策制定策略根据给定的应用进行调整,以实现期望的漏报/误报权衡。我们通过以不同的方式构建语音比较任务,研究听者的决定是否可以同样被控制,以激发更多的接受(或拒绝)决定。我们的中立、法医学、用户便捷的银行和安全银行场景由不相关的面板(通过亚马逊的土耳其机器人)扮演,所有这些面板都对来自RedDots和2018年语音转换挑战(VCC 2018)数据的相同扬声器试验进行判断。我们的研究结果表明,听者的决策可以通过修改任务框架来影响。作为一项主观任务,挑战是如何将面板决策驱动到期望的方向(以减少误报或误报率)。我们的初步结果表明,有可能出现新颖的、应用导向的说话人识别设计。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信