语音生物识别研究中伦理挑战的数据视角

IF 5

IEEE transactions on biometrics, behavior, and identity science Pub Date : 2024-08-21 DOI:10.1109/TBIOM.2024.3446846

Anna Leschanowsky;Casandra Rusti;Carolyn Quinlan;Michaela Pnacek;Lauriane Gorce;Wiebke Hutiri

{"title":"语音生物识别研究中伦理挑战的数据视角","authors":"Anna Leschanowsky;Casandra Rusti;Carolyn Quinlan;Michaela Pnacek;Lauriane Gorce;Wiebke Hutiri","doi":"10.1109/TBIOM.2024.3446846","DOIUrl":null,"url":null,"abstract":"Speaker recognition technology, deployed in sectors like banking, education, recruitment, immigration, law enforcement, and healthcare, relies heavily on biometric data. However, the ethical implications and biases inherent in the datasets driving this technology have not been fully explored. Through a longitudinal study of close to 700 papers published at the ISCA Interspeech Conference in the years 2012 to 2021, we investigate how dataset use has evolved alongside the widespread adoption of deep neural networks. Our study identifies the most commonly used datasets in the field and examines their usage patterns. The analysis reveals significant shifts in data practices since the advent of deep learning: a small number of datasets dominate speaker recognition training and evaluation, and the majority of studies evaluate their systems on a single dataset. For four key datasets–Switchboard, Mixer, VoxCeleb, and ASVspoof–we conduct a detailed analysis of metadata and collection methods to assess ethical concerns and privacy risks. Our study highlights numerous challenges related to sampling bias, re-identification, consent, disclosure of sensitive information and security risks in speaker recognition datasets, and emphasizes the need for more representative, fair, and privacy-aware data collection in this domain.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"7 1","pages":"118-131"},"PeriodicalIF":5.0000,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Data Perspective on Ethical Challenges in Voice Biometrics Research\",\"authors\":\"Anna Leschanowsky;Casandra Rusti;Carolyn Quinlan;Michaela Pnacek;Lauriane Gorce;Wiebke Hutiri\",\"doi\":\"10.1109/TBIOM.2024.3446846\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Speaker recognition technology, deployed in sectors like banking, education, recruitment, immigration, law enforcement, and healthcare, relies heavily on biometric data. However, the ethical implications and biases inherent in the datasets driving this technology have not been fully explored. Through a longitudinal study of close to 700 papers published at the ISCA Interspeech Conference in the years 2012 to 2021, we investigate how dataset use has evolved alongside the widespread adoption of deep neural networks. Our study identifies the most commonly used datasets in the field and examines their usage patterns. The analysis reveals significant shifts in data practices since the advent of deep learning: a small number of datasets dominate speaker recognition training and evaluation, and the majority of studies evaluate their systems on a single dataset. For four key datasets–Switchboard, Mixer, VoxCeleb, and ASVspoof–we conduct a detailed analysis of metadata and collection methods to assess ethical concerns and privacy risks. Our study highlights numerous challenges related to sampling bias, re-identification, consent, disclosure of sensitive information and security risks in speaker recognition datasets, and emphasizes the need for more representative, fair, and privacy-aware data collection in this domain.\",\"PeriodicalId\":73307,\"journal\":{\"name\":\"IEEE transactions on biometrics, behavior, and identity science\",\"volume\":\"7 1\",\"pages\":\"118-131\"},\"PeriodicalIF\":5.0000,\"publicationDate\":\"2024-08-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on biometrics, behavior, and identity science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10643178/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on biometrics, behavior, and identity science","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10643178/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

语音识别技术在银行、教育、招聘、移民、执法和医疗保健等领域的应用严重依赖生物识别数据。然而，驱动这项技术的数据集中固有的伦理影响和偏见尚未得到充分探索。通过对2012年至2021年ISCA Interspeech会议上发表的近700篇论文的纵向研究，我们调查了随着深度神经网络的广泛采用，数据集的使用是如何演变的。我们的研究确定了该领域最常用的数据集，并检查了它们的使用模式。分析显示，自深度学习出现以来，数据实践发生了重大变化：少数数据集主导着说话人识别训练和评估，大多数研究在单个数据集上评估他们的系统。对于四个关键数据集——switchboard、Mixer、VoxCeleb和asvspoon——我们进行了详细的元数据分析和收集方法，以评估道德问题和隐私风险。我们的研究强调了说话人识别数据集中与抽样偏差、重新识别、同意、敏感信息披露和安全风险相关的众多挑战，并强调了在这一领域中需要更具代表性、公平性和隐私意识的数据收集。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

A Data Perspective on Ethical Challenges in Voice Biometrics Research

Speaker recognition technology, deployed in sectors like banking, education, recruitment, immigration, law enforcement, and healthcare, relies heavily on biometric data. However, the ethical implications and biases inherent in the datasets driving this technology have not been fully explored. Through a longitudinal study of close to 700 papers published at the ISCA Interspeech Conference in the years 2012 to 2021, we investigate how dataset use has evolved alongside the widespread adoption of deep neural networks. Our study identifies the most commonly used datasets in the field and examines their usage patterns. The analysis reveals significant shifts in data practices since the advent of deep learning: a small number of datasets dominate speaker recognition training and evaluation, and the majority of studies evaluate their systems on a single dataset. For four key datasets–Switchboard, Mixer, VoxCeleb, and ASVspoof–we conduct a detailed analysis of metadata and collection methods to assess ethical concerns and privacy risks. Our study highlights numerous challenges related to sampling bias, re-identification, consent, disclosure of sensitive information and security risks in speaker recognition datasets, and emphasizes the need for more representative, fair, and privacy-aware data collection in this domain.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE transactions on biometrics, behavior, and identity science

CiteScore

10.90

自引率

0.00%

发文量