Balancing validity and reliability as a function of sampling variability in forensic voice comparison
Bruce Xiao Wang, Vincent Hughes
Science & Justice. Published 2024-10-10. DOI: 10.1016/j.scijus.2024.10.002
https://www.sciencedirect.com/science/article/pii/S1355030624001023
In forensic comparison sciences, experts are required to compare samples of known and unknown origin and to evaluate the strength of the evidence under the competing assumptions that the samples came from the same source and from different sources. The application of valid (i.e. the method measures what it is intended to measure) and reliable (i.e. the method produces consistent results) forensic methods is required across many jurisdictions, for example by the England & Wales Criminal Practice Directions 19A and the UK Crown Prosecution Service, and was highlighted in the 2009 National Academy of Sciences report and by the President’s Council of Advisors on Science and Technology in 2016. The current study uses simulation to examine the effect of the number of speakers and of sampling variability on the evaluation of validity and reliability, using different generations of automatic speaker recognition (ASR) systems in forensic voice comparison (FVC). The results show that the state-of-the-art system had better overall validity than less advanced systems. However, better validity does not necessarily lead to high reliability, and very often the opposite is true: better system validity and higher discriminability can lead to a higher degree of uncertainty and inconsistency in the output (i.e. poorer reliability). This is particularly the case when dealing with a small number of speakers, as is commonly expected in FVC casework, where the observed data do not adequately support density estimation, resulting in extrapolation.
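The abstract does not state the simulation design or the metrics used, so the following is only an illustrative sketch of the kind of experiment it describes, not the authors' method. It assumes (all hypothetical choices) that comparison scores are Gaussian with unit variance, that `separation` stands in for system discriminability, that likelihood ratios come from one fitted Gaussian per hypothesis, that validity is summarised by the log-likelihood-ratio cost (Cllr), and that reliability is summarised by the spread of Cllr across repeated samples. For simplicity, `n_speakers` scores are drawn for each hypothesis, whereas real speaker sets yield many more different-speaker than same-speaker comparisons.

```python
import math
import random
import statistics


def gaussian_logpdf(x, mu, sigma):
    """Natural-log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def fit_log10_lr(same_scores, diff_scores):
    """Fit one Gaussian per hypothesis; return a function score -> log10 LR."""
    mu_s, sd_s = statistics.mean(same_scores), statistics.stdev(same_scores)
    mu_d, sd_d = statistics.mean(diff_scores), statistics.stdev(diff_scores)

    def log10_lr(score):
        ln_lr = gaussian_logpdf(score, mu_s, sd_s) - gaussian_logpdf(score, mu_d, sd_d)
        return ln_lr / math.log(10)

    return log10_lr


def log2_one_plus_pow10(x):
    """Numerically stable log2(1 + 10**x), safe for extreme log10 LRs."""
    t = x * math.log(10)
    if t > 0:
        return (t + math.log1p(math.exp(-t))) / math.log(2)
    return math.log1p(math.exp(t)) / math.log(2)


def cllr(same_llrs, diff_llrs):
    """Log-LR cost from log10 LRs; lower is better, 1.0 is uninformative."""
    same_term = sum(log2_one_plus_pow10(-l) for l in same_llrs) / len(same_llrs)
    diff_term = sum(log2_one_plus_pow10(l) for l in diff_llrs) / len(diff_llrs)
    return 0.5 * (same_term + diff_term)


def simulate(separation, n_speakers, n_reps=200, seed=0):
    """Return (mean Cllr, standard deviation of Cllr) over repeated samples.

    Each replication draws a fresh training sample of n_speakers scores per
    hypothesis, fits the LR model, and scores a fixed test set.
    """
    rng = random.Random(seed)
    test_same = [rng.gauss(separation, 1.0) for _ in range(500)]
    test_diff = [rng.gauss(0.0, 1.0) for _ in range(500)]
    costs = []
    for _ in range(n_reps):
        train_same = [rng.gauss(separation, 1.0) for _ in range(n_speakers)]
        train_diff = [rng.gauss(0.0, 1.0) for _ in range(n_speakers)]
        to_llr = fit_log10_lr(train_same, train_diff)
        costs.append(cllr([to_llr(s) for s in test_same],
                          [to_llr(s) for s in test_diff]))
    return statistics.mean(costs), statistics.stdev(costs)


if __name__ == "__main__":
    # Compare a weaker and a stronger hypothetical system at two sample sizes.
    for separation in (1.0, 4.0):
        for n_speakers in (10, 100):
            mean_cost, spread = simulate(separation, n_speakers)
            print(f"separation={separation} n={n_speakers} "
                  f"mean Cllr={mean_cost:.3f} sd={spread:.3f}")
```

In a sketch like this, the interesting comparison is how the spread of Cllr changes with `n_speakers` for each `separation`: with well-separated score distributions, test scores often fall in the tails of the fitted densities, which is where small-sample estimates have to extrapolate.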
About the journal:
Science & Justice provides a forum to promote communication and publication of original articles, reviews and correspondence on subjects that spark debates within the Forensic Science Community and the criminal justice sector. The journal provides a medium whereby all aspects of applying science to legal proceedings can be debated and progressed. Science & Justice is published six times a year, and will be of interest primarily to practising forensic scientists and their colleagues in related fields. It is chiefly concerned with the publication of formal scientific papers, in keeping with its international learned status, but will not accept any article describing experimentation on animals which does not meet strict ethical standards.
To promote communication and informed debate within the Forensic Science Community and the criminal justice sector.
To promote the publication of learned and original research findings from all areas of the forensic sciences and by so doing to advance the profession.
To promote the publication of case based material by way of case reviews.
To promote the publication of conference proceedings which are of interest to the forensic science community.
To provide a medium whereby all aspects of applying science to legal proceedings can be debated and progressed.
To appeal to all those with an interest in the forensic sciences.