Glottal fry and voice disguise: a case study in forensic phonetics
A. Hirson, M. Duckworth
Journal of biomedical engineering, volume 15, issue 3, pages 193-200. Published 1993-05-01. Journal Article.
DOI: 10.1016/0141-5425(93)90115-F
https://www.sciencedirect.com/science/article/pii/014154259390115F
Citations: 27
Abstract
In recent legal proceedings, forensic phoneticians were called upon to analyse a tape-recorded message intended to blackmail a bank manager following the kidnap of his wife. The brief was to establish the likelihood that the recording had been made by any one of three suspects, samples of whose speech were also made available. The comparison was greatly complicated by the voice disguise employed by the speaker who recorded the kidnap tape. This disguise comprised a form of phonation described phonetically as ‘glottal fry’ or vocal ‘creak’. This form of phonation occurs routinely in normal speech, but it has received most attention in relation to voice pathologies; by contrast, there are few references to its use as a form of voice disguise. This paper discusses the nature of creak and examines its effectiveness as a voice disguise. In addition, a method is described for speaker identification regardless of the disguise. Results indicate that trained listeners, without repeated presentations or instrumentation, are able to match speakers with 65% accuracy when one voice is creaky, compared with 90% accuracy for undisguised voices. Using a Euclidean metric to compare the power spectra of the [s] sound, we find that creaky disguised voices may be correctly matched with the undisguised voice of the same speaker (presented alongside 9 distracters) in 5 cases out of 10. However, when the computer's task is made more similar to the perceptual task, selecting one speaker out of two, it achieves an accuracy of 81%. Implications for forensic phonetics are discussed.
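The spectral matching step can be illustrated with a minimal sketch. The abstract states only that a Euclidean metric was applied to power spectra of [s], so everything else here is an assumption made for illustration: the function names, the four-band spectra, the dB-like values, and the nearest-neighbour decision rule are hypothetical and are not taken from the paper. For context, guessing at random would be correct 10% of the time when choosing among ten speakers and 50% of the time when choosing between two.

```python
import math


def euclidean_distance(spec_a, spec_b):
    """Euclidean distance between two power spectra with the same number of bands."""
    if len(spec_a) != len(spec_b):
        raise ValueError("spectra must have the same number of bands")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(spec_a, spec_b)))


def closest_speaker(query_spec, reference_specs):
    """Return the label of the reference spectrum nearest to the query spectrum."""
    return min(
        reference_specs,
        key=lambda label: euclidean_distance(query_spec, reference_specs[label]),
    )


# Invented toy data: band levels of [s] for three undisguised reference voices.
references = {
    "speaker_A": [10.0, 22.0, 35.0, 41.0],
    "speaker_B": [12.0, 30.0, 33.0, 28.0],
    "speaker_C": [8.0, 15.0, 38.0, 45.0],
}

# Invented spectrum of [s] taken from a creaky (disguised) recording.
disguised = [11.0, 21.0, 34.0, 40.0]

# Closed-set task: pick the nearest of all reference speakers.
best_match = closest_speaker(disguised, references)

# Two-alternative task: pick the nearer of just two candidate speakers.
pair = {name: references[name] for name in ("speaker_A", "speaker_B")}
pairwise_match = closest_speaker(disguised, pair)
```

In this toy data the disguised spectrum lies closest to `speaker_A` in both tasks. A real comparison would also need decisions that the abstract does not describe, such as how the [s] tokens are segmented, averaged, and normalised before distances are computed.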